Tutorial: Best way to chop a signature off an email body



Question:

I am parsing out some emails. Mobile Mail, iPhone and, I assume, iPod touch append the signature as a separate boundary, making it simple to remove. Not all mail clients do this; some just use '--' as a signature delimiter.

I need to chop off the '--' and everything after it from a string, but only at the last occurrence of it.

Sample copy

 hello, this is some email copy-- check this out   --   Tom Foolery  

I thought about splitting on '--' and removing the last part, which would give me what I want, but neither explode() nor split() seems to return anything useful for telling me whether it did anything in the event there is no match.

I cannot get preg_replace to match across more than one line. I have standardized all line endings to \n.

What is the best way to end up with "hello, this is some email copy-- check this out"? Take note that there will be cases where there is no signature, and there are of course going to be cases I cannot cover.


Solution:1

Actually, the correct signature delimiter is "-- \n" (note the space before the newline), so the delimiter regexp should be '^-- $', applied in multi-line mode so that ^ and $ match at line boundaries. You might consider using '^--\s*$' instead, so it also works with OE (Outlook Express), which gets it wrong.
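A minimal sketch of how that delimiter pattern could be applied, assuming line endings are already normalized to \n. The function name stripSignatureAtDelimiterLine is made up for illustration, and the m modifier is added here so that ^ and $ match per line:

```php
<?php
// Hypothetical helper: cut the body at the LAST line that consists of "--"
// followed only by optional whitespace. Assumes "\n" line endings.
function stripSignatureAtDelimiterLine(string $body): string
{
    // m modifier: ^ and $ match at the start and end of each line.
    $found = preg_match_all('/^--\s*$/m', $body, $matches, PREG_OFFSET_CAPTURE);
    if (!$found) {
        return $body; // no delimiter line, leave the body untouched
    }
    $last = end($matches[0]); // [matched text, byte offset] of the last hit
    return rtrim(substr($body, 0, $last[1]));
}

echo stripSignatureAtDelimiterLine("hello-- check this out\n-- \nTom Foolery");
```

Because the pattern is anchored to a whole line, the "--" inside "copy-- check this out" is not treated as a delimiter.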


Solution:2

Try this:

preg_replace('/--[\r\n]+.*/s', '', $body)  

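This will remove everything from the first occurrence of -- followed by one or more line break characters. If you just want to cut at the last occurrence, capture the leading part and put it back: use /(.*)--[\r\n]+.*/s with '$1' as the replacement. (Without the capture group, the greedy .* would be part of the match and the whole body would be replaced with an empty string.)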
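To compare cutting at the first and at the last occurrence, here is a small sketch on a made-up $body. One thing worth checking: with an empty replacement, a leading .* is itself part of the match and gets removed too, so this sketch keeps it with a capture group and a '$1' replacement.

```php
<?php
// Made-up sample with two "--" lines; only the second is the real signature.
$body = "intro\n--\nquoted signature\nreply text\n--\nTom Foolery";

// First occurrence: everything from the first "--" + line break is removed.
$first = preg_replace('/--[\r\n]+.*/s', '', $body);
// $first is "intro\n"

// Last occurrence: the greedy (.*) runs up to the final "--" + line break,
// and '$1' puts the captured text back.
$last = preg_replace('/(.*)--[\r\n]+.*/s', '$1', $body);
// $last is "intro\n--\nquoted signature\nreply text\n"
```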


Solution:3

Instead of just chopping off everything after --, could you not cache the last few emails sent by that user or service and compare them? The bit at the bottom that looks the same as in the others can be safely removed, leaving the proper message intact.
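A rough sketch of that comparison, assuming you already have one earlier message from the same sender at hand (how it is cached is left open). The name stripSharedTrailer is hypothetical, and it only compares whole lines from the bottom up:

```php
<?php
// Hypothetical helper: drop the trailing lines that $body shares with an
// earlier message from the same sender.
function stripSharedTrailer(string $body, string $earlier): string
{
    $a = explode("\n", rtrim($body));
    $b = explode("\n", rtrim($earlier));
    $shared = 0;
    // Walk both messages from the bottom while the lines are identical.
    while ($shared < count($a) && $shared < count($b)
        && $a[count($a) - 1 - $shared] === $b[count($b) - 1 - $shared]) {
        $shared++;
    }
    // Keep everything above the shared block.
    return rtrim(implode("\n", array_slice($a, 0, count($a) - $shared)));
}

echo stripSharedTrailer(
    "Hi there\n--\nTom Foolery\nAcme",
    "Other text\n--\nTom Foolery\nAcme"
);
```

Note that two identical messages would come back empty with this approach, so you would probably want a guard for that case.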


Solution:4

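I think, in the interest of being more bulletproof, I will take the non-regex route. Note that strrpos() returns false when there is no match, so that case has to be handled or the whole body is lost:

        $pos = strrpos($body, "\n--");
        echo $pos === false ? $body : substr($body, 0, $pos);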
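Since the question also asks how to tell whether anything was removed, here is a small sketch that returns both the text and a flag. The function name chopAtLastDelimiter is made up. It relies on strrpos() returning false when the delimiter is absent; passing that false straight to substr() as a length would be treated as 0 and give an empty string.

```php
<?php
// Hypothetical helper: returns [body without signature, whether a cut happened].
function chopAtLastDelimiter(string $body, string $delimiter = "\n--"): array
{
    $pos = strrpos($body, $delimiter);
    if ($pos === false) {
        return [$body, false]; // no delimiter, body is returned unchanged
    }
    return [substr($body, 0, $pos), true];
}

[$text, $stripped] = chopAtLastDelimiter("hello-- check this out\n-- \nTom Foolery");
// $text is "hello-- check this out", $stripped is true
```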


Solution:5

This seems to give me the best result:

$body = preg_replace('/\s*(.+)\s*[\r\n]--\s+.*/s', '$1', $body);

  • It will match and remove the last "(newline)--(whitespace/newlines)(signature)"; at least one whitespace character has to follow the --
  • It trims whitespace at the start of the body
  • Because (.+) is greedy, only the single newline directly before the -- is removed; any earlier blank lines stay in the result, so run rtrim() afterwards if that matters
  • It will only work if there is at least one character before that newline (otherwise it won't strip the signature and returns the body intact)
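If you also want to know whether a signature was found, one option is the optional count argument of preg_replace(). A minimal sketch on a made-up $body, with a trim() afterwards to cover the no-match case and any whitespace left in the captured text:

```php
<?php
// Made-up sample: leading spaces, a blank line, then a "-- " delimiter line.
$body = "  hello, this is some email copy-- check this out\n\n-- \nTom Foolery  ";

// The fifth argument of preg_replace() receives the number of replacements,
// which tells us whether a signature was actually matched.
$clean = preg_replace('/\s*(.+)\s*[\r\n]--\s+.*/s', '$1', $body, -1, $count);

// trim() handles bodies without a match and leftover surrounding whitespace.
$clean = trim($clean);
// $clean is "hello, this is some email copy-- check this out", $count is 1
```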
