Tutorial: Mathematica StringReplace to replace a substring containing newlines



Question:

I have something like the following in a string:

blah blah    BEGINIGNORE     this stuff should get stripped out  ENDIGNORE    more stuff here  

I would like to do this (Perl syntax): s/BEGINIGNORE.*ENDIGNORE//s -- namely, strip out everything between BEGINIGNORE and ENDIGNORE, inclusive. You would think the following would do that in Mathematica:

StringReplace[str, re["BEGINIGNORE[.\\s]*ENDIGNORE"]->""]  

But it doesn't. How do I do this in Mathematica?

PS: I define the following alias: re = RegularExpression;


Solution:1

It turns out that "[.\\s]" and "[.\\n]" don't work, because inside square brackets "." is a literal period rather than a wildcard, but "(.|\\n)" does. So the following works:

strip[s_String] := StringReplace[s, re@"BEGINIGNORE(.|\\n)*ENDIGNORE" -> ""]  


Solution:2

Try:

StringReplace[str, re["BEGINIGNORE(.|\\n)*ENDIGNORE"]->""]  


Solution:3

Insert the (?s) modifier in the regex. That's equivalent to Perl's /s modifier and is part of standard PCRE syntax.

StringReplace[str, re["BEGINIGNORE(?s).*ENDIGNORE"]->""]  

More details in this answer to a related question: Bug in Mathematica: regular expression applied to very long string
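
For illustration, here is a minimal sketch of the (?s) approach next to what I believe is the equivalent native string pattern. The sample string is made up for this example, and I spell out RegularExpression instead of using the re alias so the snippet stands alone.

```mathematica
(* Hypothetical sample input with a newline inside the ignored region *)
str = "keep this BEGINIGNORE drop\nthis ENDIGNORE and keep this";

(* (?s) lets "." match newlines as well, like Perl's /s modifier *)
viaRegex = StringReplace[str,
  RegularExpression["(?s)BEGINIGNORE.*ENDIGNORE"] -> ""];

(* Native string pattern: ___ matches any run of characters, newlines included *)
viaPattern = StringReplace[str,
  "BEGINIGNORE" ~~ ___ ~~ "ENDIGNORE" -> ""];

(* Both forms should give the same result *)
viaRegex === viaPattern
```

Both matches are greedy, as in the Perl original, so if the string contains several BEGINIGNORE ... ENDIGNORE blocks, everything from the first BEGINIGNORE to the last ENDIGNORE is removed.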


Solution:4

As you noted in your follow-up, you need parentheses rather than square brackets around the expression you want to apply * to.

The square brackets define a character class here, as in most regular expression languages. That's why [.\\s] isn't working as you expected: it stands for a set of characters (here, a literal period or any whitespace character) rather than a parenthesized expression. Maybe the Mathematica use of [] for expressions got you thinking in that direction?
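
To see the difference concretely, here is a small sketch you can paste into a notebook. The test string is made up, and RegularExpression is written out in full rather than using the re alias from the question.

```mathematica
(* Inside a character class, "." is a literal period, not a wildcard *)
StringMatchQ["a", RegularExpression["[.\\s]"]]   (* False: "a" is neither "." nor whitespace *)
StringMatchQ[".", RegularExpression["[.\\s]"]]   (* True: a literal period is in the class *)

(* Hypothetical sample input with a newline inside the ignored region *)
str = "blah blah BEGINIGNORE this stuff\nshould get stripped out ENDIGNORE more stuff here";

(* Character class version: finds no match, so str comes back unchanged *)
StringReplace[str, RegularExpression["BEGINIGNORE[.\\s]*ENDIGNORE"] -> ""] === str

(* Group with alternation: any non-newline character or a newline, repeated *)
StringReplace[str, RegularExpression["BEGINIGNORE(.|\\n)*ENDIGNORE"] -> ""]
```

The character class version fails because the text between the markers contains letters, which [.\\s]* cannot consume.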

