Tutorial: Help with C++ Boost::regex


I'm trying to get all words inside a string using Boost::regex in C++.

Here's my input:

"Hello there | network - bla bla hoho"

using this code:

    regex rgx("[a-z]+", boost::regex::perl | boost::regex::icase);
    regex_search(input, result, rgx);
    for (unsigned int j = 0; j < result.size(); ++j)
    {
        cout << result[j] << endl;
    }

I only get the first word, "Hello". What's wrong with my code? result.size() returns 1.

thank you.


regex_search only finds the first match. To iterate over all matches, use regex_iterator.
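
As a minimal sketch of the iterator approach: the snippet below uses std::regex from the standard <regex> header rather than Boost, on the assumption that boost::sregex_iterator can be used the same way (include <boost/regex.hpp> and swap the std:: prefixes). The function name find_words is made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every case-insensitive run of letters
// found in `input`, one regex match per word.
std::vector<std::string> find_words(const std::string& input) {
    // Same pattern as in the question; std::regex is assumed here in
    // place of boost::regex.
    static const std::regex rgx("[a-z]+",
                                std::regex::ECMAScript | std::regex::icase);

    std::vector<std::string> words;
    // A default-constructed sregex_iterator acts as the end-of-sequence marker.
    for (std::sregex_iterator it(input.begin(), input.end(), rgx), end;
         it != end; ++it) {
        words.push_back(it->str());  // text of the whole match
    }
    return words;
}
```

Called on the input from the question, this should give the six words Hello, there, network, bla, bla and hoho, since each step of the iterator resumes searching where the previous match ended.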


Try rgx("(?:(\\w+)\\W+)+"); as your regex. Here (?: starts a non-marking group, closed by the matching )+, which matches the words in the string one or more times. (\\w+) is a marked group that matches one or more letters, digits, or underscores, i.e. typical word characters, which are returned to you in result[i]. \\W+ matches one or more contiguous non-word characters, i.e. whitespace, |, -, etc. Note that a repeated marked group normally keeps only its last match, so getting every word back this way depends on Boost's repeated-captures support.


You're only searching for alphabetic characters, not spaces, pipes or hyphens. regex_search() probably just returns the first match.


Perhaps you could try using repeated captures with the following regex "(?:([a-z]+)\\b\\s*)+".


To match words, try this regex:

regex rgx("\\<[a-z]+\\>",boost::regex::perl|boost::regex::icase);  

According to the docs, \< denotes the start of a word and \> denotes the end of a word in the Perl variety of Boost regex matching.

I'm afraid someone else has to explain how to iterate the matches. The Boost documentation makes my brain hurt.


You would need to capture any set of [a-z]+ (or some other regex for matching "words") bound by spaces or string boundaries. You could try something like this:
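
One possible pattern along those lines, as a rough sketch only: it assumes std::regex from the standard <regex> header instead of Boost, and both the pattern and the function name find_bounded_words are my own guesses for illustration. It requires the start of the string or whitespace before each word, and whitespace or the end of the string after it.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the runs of letters that are delimited on
// both sides by whitespace or by the start/end of the string.
std::vector<std::string> find_bounded_words(const std::string& input) {
    // (?:^|\s)   start of string or one whitespace character (not captured)
    // ([a-z]+)   the word itself, capture group 1
    // (?=\s|$)   lookahead: whitespace or end of string must follow
    static const std::regex rgx("(?:^|\\s)([a-z]+)(?=\\s|$)",
                                std::regex::ECMAScript | std::regex::icase);

    std::vector<std::string> words;
    for (std::sregex_iterator it(input.begin(), input.end(), rgx), end;
         it != end; ++it) {
        words.push_back((*it)[1].str());  // group 1 excludes the leading space
    }
    return words;
}
```

Unlike a bare [a-z]+, this skips letters that touch other characters: for "foo-bar baz" it should return only baz, because foo is followed by a hyphen and bar is preceded by one.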


In any event, this isn't really a boost::regex problem; it's just a regex problem. Use Perl or the bash shell (or any number of web tools) to get your regex figured out, then use it in your code.
