Tutorial: Getting contents of square brackets with regex, including nested ones



Question:

Is there any way to have this:

[one[two]][three]  

And extract this with a regex?

    Array
    (
        [0] => one[two]
        [1] => two
        [2] => three
    )


Solution:1

In PHP you can use a recursive regular expression, which gets you nearly what you want (it returns the top-level groups, brackets included):

    $s = 'abc [one[two]][three] def';
    $matches = array();
    preg_match_all('/\[(?:[^][]|(?R))*\]/', $s, $matches);
    print_r($matches);

Result:

    Array
    (
        [0] => Array
            (
                [0] => [one[two]]
                [1] => [three]
            )

    )

For something more advanced than this, it's probably best not to use regular expressions.


Solution:2

You can apply the regex with a loop, for example,

  1. Match all \[([^\[\]]*)\] (the character class excludes both brackets, so only innermost pairs match).
  2. For each match, replace \x01 with [ and \x02 with ] and output the result.
  3. Replace every \[([^\[\]]*)\] with \x01$1\x02 (warning: this assumes the string does not already contain \x01 or \x02).
  4. Repeat from step 1 until there is no match.
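
The steps above can be sketched in Python roughly as follows. This is only an illustration: the function name bracket_contents is made up for the example, the character class excludes both [ and ] so that each pass matches innermost pairs only, and the input is assumed to contain neither \x01 nor \x02. Results come out innermost-first, pass by pass, not in the order asked for in the question.

```python
import re

# Matches a bracket pair with no brackets inside it (an innermost pair).
INNERMOST = re.compile(r"\[([^\[\]]*)\]")


def bracket_contents(text):
    """Collect the contents of every bracket pair, innermost pairs first.

    Assumption: text contains neither \\x01 nor \\x02, which serve as
    stand-ins for brackets that have already been processed.
    """
    results = []
    while True:
        matches = INNERMOST.findall(text)              # step 1
        if not matches:                                # step 4: stop when nothing matches
            break
        for content in matches:                        # step 2: restore real brackets
            results.append(content.replace("\x01", "[").replace("\x02", "]"))
        # Step 3: hide the processed pairs so the next pass sees their parents.
        text = INNERMOST.sub(lambda m: "\x01" + m.group(1) + "\x02", text)
    return results


print(bracket_contents("[one[two]][three]"))
```

Each pass peels off one nesting level, so the loop runs once per level of depth plus one final pass that finds nothing.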

But I'd write a string scanner for this problem :).
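
A string scanner for this can be quite small. Here is a minimal sketch in Python, assuming a stack of open-bracket positions is enough; the name scan_brackets is hypothetical, and unbalanced brackets are simply skipped, which may not suit every input.

```python
def scan_brackets(text):
    """Return the contents of every bracket pair, ordered by opening position."""
    open_positions = []   # stack of indexes of '[' not yet closed
    found = []            # (opening index, contents) pairs
    for index, char in enumerate(text):
        if char == "[":
            open_positions.append(index)
        elif char == "]" and open_positions:
            start = open_positions.pop()
            found.append((start, text[start + 1:index]))
        # A ']' with no matching '[' is ignored; an unclosed '[' yields nothing.
    found.sort(key=lambda item: item[0])
    return [contents for _, contents in found]


print(scan_brackets("[one[two]][three]"))
```

Sorting by opening position gives outer groups before the groups nested inside them, which is the order shown in the question.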


Solution:3

    #!/usr/bin/perl
    use Data::Dumper;
    @a = ();
    $re = qr/\[((?:[^][]|(??{$re}))*)\](?{push@a,$^N})/;
    '[one[two]][three]' =~ /$re*/;
    print Dumper \@a;
    # $VAR1 = [
    #           'two',
    #           'one[two]',
    #           'three'
    #         ];

Not exactly what you asked for, but it's kinda doable with (ir)regular expression extensions. (Perl 5.10's (?PARNO) can replace the usage of (??{CODE}).)


Solution:4

In Perl 5.10 regex, you can combine a recursive pattern, (?1), and an atomic (non-backtracking) group with a recursive subroutine to do that:

    #!/usr/bin/perl

    $re = qr/
        (                      # start capture buffer 1
            \[                 #   match an opening brace
            (                  # capture buffer 2
            (?:                #   match one of:
                (?>            #     don't backtrack over the inside of this group
                    [^\[\]]+   #       one or more non braces
                )              #     end non backtracking group
            |                  #     ... or ...
                (?1)           #     recurse to bracket 1 and try it again
            )*                 #   0 or more times.
            )                  # end buffer 2
            \]                 #   match a closing brace
        )                      # end capture buffer one
    /x;

    print "\n\n";

    sub strip {
        my ($str) = @_;
        while ($str =~ /$re/g) {
            $match = $1; $striped = $2;
            print "$striped\n";
            strip($striped) if $striped =~ /\[/;
            return $striped;
        }
    }

    $str = "[one[two]][three][[four]five][[[six]seven]eight]";

    print "start=$str\n";

    while ($str =~ /$re/g) {
        strip($1);
    }

Output:

    start=[one[two]][three][[four]five][[[six]seven]eight]
    one[two]
    two
    three
    [four]five
    four
    [[six]seven]eight
    [six]seven
    six
