How can I store captures from a Perl regular expression into separate variables?


I have a regex:

    /abc(def)ghi(jkl)mno(pqr)/igs

How would I capture the result of each pair of parentheses into three different variables, one per pair? Right now I am using one array to capture all the results. They come out in sequence, but then I have to parse them, and the list could be huge.

    @results = ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs);


Your question is a bit ambiguous to me, but I think you want to do something like this:

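    my (@first, @second, @third);

    # Match in scalar context so //g advances one match per iteration.
    # A list assignment in the condition would rerun the whole global
    # match on every pass and loop forever once anything matches.
    while ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
        push @first,  $1;
        push @second, $2;
        push @third,  $3;
    }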
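If you already have the flat list from the `@results` assignment in the question, one way to split it afterwards is to consume it three items at a time. This is only a sketch: the sample `$string` is an assumption for illustration, and `splice` empties `@results` as it goes.

```perl
use strict;
use warnings;

# Assumed sample input: the pattern occurs twice.
my $string = 'zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss';

# In list context, //g returns every capture from every match as one flat list.
my @results = $string =~ /abc(def)ghi(jkl)mno(pqr)/igs;

my (@first, @second, @third);

# Take three items per pass, one per capture group, until the list is empty.
while (my ($x, $y, $z) = splice @results, 0, 3) {
    push @first,  $x;
    push @second, $y;
    push @third,  $z;
}

print "first: @first\n";    # first: def def
print "second: @second\n";  # second: jkl jkl
print "third: @third\n";    # third: pqr pqr
```

This still builds the whole flat list in memory first, so for a huge input the `while` loop over a scalar-context match is probably the lighter choice.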


Starting with Perl 5.10, you can use named capture buffers as well:

    #!/usr/bin/perl

    use strict; use warnings;

    my %data;

    my $s = 'abcdefghijklmnopqr';

    if ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/x ) {
        push @{ $data{$_} }, $+{$_} for keys %+;
    }

    use Data::Dumper;
    print Dumper \%data;


Output:

    $VAR1 = {
              'first' => [
                           'def'
                         ],
              'second' => [
                            'jkl'
                          ],
              'third' => [
                           'pqr'
                         ]
            };
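The example above handles a single match. For repeated matches, as in the question, the same idea can probably be put in a `while` loop with the `/g` flag. A minimal sketch, assuming a sample string in which the pattern occurs twice (requires Perl 5.10 or later for `%+`):

```perl
use strict;
use warnings;

# Assumed sample input: the pattern occurs twice.
my $s = 'zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss';

my %data;

# Scalar-context //g finds one match per iteration; %+ holds the
# named captures of the most recent successful match.
while ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/xg) {
    push @{ $data{$_} }, $+{$_} for keys %+;
}

for my $name (sort keys %data) {
    print "$name: @{ $data{$name} }\n";
}
# first: def def
# second: jkl jkl
# third: pqr pqr
```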

For earlier versions, you can use the following, which avoids having to add a line for each captured buffer:

    #!/usr/bin/perl

    use strict; use warnings;

    my $s = 'abcdefghijklmnopqr';

    my @arrays = \ my(@first, @second, @third);

    if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
        push @{ $arrays[$_] }, $captured[$_] for 0 .. $#arrays;
    }

    use Data::Dumper;
    print Dumper @arrays;


Output:

    $VAR1 = [
              'def'
            ];
    $VAR2 = [
              'jkl'
            ];
    $VAR3 = [
              'pqr'
            ];

I prefer keeping related data in a single data structure, though, so I would go back to using a hash. This does require an auxiliary array of key names, however:

    my %data;
    my @keys = qw( first second third );

    if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
        push @{ $data{$keys[$_]} }, $captured[$_] for 0 .. $#keys;
    }

Or, if the names of the variables really are first, second, etc., or if only the order of the buffers matters and not their names, you can use:

    my @data;
    if ( my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
        push @{ $data[$_] }, $captured[$_] for 0 .. $#captured;
    }
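The `if` versions above pick up one match only. If you need every match without writing `$1`, `$2`, `$3` out by hand, one possible approach is to loop with `/g` in scalar context and read each group through the built-in offset arrays `@-` and `@+`. This is a sketch under assumptions: the sample string is invented for illustration, and it assumes every group takes part in each match.

```perl
use strict;
use warnings;

# Assumed sample input: the pattern occurs twice.
my $s = 'abcdefghijklmnopqr abcdefghijklmnopqr';

my @data;

while ($s =~ /abc (def) ghi (jkl) mno (pqr)/xg) {
    # $#+ is the number of capture groups in the last successful match;
    # $-[$n] and $+[$n] are the start and end offsets of group $n.
    for my $n (1 .. $#+) {
        push @{ $data[$n - 1] }, substr($s, $-[$n], $+[$n] - $-[$n]);
    }
}

print "$_: @{ $data[$_] }\n" for 0 .. $#data;
# 0: def def
# 1: jkl jkl
# 2: pqr pqr
```

Adding or removing a group in the regex then needs no change in the loop body.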


An alternate way of doing it would look like ghostdog74's answer, but using an array that stores hash references:

    my @results;
    while( $string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
        my ($key1, $key2, $key3) = ($1, $2, $3);
        push @results, {
            key1 => $key1,
            key2 => $key2,
            key3 => $key3,
        };
    }

    # do something with it

    foreach my $result (@results) {
        print "$result->{key1}, $result->{key2}, $result->{key3}\n";
    }

The main advantage here is that you use a single data structure and still have a nice, readable loop.


@OP, when parentheses capture part of a match, you can read the captured text from the variables $1, $2, and so on. These are the capture variables (often loosely called backreferences):

    $string="zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss";
    while ($string =~ /abc(def)ghi(jkl)mno(pqr)/isg) {
        print "$1 $2 $3\n";
    }


    $ perl perl.pl
    def jkl pqr
    def jkl pqr


You could have three different regexes, each focusing on a specific group. Obviously, you would like to just assign different groups to different arrays within a single regex, but I think your only option is to split the regex up.
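A rough sketch of what splitting the regex up might look like: each pattern keeps the full surrounding text but captures only one group, so a list-context `/g` match returns just that group's values. The sample `$string` and the array names are assumptions for illustration.

```perl
use strict;
use warnings;

# Assumed sample input: the pattern occurs twice.
my $string = 'zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss';

# With a single capture group, list-context //g returns that group's
# text from every match, so each array fills directly.
my @first  = $string =~ /abc(def)ghijklmnopqr/ig;
my @second = $string =~ /abcdefghi(jkl)mnopqr/ig;
my @third  = $string =~ /abcdefghijklmno(pqr)/ig;

print "first: @first\n";    # first: def def
print "second: @second\n";  # second: jkl jkl
print "third: @third\n";    # third: pqr pqr
```

The trade-off is that the string is scanned three times and the three patterns have to be kept in sync by hand.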

