Tutorial: How can I store captures from a Perl regular expression into separate variables?



Question:

I have a regex:

/abc(def)ghi(jkl)mno(pqr)/igs  

How would I capture the result of each set of parentheses into 3 different variables, one per set? Right now I am using one array to capture all the results. They come out sequentially, but then I have to parse them, and the list could be huge.

@results = ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs);  


Solution:1

Your question is a bit ambiguous to me, but I think you want to do something like this:

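my (@first, @second, @third);

# Scalar-context /g resumes after the previous match on each iteration.
# (Assigning the match to a list inside the loop condition would restart
# from the beginning of the string every time and never terminate.)
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    push @first,  $1;
    push @second, $2;
    push @third,  $3;
}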
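If you already have the flat list from the question's `@results = ...` assignment, it can also be split up after the fact. This is only a sketch: the sample string is an assumption, and `splice` empties `@results` as it goes.

```perl
use strict;
use warnings;

# Sample input with two full matches (an assumption for illustration).
my $string = 'zzzabcdefghijklmnopqrsssszzzABCDEFGHIJKLMNOPQRssss';

# List context with /g returns every capture from every match, flattened.
my @results = $string =~ /abc(def)ghi(jkl)mno(pqr)/igs;

my (@first, @second, @third);

# Peel three captures at a time off the front of the flat list.
# The loop ends when splice returns an empty list.
while (my ($f, $s, $t) = splice @results, 0, 3) {
    push @first,  $f;
    push @second, $s;
    push @third,  $t;
}

print "@first\n";   # def DEF
print "@second\n";  # jkl JKL
print "@third\n";   # pqr PQR
```

This still builds the whole flat list in memory first, so for a huge input the `while` loop over scalar-context matches is probably the better choice.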


Solution:2

Starting with Perl 5.10, you can use named capture buffers as well:

#!/usr/bin/perl

use strict; use warnings;

my %data;

my $s = 'abcdefghijklmnopqr';

if ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/x ) {
    push @{ $data{$_} }, $+{$_} for keys %+;
}

use Data::Dumper;
print Dumper \%data;

Output:

$VAR1 = {
          'first' => [
                       'def'
                     ],
          'second' => [
                        'jkl'
                      ],
          'third' => [
                       'pqr'
                     ]
        };
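Since the question uses `/g`, the same idea can probably be extended to collect every match by swapping `if` for `while`. A minimal sketch, assuming Perl 5.10 or later and a made-up sample string with two matches:

```perl
use strict;
use warnings;
use 5.010;   # named captures and %+ need Perl 5.10 or later

# Sample input with two matches (an assumption for illustration).
my $s = 'abcdefghijklmnopqr--abcDEFghiJKLmnoPQR';

my %data;

# Scalar-context /g advances one match per iteration; %+ holds the
# named captures of the most recent successful match.
while ($s =~ /abc(?<first>def)ghi(?<second>jkl)mno(?<third>pqr)/ig) {
    push @{ $data{$_} }, $+{$_} for keys %+;
}

say "$_: @{ $data{$_} }" for sort keys %data;
# first: def DEF
# second: jkl JKL
# third: pqr PQR
```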

For earlier versions, you can use the following which avoids having to add a line for each captured buffer:

#!/usr/bin/perl

use strict; use warnings;

my $s = 'abcdefghijklmnopqr';

my @arrays = \ my(@first, @second, @third);

if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $arrays[$_] }, $captured[$_] for 0 .. $#arrays;
}

use Data::Dumper;
print Dumper @arrays;

Output:

$VAR1 = [
          'def'
        ];
$VAR2 = [
          'jkl'
        ];
$VAR3 = [
          'pqr'
        ];

But I prefer keeping related data in a single data structure, so I would go back to using a hash. This does require an auxiliary array of key names, however:

my %data;
my @keys = qw( first second third );

if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data{$keys[$_]} }, $captured[$_] for 0 .. $#keys;
}

Or, if the names of the variables really are first, second etc, or if the names of the buffers don't matter but only order does, you can use:

my @data;
if ( my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data[$_] }, $captured[$_] for 0 .. $#captured;
}


Solution:3

An alternative would look like ghostdog74's answer (the `while` loop over `$1`, `$2`, `$3`), but using an array that stores hash references:

my @results;
while( $string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    my ($key1, $key2, $key3) = ($1, $2, $3);
    push @results, {
        key1 => $key1,
        key2 => $key2,
        key3 => $key3,
    };
}

# do something with it

foreach my $result (@results) {
    print "$result->{key1}, $result->{key2}, $result->{key3}\n";
}

The main advantages here are a single data structure and a nice, readable loop.
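If you later need one list per group after all, it can be pulled out of the array of hash references with `map`. A minimal sketch; the two records are hand-written sample data in the same shape as `@results`, not output from a real run:

```perl
use strict;
use warnings;

# Two hand-written records in the same shape as @results above
# (sample values are assumptions for illustration).
my @results = (
    { key1 => 'def', key2 => 'jkl', key3 => 'pqr' },
    { key1 => 'DEF', key2 => 'JKL', key3 => 'PQR' },
);

# Pull one field out of every record to get a per-group list.
my @first = map { $_->{key1} } @results;
my @third = map { $_->{key3} } @results;

print "@first\n";  # def DEF
print "@third\n";  # pqr PQR
```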


Solution:4

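@OP, when parentheses capture something, you can read the results from the variables $1, $2, and so on. These are capture variables (backreferences such as \1 are what you use inside the pattern itself).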

$string="zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss";
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/isg) {
    print "$1 $2 $3\n";
}

Output:

$ perl perl.pl
def jkl pqr
def jkl pqr
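For comparison, here is a small sketch of how a capture variable differs from a backreference. The sample text and the doubled-word pattern are assumptions chosen for illustration, not part of the question:

```perl
use strict;
use warnings;

my $text = 'the the quick brown fox fox';

# \1 is a backreference: inside the pattern it must match the same text
# that group 1 just captured. $1 is the capture variable read afterwards.
my @doubled;
while ($text =~ /\b(\w+)\s+\1\b/g) {
    push @doubled, $1;
}

print "@doubled\n";  # the fox
```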


Solution:5

You could use three different regexes, each focusing on a specific group. Obviously, you would prefer to assign different groups to different arrays within a single regex, but I think your only option for that is to split the regex up.
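A rough sketch of what that might look like, reusing the sample string from Solution 4. Each pattern spells out the full sequence but keeps parentheses around only one part:

```perl
use strict;
use warnings;

my $string = 'zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss';

# Each pattern matches the whole sequence but captures only one group,
# so list-context /g returns just that group's text for every match.
my @first  = $string =~ /abc(def)ghijklmnopqr/ig;
my @second = $string =~ /abcdefghi(jkl)mnopqr/ig;
my @third  = $string =~ /abcdefghijklmno(pqr)/ig;

print "@first\n";   # def def
print "@second\n";  # jkl jkl
print "@third\n";   # pqr pqr
```

Note that this scans the string three times, which may matter if the input is large.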

