Perl regex: how to know number of matches

2023-01-14 01:37 问答作者：

I'm looping through a series of regexes and matching it against lines in a file, like this:

for my $regex (@{$regexs_ref}) {
    LINE: for (@rawfile) {
        /@$开发者_高级运维regex/ && do {
            # do something here
            next LINE;
        };
    }
}

Is there a way for me to know how many matches I've got (so I can process it accordingly..)?

If not maybe this is the wrong approach..? Of course, instead of looping through every regex, I could just write one recipe for each regex. But I don't know what's the best practice?

If you do your matching in list context (i.e., basically assigning to a list), you get all of your matches and groupings in a list. Then you can just use that list in scalar context to get the number of matches.

Or am I misunderstanding the question?

Example:

my @list = /$my_regex/g;
if (@list)
{
  # do stuff
  print "Number of matches: " . scalar @list . "\n";
}

You will need to keep track of that yourself. Here is one way to do it:

#!/usr/bin/perl

use strict;
use warnings;

my @regexes = (
    qr/b/,
    qr/a/,
    qr/foo/,
    qr/quux/,
);

my %matches = map { $_ => 0 } @regexes;
while (my $line = <DATA>) {
    for my $regex (@regexes) {
        next unless $line =~ /$regex/;
        $matches{$regex}++;
    }
}

for my $regex (@regexes) {
    print "$regex matched $matches{$regex} times\n";
}

__DATA__
foo
bar
baz

In CA::Parser's processing associated with matches for /$CA::Regex::Parser{Kills}{all}/, you're using captures $1 all the way through $10, and most of the rest use fewer. If by the number of matches you mean the number of captures (the highest n for which $n has a value), you could use Perl's special @- array (emphasis added):

@LAST_MATCH_START

@-

$-[0] is the offset of the start of the last successful match. $-[n] is the offset of the start of the substring matched by n-th subpattern, or undef if the subpattern did not match. Thus after a match against $_, $& coincides with substr $_, $-[0], $+[0] - $-[0]. Similarly, $n coincides with
substr $_, $-[n], $+[n] - $-[n]
if $-[n] is defined, and $+ coincides with
substr $_, $-[$#-], $+[$#-] - $-[$#-]
One can use $#- to find the last matched subgroup in the last successful match. Contrast with $#+, the number of subgroups in the regular expression. Compare with @+.

This array holds the offsets of the beginnings of the last successful submatches in the currently active dynamic scope. $-[0] is the offset into the string of the beginning of the entire match. The n-th element of this array holds the offset of the nth submatch, so $-[1] is the offset where $1 begins, $-[2] the offset where $2 begins, and so on.

After a match against some variable $var:

$` is the same as substr($var, 0, $-[0])

$& is the same as substr($var, $-[0], $+[0] - $-[0])

$' is the same as substr($var, $+[0])

$1 is the same as substr($var, $-[1], $+[1] - $-[1])

$2 is the same as substr($var, $-[2], $+[2] - $-[2])

$3 is the same as substr($var, $-[3], $+[3] - $-[3])

Example usage:

#! /usr/bin/perl

use warnings;
use strict;

my @patterns = (
  qr/(foo(bar(baz)))/,
  qr/(quux)/,
);

chomp(my @rawfile = <DATA>);

foreach my $pattern (@patterns) {
  LINE: for (@rawfile) {
    /$pattern/ && do {
      my $captures = $#-;
      my $s = $captures == 1 ? "" : "s";
      print "$_: got $captures capture$s\n"; 
    };
  }
}

__DATA__
quux quux quux
foobarbaz

Output:

foobarbaz: got 3 captures
quux quux quux: got 1 capture

How about below code:

my $string = "12345yx67hjui89";    
my $count = () = $string =~ /\d/g;
print "$count\n";

It prints 9 here as expected.

继续阅读：extract perl regex

Perl regex: how to know number of matches

@LAST_MATCH_START

@-

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？