Perl regex: how to know number of matches
I'm looping through a series of regexes and matching each one against the lines in a file, like this:
for my $regex (@{$regexs_ref}) {
    LINE: for (@rawfile) {
        /@$regex/ && do {
            # do something here
            next LINE;
        };
    }
}
Is there a way for me to know how many matches I got, so that I can process them accordingly?
If not, maybe this is the wrong approach. Instead of looping through every regex, I could write one recipe for each regex, but I don't know which is the best practice.
If you do your matching in list context with the /g modifier (i.e., assign the result to a list), you get all of your matches, or all of the captured groups if the pattern contains any, in a list. Then you can use that list in scalar context to get the number of matches.
Or am I misunderstanding the question?
Example:
my @list = /$my_regex/g;    # matches against $_
if (@list)
{
    # do stuff
    print "Number of matches: " . scalar(@list) . "\n";
}
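One caveat with the list-context approach: if the pattern contains capture groups, the list holds every captured substring, so its length is the number of captures rather than the number of matches. A minimal sketch of the difference follows; the helper name count_matches and the sample text are made up for illustration and are not part of the question.

```perl
use strict;
use warnings;

# Hypothetical helper: number of times $regex matches in $text,
# regardless of how many capture groups the pattern has.
sub count_matches {
    my ($text, $regex) = @_;
    my $count = 0;
    # Scalar-context /g finds one match per iteration and remembers
    # the position in $text, so the loop ends after the last match.
    $count++ while $text =~ /$regex/g;
    return $count;
}

my $text = "a1 b2 c3";

# Two capture groups: list context returns every captured substring.
my @captures = $text =~ /([a-z])(\d)/g;

print "captured substrings: ", scalar(@captures), "\n";                       # 6
print "matches: ", count_matches($text, qr/([a-z])(\d)/), "\n";               # 3
```

If the pattern has no capture groups, both approaches give the same number.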
You will need to keep track of that yourself. Here is one way to do it:
#!/usr/bin/perl
use strict;
use warnings;
my @regexes = (
qr/b/,
qr/a/,
qr/foo/,
qr/quux/,
);
my %matches = map { $_ => 0 } @regexes;
while (my $line = <DATA>) {
    for my $regex (@regexes) {
        next unless $line =~ /$regex/;
        $matches{$regex}++;
    }
}
for my $regex (@regexes) {
    print "$regex matched $matches{$regex} times\n";
}
__DATA__
foo
bar
baz
In CA::Parser's processing associated with matches for /$CA::Regex::Parser{Kills}{all}/, you're using captures $1 all the way through $10, and most of the rest use fewer. If by the number of matches you mean the number of captures (the highest n for which $n has a value), you could use Perl's special @- array (emphasis added):
@LAST_MATCH_START
@-
$-[0] is the offset of the start of the last successful match. $-[n] is the offset of the start of the substring matched by n-th subpattern, or undef if the subpattern did not match.
Thus after a match against $_, $& coincides with substr $_, $-[0], $+[0] - $-[0]. Similarly, $n coincides with substr $_, $-[n], $+[n] - $-[n] if $-[n] is defined, and $+ coincides with substr $_, $-[$#-], $+[$#-] - $-[$#-]. One can use $#- to find the last matched subgroup in the last successful match. Contrast with $#+, the number of subgroups in the regular expression. Compare with @+.
This array holds the offsets of the beginnings of the last successful submatches in the currently active dynamic scope. $-[0] is the offset into the string of the beginning of the entire match. The n-th element of this array holds the offset of the nth submatch, so $-[1] is the offset where $1 begins, $-[2] the offset where $2 begins, and so on.
After a match against some variable $var:
- $` is the same as substr($var, 0, $-[0])
- $& is the same as substr($var, $-[0], $+[0] - $-[0])
- $' is the same as substr($var, $+[0])
- $1 is the same as substr($var, $-[1], $+[1] - $-[1])
- $2 is the same as substr($var, $-[2], $+[2] - $-[2])
- $3 is the same as substr($var, $-[3], $+[3] - $-[3])
Example usage:
#! /usr/bin/perl
use warnings;
use strict;
my @patterns = (
    qr/(foo(bar(baz)))/,
    qr/(quux)/,
);
chomp(my @rawfile = <DATA>);
foreach my $pattern (@patterns) {
    LINE: for (@rawfile) {
        /$pattern/ && do {
            my $captures = $#-;
            my $s = $captures == 1 ? "" : "s";
            print "$_: got $captures capture$s\n";
        };
    }
}
__DATA__
quux quux quux
foobarbaz
Output:
foobarbaz: got 3 captures
quux quux quux: got 1 capture
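The quoted documentation contrasts $#- (the last group that matched) with $#+ (the number of groups in the pattern). The two differ when a trailing group is optional and does not take part in the match. A small sketch, using a hypothetical helper named capture_counts and a made-up pattern:

```perl
use strict;
use warnings;

# Hypothetical helper: after a successful match, return
# (index of the last group that matched, number of groups in the pattern).
# Returns an empty list if the match fails.
sub capture_counts {
    my ($text, $regex) = @_;
    return unless $text =~ $regex;
    return ($#-, $#+);
}

# The second group is optional and does not match "foo".
my ($last_matched, $total_groups) = capture_counts("foo", qr/(foo)(bar)?/);
print "last matched group: $last_matched, groups in pattern: $total_groups\n";
# prints "last matched group: 1, groups in pattern: 2"
```

Note that $#- is the index of the last group that matched, not a count of groups that matched: an earlier optional group can be undefined while a later one is set.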
How about the code below?
my $string = "12345yx67hjui89";
my $count = () = $string =~ /\d/g;
print "$count\n";
It prints 9 here as expected.
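The reason this works: assigning to the empty list () forces the match into list context, and a list assignment evaluated in scalar context yields the number of elements on its right-hand side. Here is the same idiom wrapped in a helper for reuse; the name count_in_list_context is only illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: count matches by assigning the list-context
# result of the match to an empty list and reading that assignment
# in scalar context.
sub count_in_list_context {
    my ($text, $regex) = @_;
    my $count = () = $text =~ /$regex/g;
    return $count;
}

my $string = "12345yx67hjui89";
print count_in_list_context($string, qr/\d/), "\n";       # 9 single digits
print count_in_list_context($string, qr/\d+/), "\n";      # 3 runs of digits
print count_in_list_context("no digits", qr/\d/), "\n";   # 0
```

As with any list-context match, a pattern with capture groups would make this count captured substrings instead of matches.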