Extracting string with regex stored in hash

2023-03-17 14:38 Q&A author:

I'm trying to parse out specific values from a text file, and output those to a different file.

I'm using regular expressions stored in a hash (matched up with their descriptive name) to search through a string (scalar), and then storing the discovered values in an array, which is then written out to a file.

I've got everything working, except for the searching/extracting part. (I've only just learned Perl in the past couple days, so I wouldn't be surprised if I was making some really simple mistakes.)

$inputstring = 'Lorem ipsum dolor Date: 20110131 quis semper egestas.';

%myregexhash = ( Date => '/([12][09][0-9][0-9][0-1][0-2][0-9][0-9])/' );

@foundvaluesarray=();

while ( ($thefieldname, $theregex) = each (%myregexhash))
{
    if ($inputstring =~ $theregex) 
    {
        push(@foundvaluesarray, "$thefieldname: $&\n");
        $inputstring = $';
    }
}

print "@foundvaluesarray";

The array fills up with the field names ("Date:"), but not the values I'm looking for ("20110131").

Any idea what I'm doing wrong?

Make one small change:

%myregexhash = ( Date => qr/([12][09][0-9][0-9][0-1][0-2][0-9][0-9])/ );

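Note the use of qr//, which compiles a regex. In your original, the slashes were inside a quoted string, so they became part of the pattern itself and Perl searched for literal / characters on either side of the date.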
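The difference between a quoted pattern and qr// can be sketched as follows. The shortened pattern and the variable names are made up for this illustration; they are not from your program.

```perl
use strict;
use warnings;

my $input = 'Lorem ipsum dolor Date: 20110131 quis semper egestas.';

# A plain string used as a pattern: the slashes are part of the pattern,
# so Perl looks for a literal "/" before and after the digits.
my $as_string = '/([0-9]{8})/';

# A compiled regex: here the slashes are only the delimiters of qr.
my $as_regex = qr/([0-9]{8})/;

my $string_matched = ($input =~ $as_string) ? 1 : 0;
my $regex_matched  = ($input =~ $as_regex)  ? 1 : 0;

print "string form matched: $string_matched\n";
print "qr form matched: $regex_matched, captured $1\n" if $regex_matched;
```

Only the qr// form finds the date, because the input contains no slash characters for the string form to match.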

You're new, so I'd recommend a few other changes.

Any non-trivial program should begin with the following front matter:

#! /usr/bin/env perl

use strict;
use warnings;

The strict pragma has nice benefits such as catching misspelled variable names at compile time and checking your use of references. The warnings pragma turns on extra warning diagnostics that can alert you to questionable cases in your code.
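As a rough illustration of the misspelled-variable check, the sketch below compiles a small snippet at run time with a string eval, purely so the compile-time error can be caught and printed instead of stopping the script. The misspelled name $inputstrng is invented for the example.

```perl
use strict;
use warnings;

# The snippet is held in a string so that its compile error lands in $@
# rather than aborting this script.
my $ok = eval q{
    use strict;
    my $inputstring = 'abc';
    $inputstrng = 'typo';   # misspelled and never declared with my
    1;
};

# eval returns undef when the snippet fails to compile.
print "compile failed: $@" unless $ok;
```

Without use strict, the same snippet would quietly create a new global variable and the typo would go unnoticed.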

With strict enabled, you must now declare your variables with my:

my $inputstring = 'Lorem ipsum dolor Date: 20110131 quis semper egestas.';

my %myregexhash = ( Date => qr/([12][09][0-9][0-9][0-1][0-2][0-9][0-9])/ );

my @foundvaluesarray=();

A newly declared array or hash starts out empty, so the = () is redundant and you won't see it in idiomatic Perl.

You don't want to use $& if you can help it because, at least on perls older than 5.20, it slows down your entire program. The Perl documentation warns:

WARNING: Once Perl sees that you need one of $&, $`, or $' anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program. Perl uses the same mechanism to produce $1, $2, etc., so you also pay a price for each pattern that contains capturing parentheses. (To avoid this cost while retaining the grouping behaviour, use the extended regular expression (?: ... ) instead.) But if you never use $&, $` or $', then patterns without capturing parentheses will not be penalized. So avoid $&, $', and $` if you can, but if you can't (and some algorithms really appreciate them), once you've used them once, use them at will, because you've already paid the price. As of 5.005, $& is not so costly as the other two.

Because you surrounded your pattern with parentheses, the substring that matched is captured in $1, so grab it from there.
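One way to pick up the captured value without touching $& is to run the match in list context, where it returns the captured groups directly. This is a small sketch using the same input and pattern as above; the names $date_re and $date are chosen only for this example.

```perl
use strict;
use warnings;

my $input   = 'Lorem ipsum dolor Date: 20110131 quis semper egestas.';
my $date_re = qr/([12][09][0-9][0-9][0-1][0-2][0-9][0-9])/;

# In list context a successful match returns its captures, so the value
# can be assigned straight to a variable. A failed match leaves it undef.
my ($date) = $input =~ $date_re;

print defined $date ? "Date: $date\n" : "no date found\n";
```

Reading $1 inside the if block, as in the loop further down, works just as well; the list assignment simply gives the value a name right away.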

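Also, chopping the matched text out of $inputstring is much more naturally expressed in Perl with s///. Note that s/// removes only the match itself, whereas assigning $' also discards everything before the match.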

while (my ($thefieldname, $theregex) = each (%myregexhash))
{
    if ($inputstring =~ s/$theregex//) 
    {
        push(@foundvaluesarray, "$thefieldname: $1\n");
    }
}

# Print the list itself; "@foundvaluesarray" in quotes would join elements with spaces.
print @foundvaluesarray;

Output:

Date: 20110131
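Putting the pieces together for more than one field and writing the results to a file, as you described, might look roughly like this. It is a sketch under several assumptions: the Id field, the sample input, and the output file name found_values.txt are all invented. The Date pattern is also tightened here, since the original [0-1][0-2] accepts only months 00, 01, 02, 10, 11 and 12 and would reject 03 through 09. Unlike the loop above, this version leaves $inputstring unmodified.

```perl
use strict;
use warnings;

my $inputstring = 'Lorem ipsum Date: 20110315 dolor Id: 4711 egestas.';

# Hypothetical field table; "Id" is made up for this example.
my %myregexhash = (
    Date => qr/\b((?:19|20)[0-9]{2}(?:0[1-9]|1[0-2])[0-9]{2})\b/,
    Id   => qr/Id:\s*([0-9]+)/,
);

my @foundvaluesarray;

# Hash order is not predictable, so sort the keys for stable output.
for my $thefieldname (sort keys %myregexhash) {
    if ($inputstring =~ $myregexhash{$thefieldname}) {
        push @foundvaluesarray, "$thefieldname: $1\n";
    }
}

# Assumed output file name; the three-argument open avoids mode surprises.
my $outfile = 'found_values.txt';
open my $out, '>', $outfile or die "Cannot open $outfile: $!";
print {$out} @foundvaluesarray;
close $out or die "Cannot close $outfile: $!";
```

Iterating over sorted keys instead of each also sidesteps the shared iterator that each keeps per hash, which can surprise you if a loop exits early.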
