Simple Perl Regex parser

Posted 2023-01-16 09:23

Hey, I am working on a very basic parser. I am almost certain that my regex is correct, but nothing seems to be stored in $1 and $2. Am I doing something wrong? I am just looking for tips to alter my code. Thanks for any advice! Also, I am new to Perl, so if I did something wrong, I would like to get off on the right foot and develop solid habits.

Sample line from file:

Sat 02-August-2008 20:47 - 123.112.3.209 - "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;

I am just getting the hours from the times.

foreach my $line (@lines)
{   
my $match =~ /\d\d-\w+-\d{4} (\d)(\d):\d\d/;

if( $1 == 0)
{
    $times[$2] = $times[$2] + 1;
}
else
{   
    my $time = $1.$2;
    $times[$time] = $times[$time]+ 1;
}
 }


print "\n";
for(my $i=0;$i<24;$i++)
{
print "$i: $times[$i]\n";
}

If you want to match on $line, shouldn't the code read:

$line =~ /\d\d-\w+-\d{4} (\d)(\d):\d\d/;

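To make the binding fix concrete, here is a minimal sketch of the whole tally loop with the pattern bound to $line. The second sample line is made up for illustration, and pre-filling @times with zeros is my own addition so that every hour prints a number.

```perl
use strict;
use warnings;

# First line is the sample from the question; the second is a made-up line
my @lines = (
    'Sat 02-August-2008 20:47 - 123.112.3.209 - "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;',
    'Sun 03-August-2008 08:05 - 10.0.0.1 - "ExampleAgent/1.0"',
);

my @times = (0) x 24;    # one counter per hour, pre-filled so none is undef

foreach my $line (@lines) {
    # =~ binds the pattern to $line, the variable that actually holds the text
    if ($line =~ /\d\d-\w+-\d{4} (\d)(\d):\d\d/) {
        $times[ $1 * 10 + $2 ]++;    # digits "2","0" give 20; "0","8" give 8
    }
}

print "$_: $times[$_]\n" for 0 .. 23;
```

Combining the two digits arithmetically also removes the need for the separate `$1 == 0` branch.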

Can you give an example of the kind of pattern you are trying to match? Otherwise I can't tell whether your regex matches it. That said, there are some improvements you can make to your code:

First off, always test whether a match succeeded before you use $1, $2, etc. After a failed match those variables still hold whatever the last successful match left in them:

if ($line =~ /\d\d-\w+-\d{4} (\d)(\d):\d\d/) {

    if( $1 == 0)
    {
        $times[$2] = $times[$2] + 1;
    }
    else
    {   
        my $time = $1.$2;
        $times[$time] = $times[$time]+ 1;
    }
} else {
    warn "no match!\n";
}
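One way to package that check is a small helper that returns the hour only when the match succeeds. This is just a sketch: hour_of is a hypothetical name, and capturing both digits in a single group is a simplification of the original pattern rather than anything the question requires.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the hour (0-23) found in a log line, or undef
sub hour_of {
    my ($line) = @_;
    # Only trust $1 inside the branch where the match succeeded
    if ($line =~ /\d\d-\w+-\d{4} (\d\d):\d\d/) {
        return $1 + 0;    # numeric context turns "08" into 8
    }
    return;               # undef in scalar context signals "no match"
}

my @times = (0) x 24;
for my $line ('Sat 02-August-2008 20:47 - 123.112.3.209', 'not a log line') {
    my $hour = hour_of($line);
    if (defined $hour) {
        $times[$hour]++;
    }
    else {
        warn "no match: $line\n";
    }
}

print "20: $times[20]\n";
```

Keeping the capture variables inside the helper means the calling code never sees a stale $1 from an earlier match.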

Second, always use the '-w' switch. In this case you will probably get a warning about uninitialized values, because the failed match leaves $1 and $2 unset:

#!/usr/bin/perl -w
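As a rough illustration of what warnings catch here, the sketch below uses the lexically scoped `use warnings` pragma (the usual alternative to -w) and records the warning raised when a never-assigned variable is bound to a pattern, as in the question. The $SIG{__WARN__} handler is only there to capture the message for display, and the exact wording of the warning may vary between Perl versions.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go straight to STDERR
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $match;    # declared but never assigned, like "my $match" in the question

# Binding an undefined value to a pattern warns, and the match fails
my $ok = $match =~ /(\d\d):\d\d/;

print "matched: ", ($ok ? "yes" : "no"), "\n";
print "warning: $_" for @warnings;
```

The message should mention an uninitialized value in a pattern match, which points directly at the wrong variable being used.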

First, if you are new to Perl: one of its strengths is CPAN and the many solutions there. Don't reinvent the wheel!

There is a great module called Date::Parse that will parse the time part for you. Then the only regex problem that you have is separating out the time part of your line.

Based on your one line sample, this code will do that:

use strict;
use warnings;

use Date::Parse;

my $line="Sat 02-August-2008 20:47 - 123.112.3.209 - \"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;";
my $tmpart;

if ($line=~ /^(.*\d+:\d+) -/) {
    $tmpart=$1;

    print "Time part = $tmpart\n";

    my $time=str2time($tmpart);
    my ($ss,$mm,$hh,$day,$month,$year,$zone) = strptime($tmpart);

    $year+=1900;
    $month+=1;

    print "Unix time: $time\n";
    print "Parsed time: $month/$day/$year $hh:$mm:$ss  \n\n";
} 
else {
   warn "no match!\n";
}

str2time returns a Unix time number (seconds since the epoch) that is then easy to work with. Or, as shown, strptime gives you the individual components of the time.
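Once you have Unix times, the per-hour tally from the question becomes a short loop. The sketch below does not call Date::Parse. Instead it builds stand-in timestamps with the core Time::Local module, in UTC so the result is repeatable, and the sample times other than 20:47 are made up. As far as I know, str2time assumes the local time zone when the string names none, so with real str2time output you would likely use localtime rather than gmtime to get back the hour as written in the log.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);    # core module, used here only to build sample data

# Stand-ins for values str2time() might return (month argument is 0-based)
my @epochs = (
    timegm(0, 47, 20, 2, 7, 2008),    # 02-August-2008 20:47:00 UTC
    timegm(0,  5, 20, 3, 7, 2008),    # 03-August-2008 20:05:00 UTC
    timegm(0, 30,  8, 3, 7, 2008),    # 03-August-2008 08:30:00 UTC
);

my @times = (0) x 24;
for my $epoch (@epochs) {
    my $hour = (gmtime($epoch))[2];    # element 2 of the gmtime list is the hour
    $times[$hour]++;
}

# Print only the hours that were seen
print "$_: $times[$_]\n" for grep { $times[$_] } 0 .. 23;
```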
