How to start writing a web log analyzer in Perl?
Taking information from a file that outputs entry after entry in this format: IPAddress x x [date:time -x] "method url httpversion" statuscode bytes "referer" "useragent"
How would you go about accessing that file as a command-line argument and storing that information so that you could arrange it alphabetically by the IP addresses while keeping all of the information together? I assume I would need to use hashes and arrays somehow.
You could theoretically pass as many text files as you want as command-line arguments, but so far I haven't gotten that part to work. I just have:
./logprocess.pl monster.log #monster.log is the file that contains entries
Then, in the code (assume all variables not shown have been declared as scalars):
my $x = 0;
my @hashstuff;
my $importPage = $ARGV[0];
my @pageFile = `$importPage`;
foreach my $line (@pageFile)
{
$ipaddy, $date, $time, $method, $url, $httpvers, $statuscode, $bytes, $referer, $useragent =~ m#(\d+.\d+.\d+.\d+) \S+ \S+ [(\d+/\S+/\d+):(\d+:\d+:\d+) \S+] "(\S+) (\S+) (\S+)" (\d+) (\d+) "(\S+)" "(\S+ \S+ \S+ \S+ \S+)"#
%info = ('ipaddy' => $ipaddy, 'date' => $date, 'time' => $time, 'method' => $method, 'url' => $url, 'httpvers' => $httpvers, 'statuscode' => $statuscode, 'bytes' => $bytes, 'referer' => $referer, 'useragent' => $useragent);
$hashstuff[$x] = %info;
$x++;
}
There is definitely a better way to do this, as my compiler says I have global symbol errors like:
Ambiguous use of % resolved as operator % at ./logprocess.pl line 51 (#2) (W ambiguous)(S) You said something that may not be interpreted the way you thought. Normally it's pretty easy to disambiguate it by supplying a missing quote, operator, parenthesis pair or declaration.
and it won't execute. I can't use any modules.
If the log is produced by Apache, you could use the Apache::ParseLog module. Look at the examples at the end of its documentation page for inspiration.
Regarding the error you mention, you should declare your array with my:
my @hashstuff;
and store hash references in it, since an array element can only hold a scalar. Also, a single element is accessed with $hashstuff[$x] (note the dollar sign at the beginning):
$hashstuff[$x] = { %info };
Or you can get rid of $x completely:
push @hashstuff, { %info };
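Putting the pieces together without any modules, here is a minimal sketch of one way the whole script could look. It is only an illustration under a few assumptions: the sub names parse_line, read_logs and sort_by_ip are made up for this example, the log lines are assumed to follow the format given in the question, and lines that do not match are silently skipped. It also works around some likely problems in the posted code: backticks run the file as a shell command instead of reading it, so open is used; the literal [ and ] in the pattern are escaped; the list of capture variables needs parentheses to receive the matches; and the line with the match has no terminating semicolon, which is probably why the %info on the next line is read as the modulus operator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern for one entry in the format described in the question:
# IP x x [date:time zone] "method url httpversion" status bytes "referer" "useragent"
my $LINE_RE = qr{
    ^(\d+\.\d+\.\d+\.\d+) \s+ \S+ \s+ \S+ \s+        # IP address, two ignored fields
    \[ ([^:\]]+) : (\d+:\d+:\d+) \s+ [^\]]+ \] \s+   # [date:time zone]
    "(\S+) \s+ (\S+) \s+ (\S+)" \s+                  # "method url httpversion"
    (\d+) \s+ (\d+|-) \s+                            # status code, bytes (may be a dash)
    "([^"]*)" \s+ "([^"]*)"                          # "referer" "useragent"
}x;

# Hypothetical helper: returns a hash reference for one log line,
# or nothing if the line does not match the pattern.
sub parse_line {
    my ($line) = @_;
    my @fields = $line =~ $LINE_RE or return;
    my %info;
    @info{qw(ipaddy date time method url httpvers
             statuscode bytes referer useragent)} = @fields;
    return \%info;
}

# Hypothetical helper: reads every file named in the argument list
# and returns a list of hash references, one per parsed line.
sub read_logs {
    my @files = @_;
    my @entries;
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            my $entry = parse_line($line) or next;   # skip unparsable lines
            push @entries, $entry;
        }
        close $fh;
    }
    return @entries;
}

# Sort entries alphabetically (string comparison) by IP address;
# each hash reference keeps the rest of its fields together.
sub sort_by_ip {
    return sort { $a->{ipaddy} cmp $b->{ipaddy} } @_;
}

# Every command-line argument is treated as a log file.
for my $entry (sort_by_ip(read_logs(@ARGV))) {
    print join(' ', @{$entry}{qw(ipaddy date time method url statuscode bytes)}), "\n";
}
```

It would be run as ./logprocess.pl monster.log other.log, with as many file names as you like. Note that cmp gives the alphabetical order asked for, so 10.0.0.1 sorts before 9.9.9.9; a numeric ordering would need the octets compared one by one.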