How to analyze log files with regexp? Alternatives?
I want to analyze some logs to get usage statistics. Basically, what I want to do is use regexp to ease the pain of the analysis.
So I have a text file with logs, something along these lines:
2011-09-17 09:16:33,531 INFO [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=search,
2011-09-17 09:16:33,976 INFO [someJavaB.class] fooBar
2011-09-17 09:16:33,982 DEBUG [someOtherJava.class] abc blabala
2011-09-17 09:16:33,987 INFO [someJava.class.special] sendRequest completed: fromGevoName=XYZ, toPageId=fooBar, userId=someUser
... I want to count the occurrences of all words at the position
[someJava.class.special] ctrlPageId=....
In this case that is fooBar, and only these occurrences count. There are many different values in place of fooBar, and I want to count how often each one occurred.
My idea was to match with a capturing group and repeat the pattern, something along these lines:
((?s).*\[someJava.class.special\] sendRequest: fromGevoname=.* ctrlPageId=([^,]*)(?-s).*)*
and replace it with the matching group \2
Afterwards I would analyse the list in Excel. But my grep tool does not repeat the regexp; it only matches once. I use grepWin. Is there maybe a different tool or regexp for this?
Well, it was basically a problem of wingrep or grepWin. The modifier (?s), which lets the dot match line breaks, and (?-s), which switches that off again, do not work if you use them repeatedly. So I replaced the regexp with something along these lines:
([\n-\[\(\]\.,:0-9a-zA-Z]).*\[someJava.class.special\] sendRequest: fromGevoname=.* ctrlPageId ([^,]*)(?-s).*
So basically I replaced the first dot, the one that had to match line breaks, with a character class containing all the symbols that might occur in the string, including line breaks. It works. I'm sure there is a better solution, and I'm always open to one.
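One possible alternative, sketched here in Python (my choice of language, not something used in this thread): if every log entry sits on its own line, as in the sample above, the pattern never has to cross a line break, so (?s) is not needed at all. The names PAGE_ID and extract_page_ids are made up for illustration.

```python
import re

# Assumes one log entry per line, as in the sample from the question.
# [^\n]* keeps each match on a single line, so no (?s) modifier is needed.
# re.MULTILINE lets ^ match at the start of every line.
PAGE_ID = re.compile(
    r"^[^\n]*\[someJava\.class\.special\] sendRequest: [^\n]*?ctrlPageId=([^,\n]*)",
    re.MULTILINE,
)

def extract_page_ids(log_text):
    """Return the ctrlPageId value of every matching sendRequest line, in order."""
    return PAGE_ID.findall(log_text)

sample = """\
2011-09-17 09:16:33,531 INFO [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=search,
2011-09-17 09:16:33,976 INFO [someJavaB.class] fooBar
2011-09-17 09:16:33,982 DEBUG [someOtherJava.class] abc blabala
2011-09-17 09:16:33,987 INFO [someJava.class.special] sendRequest completed: fromGevoName=XYZ, toPageId=fooBar, userId=someUser
"""

print(extract_page_ids(sample))  # ['fooBar']
```

The "sendRequest completed" line is skipped because the pattern requires the colon directly after sendRequest. The resulting list could be written to a file, one value per line, and pasted into Excel as originally planned.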
I'm not sure I understand, but if the output you are looking for is:
someJava fooBar
then something like this should work (PHP script):
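<?php
// Read the whole log into one string.
$log = file_get_contents('file.log');
// Capture the class name and the ctrlPageId value as named groups.
preg_match_all("#\[(?<className>\w+)\.class(\.special)?\](.*?)ctrlPageId=(?<controllerName>\w+)#i", $log, $m);
// Print one "className controllerName" pair per match.
for ($i = 0; $i < count($m[0]); $i++) {
    echo $m['className'][$i] . ' ' . $m['controllerName'][$i] . "\n";
}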
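To get from the matched pairs to actual counts without the Excel step, a tally could look roughly like this. It is a sketch in Python rather than PHP, with a named-group pattern similar to the PHP one above. The names PAIR, count_pairs, class_name and page_id are illustrative, and the last two sample lines are invented so that there is something to count.

```python
import re
from collections import Counter

# Named-group pattern similar to the PHP one; Python spells groups as (?P<name>...).
PAIR = re.compile(
    r"\[(?P<class_name>\w+)\.class(?:\.special)?\].*?ctrlPageId=(?P<page_id>\w+)",
    re.IGNORECASE,
)

def count_pairs(lines):
    """Count how often each (class name, ctrlPageId) pair occurs."""
    counts = Counter()
    for line in lines:
        match = PAIR.search(line)
        if match:
            counts[(match.group("class_name"), match.group("page_id"))] += 1
    return counts

lines = [
    # The first two lines come from the question's sample.
    "2011-09-17 09:16:33,531 INFO [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=search,",
    "2011-09-17 09:16:33,976 INFO [someJavaB.class] fooBar",
    # The next two lines are made up for this example.
    "2011-09-17 09:17:01,120 INFO [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=open,",
    "2011-09-17 09:17:05,431 INFO [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=otherPage, actionId=search,",
]

for (class_name, page_id), n in count_pairs(lines).most_common():
    print(class_name, page_id, n)
# someJava fooBar 2
# someJava otherPage 1
```

Note that this pattern, like the PHP one, matches any line that has a bracketed class followed by ctrlPageId=, not only the sendRequest lines; tighten it if other log lines also carry ctrlPageId.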