
PHP/regex to parse NGINX error log

The error entry looks like:

2011/06/10 13:30:10 [error] 23263#0: *1 directory index of "/var/www/ssl/" is forbidden, client: 86.186.86.232, server: hotelpublisher.com, request: "GET / HTTP/1.1", host: "hotelpublisher.com"

I need to parse:

date/time
error type
error message
client
server
request
host

The first part (parsing the date) is easy using substr. My regex skills are not great, though, and I am hoping to hear a better solution. Simply exploding on , won't work either, I guess, since the error message can potentially contain a comma as well.

What is the most efficient way to do this?


What about:

$str = '2011/06/10 13:30:10 [error] 23263#0: *1 directory index of "/var/www/ssl/" is forbidden, client: 86.186.86.232, server: hotelpublisher.com, request: "GET / HTTP/1.1", host: "hotelpublisher.com"';
preg_match('~^(?P<datetime>[\d/ :]+) \[(?P<errortype>.+)\] .*?: (?P<errormessage>.+), client: (?P<client>.+), server: (?P<server>.+), request: (?P<request>.+), host: (?P<host>.+)$~', $str, $matches);
print_r($matches);

output:

Array
(
    [0] => 2011/06/10 13:30:10 [error] 23263#0: *1 directory index of "/var/www/ssl/" is forbidden, client: 86.186.86.232, server: hotelpublisher.com, request: "GET / HTTP/1.1", host: "hotelpublisher.com"
    [datetime] => 2011/06/10 13:30:10
    [1] => 2011/06/10 13:30:10
    [errortype] => error
    [2] => error
    [errormessage] => *1 directory index of "/var/www/ssl/" is forbidden
    [3] => *1 directory index of "/var/www/ssl/" is forbidden
    [client] => 86.186.86.232
    [4] => 86.186.86.232
    [server] => hotelpublisher.com
    [5] => hotelpublisher.com
    [request] => "GET / HTTP/1.1"
    [6] => "GET / HTTP/1.1"
    [host] => "hotelpublisher.com"
    [7] => "hotelpublisher.com"
)
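
Since named groups also show up under numeric keys, the result may need a little cleanup. A minimal sketch, assuming PHP 5.6+ (for ARRAY_FILTER_USE_KEY) and a pattern with the same named groups; the $fields variable is just an illustrative name:

```php
<?php
// Sketch: keep only the named groups and strip the quotes nginx adds.
$str = '2011/06/10 13:30:10 [error] 23263#0: *1 directory index of "/var/www/ssl/" is forbidden, client: 86.186.86.232, server: hotelpublisher.com, request: "GET / HTTP/1.1", host: "hotelpublisher.com"';
$pattern = '~^(?P<datetime>[\d/ :]+) \[(?P<errortype>.+)\] .*?: (?P<errormessage>.+), client: (?P<client>.+), server: (?P<server>.+), request: (?P<request>.+), host: (?P<host>.+)$~';

$fields = array();
if (preg_match($pattern, $str, $matches) === 1) {
    // Named groups have string keys; their numeric duplicates have integer keys.
    $fields = array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
    // request and host are captured with their surrounding double quotes.
    $fields['request'] = trim($fields['request'], '"');
    $fields['host']    = trim($fields['host'], '"');
}
print_r($fields);
```

Checking the return value of preg_match first avoids undefined-index notices on lines that do not follow this layout.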


This is how I did it.

$error      = array();

$error['date']          = strtotime(substr($line, 0, 19));

$line                   = substr($line, 20);
$error_str              = explode(': ', strstr($line, ', client:', TRUE), 2);

$error['message']       = $error_str[1];

preg_match("|\[([a-z]+)\] (\d+)#(\d+)|", $error_str[0], $matches);

$error['error_type']    = $matches[1];


$args_str   = explode(', ', substr(strstr($line, ', client:'), 2));
$args       = array();

foreach($args_str as $a)
{
    $name_value = explode(': ', $a, 2);

    $args[$name_value[0]]   = trim($name_value[1], '"');
}

$error  = array_merge($error, $args);

die(var_dump( $error ));

Which will produce:

array(7) {
  ["date"]=>
  int(1307709010)
  ["message"]=>
  string(50) "*1 directory index of "/var/www/ssl/" is forbidden"
  ["error_type"]=>
  string(5) "error"
  ["client"]=>
  string(13) "86.186.86.232"
  ["server"]=>
  string(18) "hotelpublisher.com"
  ["request"]=>
  string(14) "GET / HTTP/1.1"
  ["host"]=>
  string(18) "hotelpublisher.com"
}

I would just like to see a few votes to learn which option is preferred in terms of performance and reliability.
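
For the performance side, a rough micro-benchmark along these lines could give a first impression. This is only a sketch: parseWithRegex(), parseWithExplode() and timeIt() are hypothetical helper names, the regex is one possible single-pattern variant, and the timings will vary with the PHP version and the shape of the log lines.

```php
<?php
// Sketch only: all function names here are hypothetical.
$line = '2011/06/10 13:30:10 [error] 23263#0: *1 directory index of "/var/www/ssl/" is forbidden, client: 86.186.86.232, server: hotelpublisher.com, request: "GET / HTTP/1.1", host: "hotelpublisher.com"';

// Single-regex approach with named groups.
function parseWithRegex($line)
{
    $pattern = '~^(?P<date>[\d/ :]+) \[(?P<error_type>\w+)\] [^:]*: (?P<message>.+), client: (?P<client>.+), server: (?P<server>.+), request: "(?P<request>.+)", host: "(?P<host>.+)"$~';
    return preg_match($pattern, $line, $m) === 1 ? $m : null;
}

// substr/explode approach, following the code in this answer.
function parseWithExplode($line)
{
    $error = array('date' => strtotime(substr($line, 0, 19)));
    $rest  = substr($line, 20);
    $head  = explode(': ', strstr($rest, ', client:', true), 2);
    $error['message'] = $head[1];
    preg_match('|\[([a-z]+)\]|', $head[0], $m);
    $error['error_type'] = $m[1];
    // Everything after the message is a list of "name: value" pairs.
    foreach (explode(', ', substr(strstr($rest, ', client:'), 2)) as $pair) {
        list($name, $value) = explode(': ', $pair, 2);
        $error[$name] = trim($value, '"');
    }
    return $error;
}

// Run a parser repeatedly and return the elapsed wall-clock seconds.
function timeIt($fn, $line, $iterations = 10000)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($fn, $line);
    }
    return microtime(true) - $start;
}

printf("regex:   %.4f s\n", timeIt('parseWithRegex', $line));
printf("explode: %.4f s\n", timeIt('parseWithExplode', $line));
```

On the reliability side, the key/value loop accepts any number of trailing fields, whereas a fixed pattern only matches lines that carry exactly the fields it names.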


Try this code:

$str = '2011/06/10 13:30:10 [error] 23263#0: *1 directory index of "/var/www/ssl/" is forbidden, client: 86.186.86.232, server: hotelpublisher.com, request: "GET / HTTP/1.1", host: "hotelpublisher.com"';
preg_match('~^(\d{4}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2})\s\[([^]]*)\]\s[^:]*:\s(.*?)\sclient:\s([^,]*),\sserver:\s([^,]*),\srequest:\s"([^"]*)",\shost:\s"([^"]*)"~', $str, $m );
list($line, $dateTime, $type, $msg, $client, $server, $request, $host ) = $m;

var_dump($dateTime);
var_dump($type);
var_dump($msg);
var_dump($client);
var_dump($server);
var_dump($request);
var_dump($host);

OUTPUT

string(19) "2011/06/10 13:30:10"
string(5) "error"
string(51) "*1 directory index of "/var/www/ssl/" is forbidden,"
string(13) "86.186.86.232"
string(18) "hotelpublisher.com"
string(14) "GET / HTTP/1.1"
string(18) "hotelpublisher.com"
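
Note that the message captured above keeps its trailing comma. A possible variant, sketched here under the assumption that the message is always followed by ", client:", moves the comma outside the group and checks whether the line matched at all:

```php
<?php
// Sketch: same layout as the pattern above, with the comma moved out of the message group.
$str = '2011/06/10 13:30:10 [error] 23263#0: *1 directory index of "/var/www/ssl/" is forbidden, client: 86.186.86.232, server: hotelpublisher.com, request: "GET / HTTP/1.1", host: "hotelpublisher.com"';
$pattern = '~^(\d{4}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2})\s\[([^]]*)\]\s[^:]*:\s(.*?),\sclient:\s([^,]*),\sserver:\s([^,]*),\srequest:\s"([^"]*)",\shost:\s"([^"]*)"~';

if (preg_match($pattern, $str, $m) === 1) {
    // Skip element 0 (the full match) and unpack the seven groups.
    list(, $dateTime, $type, $msg, $client, $server, $request, $host) = $m;
    var_dump($msg);
} else {
    echo "Line did not match\n";
}
```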


If you cannot change the format of the log file, this will do:

$regex = '~(\d{4}/\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}) \[(\w+)\] (.*?), client: (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}), server: (.*?), request: "(.*?)", host: "(.*?)"~';
preg_match($regex, $line, $matches);
list($all,$date,$time,$type,$message,$client,$server,$request,$host) = $matches;

If you do control how the log is formatted, put the message at the end instead of the middle and separate the fields with ', '. Then you can do:

$log_arr = explode(', ', $line, 8);
list($date,$time,$type,$client,$server,$request,$host,$message) = $log_arr;

The secret is that explode takes an optional third argument, limit, which caps the number of elements returned. With a limit of 8, the remainder of the line (commas included) is stored as the last element of the returned array. See the manual for more information on this.
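
To illustrate the limit argument, here is a small sketch. The log line is hypothetical: it assumes a custom format with fields separated by ', ' and the message last, which is not what nginx writes by default.

```php
<?php
// Sketch: hypothetical reformatted line, message last, fields separated by ', '.
$line = '2011/06/10, 13:30:10, error, 86.186.86.232, hotelpublisher.com, GET / HTTP/1.1, hotelpublisher.com, open() "/var/www/a, b.html" failed';

// A limit of 8 yields at most 8 elements; the last one holds the rest of the line.
$parts = explode(', ', $line, 8);

if (count($parts) === 8) {
    list($date, $time, $type, $client, $server, $request, $host, $message) = $parts;
    var_dump($message);
} else {
    echo "Unexpected number of fields\n";
}
```

The comma inside the message survives because explode stops splitting once the limit is reached.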


Please check out Nginx Error Log Reader, a PHP reader/parser for Nginx error log files. The script can read error logs recursively and display them in a user-friendly table. Its configuration includes the number of bytes to read per page, and it supports pagination through the error log.
