PHP regex that finds PHP errors
I want a PHP regex that can find PHP errors on a page, so that when I crawl a site I can list the errors it outputs.
Currently I have the following code:
preg_match('/<b>.+<\/b>:.+ in <b>\/.+<\/b> on line <b>[0-9]+<\/b><br( \/)?>/msi',$html,$errors);
It can tell me whether errors occurred, but it does not list them! I get the full HTML page in the array ($errors[0]).
Could anybody help?
EDIT: So I have a page with for example the following HTML-source, from which I want to extract the PHP errors:
<b>Warning</b>: session_start() [<a href='function.session-start'>function.session-start</a>]: The session id contains invalid characters, valid characters are only a-z, A-Z and 0-9 in <b>/home/.../public_html/articlescript/init.php</b> on line <b>127</b><br />
<br />
<b>Warning</b>: session_start() [<a href='function.session-start'>function.session-start</a>]: Cannot send session cache limiter - headers already sent (output started at /home/.../public_html/articlescript/init.php:127) in <b>/home/.../public_html/articlescript/init.php</b> on line <b>127</b><br />
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>...
Since – well, you know – you shouldn’t use regular expressions to parse HTML, try this using PHP’s DOM library:
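// Collect libxml parse errors internally instead of emitting them as PHP warnings.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($str);
$messages = array();
foreach ($doc->getElementsByTagName('b') as $elem) {
    // Only B elements holding an error level start a message;
    // extend this list as needed (e.g. 'Fatal error').
    if (in_array($elem->textContent, array('Error', 'Warning', 'Notice'))) {
        $buffer = $elem->textContent;
        // Append the text of each following sibling until the next BR element.
        // nodeName is used because it is never null ('#text' for text nodes).
        while ($elem->nextSibling !== null && strtolower($elem->nextSibling->nodeName) !== 'br') {
            $elem = $elem->nextSibling;
            $buffer .= $elem->textContent;
        }
        $messages[] = $buffer;
    }
}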
This searches for B elements whose content is one of “Error”, “Warning”, or “Notice” and collects the text from there up to the next BR element. The initial call to libxml_use_internal_errors prevents parsing errors from being reported.
Forgive my language but it's quite foolish to attempt to parse HTML with regular expressions, especially potentially-malformed HTML. Use an HTML parsing library instead.
For HTML parsing and validation in PHP, I would refer to this answer; also check out the tidy extension.
Remember to escape your \ in strings.
preg_match_all('#<b>(.+?)</b>:(.+?) in <b>(.+?)</b> on line <b>([0-9]+)</b><br(?: /)?>#is',$string,$errors);
This code on ideone
Put brackets () around the bits of the regex that you want to be stored in $errors. You'll also want to use preg_match_all() rather than preg_match().
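As a rough illustration of the pattern above, here is a minimal sketch that runs it against a made-up sample and turns each match into a small array. The sample HTML, the /home/example path and the key names ('level', 'message', 'file', 'line') are assumptions for the example, not part of the original page. The lazy quantifiers (.+?) keep each match from running on into the next error, and PREG_SET_ORDER groups the captures per match rather than per group.

```php
<?php
// Hypothetical input modelled on the HTML in the question (paths are made up).
$html = "<b>Warning</b>: session_start(): first problem in <b>/home/example/init.php</b> on line <b>127</b><br />\n"
      . "<br />\n"
      . "<b>Notice</b>: Undefined variable: foo in <b>/home/example/init.php</b> on line <b>130</b><br />\n"
      . "<html><head><title>Page</title></head><body></body></html>";

// Same pattern as above: four capture groups for level, message, file and line.
$pattern = '#<b>(.+?)</b>:(.+?) in <b>(.+?)</b> on line <b>([0-9]+)</b><br(?: /)?>#is';

// PREG_SET_ORDER yields one array per error, with the groups at indexes 1..4.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$errors = array();
foreach ($matches as $m) {
    $errors[] = array(
        'level'   => $m[1],
        'message' => trim($m[2]),
        'file'    => $m[3],
        'line'    => (int) $m[4],
    );
}

print_r($errors);
```

Each entry of $errors then holds one error from the page, so the list can be looped over or counted directly.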
If this is your own website, you can either set the log levels and parse your log files (easier), or run your scripts from the command line with php -l (note that php -l only checks for syntax errors).
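For the log-file route, a minimal sketch could look like the following. It assumes errors are being logged (for instance via the error_reporting, log_errors and error_log settings) and that the log lines have the shape "[timestamp] PHP Level:  message in file on line N". The sample lines, paths and array keys are made up for illustration, and your log format may differ.

```php
<?php
// Assumed sample lines in the "[timestamp] PHP Level:  message in file on line N" shape;
// the paths and messages are hypothetical.
$log = "[10-Mar-2011 12:00:01 UTC] PHP Warning:  session_start(): Cannot send session cache limiter in /var/www/example/init.php on line 127\n"
     . "[10-Mar-2011 12:00:02 UTC] PHP Notice:  Undefined index: id in /var/www/example/view.php on line 12\n";

// The m flag lets ^ and $ anchor each log line separately.
$pattern = '/^\[(.+?)\] PHP ([A-Za-z ]+):\s+(.+) in (\S+) on line (\d+)$/m';
preg_match_all($pattern, $log, $matches, PREG_SET_ORDER);

$entries = array();
foreach ($matches as $m) {
    $entries[] = array(
        'time'    => $m[1],
        'level'   => $m[2],
        'message' => $m[3],
        'file'    => $m[4],
        'line'    => (int) $m[5],
    );
}

print_r($entries);
```

Since the log is plain text rather than HTML, a regular expression is a reasonable fit here; in practice $log would come from reading the log file instead of a hard-coded string.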