What do these regular expressions mean?
preg_match( '/<title>(.*)<\/title>/',.....)
preg_match("/src=[\"']?([^\"']?.*(png|jpg|gif))[\"']?/i",....)
The first extracts the contents of an HTML title tag.
The second extracts the src attributes of images from an HTML document, but it is very imperfect: it won't catch references to image resources that end in .jpeg or have no extension at all.
Regular expressions are not a good idea for parsing HTML; they are far from foolproof. Use an HTML parser instead.
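As a rough illustration of the parser route, here is a minimal sketch using PHP's built-in DOMDocument class (it assumes the DOM extension is enabled). The sample markup and the variable names are made up for the example; they are not from the question.

```php
<?php
// Hypothetical sample markup, used only for this sketch.
$html = '<html><head><title>foo</title></head>'
      . '<body><img src="a.png"><img src="b.jpeg"></body></html>';

// Collect libxml parse warnings internally instead of emitting them,
// since real-world HTML is often not well-formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Page title: text content of the first <title> element, if any.
$titleNodes = $doc->getElementsByTagName('title');
$title = $titleNodes->length > 0 ? $titleNodes->item(0)->textContent : null;

// Image sources: the src attribute of every <img>, whatever its extension.
$sources = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    if ($img->hasAttribute('src')) {
        $sources[] = $img->getAttribute('src');
    }
}

echo $title, PHP_EOL;               // foo
echo implode(', ', $sources), PHP_EOL; // a.png, b.jpeg
```

Unlike the regex, this picks up b.jpeg as well, because it reads the attribute rather than guessing from the file extension.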
1) Matches anything between <title> and </title>, i.e. an HTML page's title, so running it against <title>foo</title> captures foo.
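To make that concrete, here is a small sketch of the first pattern run against a made-up input string. The variable names are only for illustration.

```php
<?php
// Hypothetical input, used only for this sketch.
$html = '<html><head><title>foo</title></head><body></body></html>';

// (.*) is a greedy capture group: it holds whatever sits between the tags.
if (preg_match('/<title>(.*)<\/title>/', $html, $matches)) {
    // $matches[0] is the whole match, $matches[1] the first capture group.
    $title = $matches[1];
} else {
    $title = null;
}

echo $title, PHP_EOL; // foo
```

Note that . does not match newlines unless the s modifier is added, so a title that spans several lines would not be matched by this pattern as written.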
2) Matches any string following src= that ends in png, jpg or gif. Used to extract the URLs of images in HTML code.
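A short sketch of the second pattern in use, with one input it handles and one it misses. The helper name first_image_src and both sample tags are hypothetical, chosen only to illustrate the behaviour.

```php
<?php
// Hypothetical helper: returns the first captured image path, or null.
function first_image_src(string $html): ?string
{
    // Same pattern as in the question; the i modifier makes it case-insensitive.
    $pattern = "/src=[\"']?([^\"']?.*(png|jpg|gif))[\"']?/i";

    if (preg_match($pattern, $html, $matches)) {
        return $matches[1]; // group 1: the path up to and including the extension
    }
    return null;
}

// A .png source is captured.
var_dump(first_image_src('<img src="images/logo.png" alt="Logo">'));
// string(15) "images/logo.png"

// A .jpeg source is missed, because "jpeg" contains none of png, jpg or gif.
var_dump(first_image_src('<img src="photo.jpeg">'));
// NULL
```

Because .* is greedy, a line containing several images would yield one long capture running up to the last matching extension, which is one more reason to treat the pattern as fragile.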
Per @Pekka's answer: don't do this in real-world code.