Can't get a preg_match right!
I have this image tag inside an HTML page.
<img id="catImage" width="250" alt="" src="http://dev-server2/image2.png" />
I want to get the value of src and am not getting along with preg_match and all of this regex stuff. Is this one right?
preg_match(
"/<img id=\"catImage\" width=\"[0-9]+\" alt=\"\" src=\"([[a-zA-Z0-9]\/-._]*)\"/",
$artist_page["content"], $matches);
I get an empty array!
First and foremost, the portion of your regex that deals with the src attribute doesn't account for the colon that appears in the URL.
I'd suggest changing the src portion (and any other attribute values) to look instead for the close quote and capture everything between:
... src=\"([^\"]*)\" ....
Does this work?
'/<img id="catImage"[^>]+src="([^"]*)"/'
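To check the pattern above against the tag from the question, a minimal sketch could look like the following. The variable names ($html, $pattern, $src) are made up for illustration, and the sample string stands in for $artist_page["content"].

```php
<?php
// Sample input copied from the question; $html is a hypothetical stand-in
// for $artist_page["content"].
$html = '<img id="catImage" width="250" alt="" src="http://dev-server2/image2.png" />';

// [^>]+ skips whatever attributes sit between id and src without leaving the tag.
$pattern = '/<img id="catImage"[^>]+src="([^"]*)"/';

if (preg_match($pattern, $html, $matches) === 1) {
    // $matches[0] is the whole match, $matches[1] is the first capture group.
    $src = $matches[1];
} else {
    $src = null;
}

echo $src, PHP_EOL;
```

Single quotes around the pattern avoid having to escape every double quote, which makes the regex easier to read than the double-quoted version.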
I'm still really new to regex, but I thought I would throw my thoughts out there and get some criticism. Should the expression be something like (?<=(src=")).*(?=["])? (Not quite PHP formatted yet.) This would grab the contents of the src attribute.
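As one piece of criticism: the greedy .* runs to the last quote on the line, so the lookaround version only works when src happens to be the final attribute. The sketch below compares a [^"]* variant with the greedy one. The reordered tag and all variable names are assumptions made up for the demonstration.

```php
<?php
$html = '<img id="catImage" width="250" alt="" src="http://dev-server2/image2.png" />';

// The lookbehind requires src=" before the match; [^"]* stops at the closing quote.
preg_match('/(?<=src=")[^"]*(?=")/', $html, $m);
$lookSrc = $m[0];

// Hypothetical tag with src first, to show where the greedy version overshoots.
$reordered = '<img src="http://dev-server2/image2.png" id="catImage" alt="" />';
preg_match('/(?<=src=").*(?=")/', $reordered, $g);
$greedy = $g[0];

var_dump($lookSrc);
var_dump($greedy);
```

With lookarounds the value ends up in index 0 of the match array, because the quotes are asserted but never consumed.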
"/<img id=\"catImage\" width=\"[0-9]+\" alt=\"\" src=\"([a-zA-Z0-9/.:_-]*)\"/"
Should do. Note that I edited the range [ ... ]
part. The hyphen (-
) has a special meaning so I put it last to add it as a literal in the range. Also, I added the :
char (thanks @user333699). This hints, however, that you should not try and think of any valid URL character. Instead, match anything until you know that the entire value of the src
attribute is matched:
"/<img id=\"catImage\" width=\"[0-9]+\" alt=\"\" src=\"([^\"]*)\"/"
I.e., anything that is not a quote (").
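The two variants can be compared side by side. This is only a sketch: it swaps the delimiter to # (my choice, not part of the answer) so the slash needs no escaping, and the variable names are made up.

```php
<?php
$html = '<img id="catImage" width="250" alt="" src="http://dev-server2/image2.png" />';

// With # as the delimiter, / needs no escaping inside the pattern.
// The hyphen comes last in the class, so it is a literal, not a range operator.
$explicit = '#<img id="catImage" width="[0-9]+" alt="" src="([a-zA-Z0-9/.:_-]*)"#';

// Negated class: capture everything up to the closing quote.
$negated = '#<img id="catImage" width="[0-9]+" alt="" src="([^"]*)"#';

preg_match($explicit, $html, $a);
preg_match($negated, $html, $b);

$explicitSrc = $a[1] ?? null;
$negatedSrc = $b[1] ?? null;

var_dump($explicitSrc === $negatedSrc);
```

Both patterns agree on this input, but only the negated class keeps working if the URL later gains a character such as ? or %.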
Note that $matches[0] is going to hold the entire matched tag, not just the URL. The value of src is in the first capture group, so read it from $matches[1] after the preg_match.
It might be worth diving into XPath, depending on what you really want to do with it.
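For comparison, a minimal XPath sketch using PHP's bundled DOM extension might look like this. The wrapper markup and variable names are assumptions. In practice the input would be the full page content.

```php
<?php
// Hypothetical page fragment wrapping the tag from the question.
$html = '<html><body><img id="catImage" width="250" alt="" '
      . 'src="http://dev-server2/image2.png" /></body></html>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them;
// real-world HTML is rarely perfectly valid.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// string(...) makes evaluate() return the attribute value as a string,
// or an empty string when no matching node exists.
$xpathSrc = $xpath->evaluate('string(//img[@id="catImage"]/@src)');

echo $xpathSrc, PHP_EOL;
```

Unlike the regex, this does not depend on attribute order or on the exact spacing inside the tag.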