Quick help with regex in PHP
I'm not proficient in regex at all, but I need to extract IDs from URLs that appear in a large block of text.
The URLs look like this:
domain.com/path/ID_GOES_HERE
The problem is that the URLs are inside emails, which come in a wide variety of formats, such as:
- <a href="http://www.domain.com/path/ID_GOES_HERE">http://www.domain.com/path/ID_GOES_HERE</a>
- www.domain.com/path/ID_GOES_HERE
- http://domain.com/path/ID_GOES
_HERE
The ID is letters and numbers only. No other characters of any kind.
EDIT: Another issue is that, since I'm processing emails, which are horribly formatted, the URL sometimes lands at the end of a line and gets broken across two lines, which puts an equals sign at the end of the first line, like so:
http://www.domain.com/path/EE33FDE291A=
8D972
So the ID gets deformed.
This should do what you need:
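<?php
$matches = array();
// Match the ID after domain.com/path/, allowing "=" + line break (a soft line wrap) inside it.
// \r? also covers Windows-style (CRLF) line endings.
preg_match_all('@domain\.com/path/((?:[a-z0-9_]|=\r?\n)*)@i', $subject, $matches);
foreach ($matches[1] as $id) {
    // Remove the soft line breaks so the ID is whole again.
    $id = str_replace(array("=\r\n", "=\n"), '', $id);
    // Do your processing here.
}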
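To see the approach above on concrete input, here is a minimal self-contained sketch. The function name extract_ids and the sample text are made up for illustration, and it assumes the ID is letters and digits only, as stated in the question.

```php
<?php
// Hypothetical helper: returns every ID found after domain.com/path/ in $text.
function extract_ids(string $text): array
{
    $matches = array();
    // An ID is letters and digits; "=" plus a line break is treated as a soft line wrap.
    preg_match_all('@domain\.com/path/((?:[a-z0-9]|=\r?\n)+)@i', $text, $matches);

    $ids = array();
    foreach ($matches[1] as $raw) {
        // Remove the soft line breaks so a wrapped ID is whole again.
        $ids[] = str_replace(array("=\r\n", "=\n"), '', $raw);
    }
    return $ids;
}

// Made-up sample input: an HTML link, a bare URL, and a URL wrapped across two lines.
$sample = '<a href="http://www.domain.com/path/AB12">http://www.domain.com/path/AB12</a>' . "\n"
    . "www.domain.com/path/CD34\n"
    . "http://www.domain.com/path/EE33FDE291A=\n8D972\n";

print_r(extract_ids($sample));
```

With this sample the wrapped URL yields EE33FDE291A8D972. The HTML link yields AB12 twice (once from the href, once from the link text), so you may want to pass the result through array_unique().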
$matches = array();
// No ^/$ anchors, so the URL can appear anywhere in the text.
if (preg_match('/domain\.com\/path\/([a-zA-Z0-9]+)/', $text, $matches)) {
    echo $matches[1];
}
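Try this regex:
/(?:https?:\/\/)?(?:www\.)?domain\.com\/path\/([\d\w]+(?:=?(?:\r\n|[\r\n])(?:[\d\w]+)?)?)/
It seems to match all of your test cases. Note that the captured group still contains the "=" and the line break when the URL was wrapped, so strip those afterwards.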
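Another option, if the emails are quoted-printable encoded (a trailing "=" at the end of a line is that encoding's soft line break), is to decode the body first with PHP's built-in quoted_printable_decode() and then match a plain ID. This is only a rough sketch under that assumption; the function name extract_ids_decoded and the sample body are hypothetical.

```php
<?php
// Hypothetical helper: decode quoted-printable first, then match plain IDs.
// Assumes $body really is quoted-printable; other text containing "=XX" would be altered.
function extract_ids_decoded(string $body): array
{
    // quoted_printable_decode() removes "=" soft line breaks and turns
    // sequences such as "=3D" back into "=".
    $decoded = quoted_printable_decode($body);

    $matches = array();
    preg_match_all('@domain\.com/path/([a-z0-9]+)@i', $decoded, $matches);

    // array_unique() collapses the href/link-text duplicate; array_values() reindexes.
    return array_values(array_unique($matches[1]));
}

// Made-up quoted-printable body: an HTML link and a URL wrapped across two lines.
$body = "<a href=3D\"http://www.domain.com/path/AB12\">http://www.domain.com/path/AB12</a>\r\n"
    . "http://www.domain.com/path/EE33FDE291A=\r\n8D972\r\n";

print_r(extract_ids_decoded($body));
```

Decoding first keeps the regex simple, since the ID no longer needs to allow for "=" and line breaks in the middle.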