Problem with Regular Expression URL matching for http://localhost/
I am trying to use this project on GitHub, https://github.com/ErisDS/Migrate, to migrate the URL settings in my WordPress database from a localhost dev install to a live URL.
At the moment the code throws an error for the URL to be replaced, "http://localhost/mysitename", but does accept the new URL, "http://www.mywebsitename.com".
From what I can tell, the error comes from this regular expression not accepting localhost as a valid URL. Any ideas how I can update it to accept the localhost URL?
The full code can be viewed on GitHub.
function checkURL($url)
{
    $url_regex = '/^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$/';
    if($url == 'http://')
    {
        return false;
    }
    return preg_match($url_regex, $url);
}
You can add "localhost" to the acceptable hostnames by changing it to this:
/^(http\:\/\/(?:[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}|localhost)(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$/
This part matches the http:// prefix:
http\:\/\/
And this part matches the hostname:
[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}
So you can just change the hostname checker to a non-capturing alternation group that explicitly includes "localhost":
(?:X|localhost)
where X is the existing hostname-matching sub-expression. The (?: bit starts a non-capturing group; using a non-capturing group ensures that any group number references won't get messed up.
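To make the substitution concrete, here is a minimal sketch that assembles the pattern from its parts. The function name checkUrlAllowingLocalhost and the $host/$rest variables are my own for illustration and are not part of the Migrate project; the sub-patterns themselves are copied from the regex above.

```php
<?php
// Hypothetical helper (not from the Migrate project): same pattern as above,
// built from pieces so the (?:X|localhost) substitution is visible.
function checkUrlAllowingLocalhost($url)
{
    // X: the original hostname sub-expression (dotted name ending in a 2-4 letter TLD).
    $host = '[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}';

    // Everything after the hostname in the original pattern, unchanged:
    // path segments, an optional file name with query string, extra parameters.
    $rest = '(?:\/[a-zA-Z0-9_]+)*'
          . '(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?'
          . '(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*';

    // Non-capturing alternation: either the original hostname or the literal "localhost".
    $url_regex = '/^(http\:\/\/(?:' . $host . '|localhost)' . $rest . ')$/';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($url_regex, $url) === 1;
}
```

With this sketch, both 'http://localhost/mysitename' and 'http://www.mywebsitename.com' are accepted. A localhost URL with a port, such as 'http://localhost:8080/x', is still rejected, because nothing in the pattern allows a colon after the hostname.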
And some live examples: http://ideone.com/M0qqh
I think a simpler regex might serve you better though, since that one doesn't deal with CGI parameters very well. You could try this:
/(http:\/\/[^\/]+\/([^\s]+[^,.?!:;])?)/
and see if it works for your data. That one is pretty loose but probably good enough for a one-shot conversion. Note that it requires a slash after the hostname, so it matches the URL in the first of these but not the bare http://example.com in the second:
'here is a URL http://localhost/x?a=b.'
'More http://example.com nonsense!.'
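A quick way to check the loose pattern against your own data is to run it over a few sample strings first. The sketch below does that for the two strings above plus a variant with a trailing slash; the helper name firstUrl is hypothetical and only exists for this illustration.

```php
<?php
// Loose pattern copied from the answer above.
$loose = '/(http:\/\/[^\/]+\/([^\s]+[^,.?!:;])?)/';

// Hypothetical helper: returns the first URL found in $text, or null if there is none.
function firstUrl($pattern, $text)
{
    return preg_match($pattern, $text, $m) === 1 ? $m[1] : null;
}

// Trailing punctuation is left out of the match.
$a = firstUrl($loose, 'here is a URL http://localhost/x?a=b.');

// No slash after the hostname, so the pattern finds nothing here.
$b = firstUrl($loose, 'More http://example.com nonsense!.');

// With a slash after the hostname the URL is picked up.
$c = firstUrl($loose, 'More http://example.com/ nonsense!.');

var_dump($a, $b, $c);
```

If your database stores site URLs without a trailing slash, the pattern would need adjusting before it is used for the conversion.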
You could also try Joseph's from the comment.
It's not working because somewhere in the regex you are asking for a dot in between http:// and /. http://localhost/whatever has no dot, so it fails.
You really should be using something like filter_var() or parse_url() instead of regexes for URL validation.
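As a rough sketch of that approach, the function below combines the two built-ins. The name isValidHttpUrl and the decision to allow only http and https are my own assumptions, not something taken from the Migrate project.

```php
<?php
// Hypothetical replacement for a regex-based check, using PHP's built-ins.
function isValidHttpUrl($url)
{
    // FILTER_VALIDATE_URL returns the URL on success and false on failure.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // filter_var() accepts other schemes too (ftp, mailto, ...), so restrict
    // to http/https explicitly. This restriction is an assumption of the sketch.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, array('http', 'https'), true);
}

// parse_url() can also pull out individual components for inspection.
$host = parse_url('http://localhost/mysitename', PHP_URL_HOST);
```

This accepts 'http://localhost/mysitename' without any special casing, because the validation filter does not require a dot in the hostname.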
The author of the Migrate scripts has updated the project on GitHub to include this fix. Thanks for your help, guys.
https://github.com/ErisDS/Migrate