Remove entire word if the word contains specific string
I would like to do the following, preferably with PHP:
Remove an entire word if a part of the word contains a specific string. This should be case insensitive and work multiple times, e.g. on a large text.
Pseudo-code:
match = "www."
lots_of_random_text = "... hello and welcome to www.stackoverflow.com! blah blah"
result = magic_function(lots_of_random_text, "www.")
result should now be equal to: "... hello and welcome to blah blah"
What is the most efficient way to do this?
It seems that a regular expression would suit this task. Check out the docs for preg_replace to start with, or the main PCRE docs for a complete overview.
php> $text="hello and welcome to www.stackoverflow.com snout pickle and while you're here, check out a unicorn at www.unicornmagicfairywonderland.net!";
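php> $cleaned_text=preg_replace('#www\.\w+\.(com|net|org)#i','',$text);
php> echo $cleaned_text;
hello and welcome to  snout pickle and while you're here, check out a unicorn at !
The key part is the '#www\.\w+\.(com|net|org)#i'. That means: match any string that starts with www., continues with one or more word characters (letters, digits or underscores), and ends with .com, .net or .org. The trailing i makes the match case insensitive, as the question asks. Note that only the match itself is removed, so the spaces and punctuation around it stay behind.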
If you're trying to replace any URL, the expression is going to be much more complex than this, so be warned this is incomplete. You'd want to make sure it matches words that start with http://, have no www. or have a different subdomain, and end with other domains like .co.uk or .edu, right?
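If the goal is the one stated in the question (drop any word that contains a given string, whatever the rest of the word looks like), a more general pattern may be simpler than trying to describe URLs. The following is only a minimal sketch: the function name remove_words_containing is made up for illustration, and it assumes a "word" is any run of non-whitespace characters.

```php
<?php
// Hypothetical helper (not part of the answer above): removes every
// whitespace-delimited word that contains $needle, ignoring case.
function remove_words_containing(string $text, string $needle): string
{
    // An empty needle would match every word, so leave the text alone.
    if ($needle === '') {
        return $text;
    }

    // preg_quote() escapes regex metacharacters such as the dot in "www.".
    $quoted = preg_quote($needle, '/');

    // \S* on both sides widens the match to the whole word;
    // \s? also consumes one trailing whitespace character, if present.
    $pattern = '/\S*' . $quoted . '\S*\s?/i';

    // preg_replace() returns null on error; fall back to the original text.
    return preg_replace($pattern, '', $text) ?? $text;
}

$text = "... hello and welcome to www.stackoverflow.com! blah blah";
echo remove_words_containing($text, "www.");
// prints: ... hello and welcome to blah blah
```

Because a word is taken to be everything between whitespace, punctuation attached to the match (the "!" above) is removed as well, which is what the question's expected result shows. A match at the very end of the text leaves the space before it in place, so you may want to trim() the result.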
Regular expressions are, in general, complex and tough to get right. You may find www.regular-expressions.info helpful.