
How to remove email addresses and links from a string in PHP?

How do I remove all email addresses and links from a string and replace them with "[removed]"?


You can use preg_replace to do it.

for emails:

$pattern = "/[^@\s]*@[^@\s]*\.[^@\s]*/";
$replacement = "[removed]";
// preg_replace returns the new string; it does not modify $string in place
$string = preg_replace($pattern, $replacement, $string);

for urls:

$pattern = "/[a-zA-Z]*[:\/\/]*[A-Za-z0-9\-_]+\.+[A-Za-z0-9\.\/%&=\?\-_]+/i";
$replacement = "[removed]";
// preg_replace returns the new string; it does not modify $string in place
$string = preg_replace($pattern, $replacement, $string);

Resources

PHP manual entry: http://php.net/manual/en/function.preg-replace.php

Credit where credit is due: email regex taken from preg_match manpage, and URL regex taken from: http://www.weberdev.com/get_example-4227.html


Try this:

$patterns = array('<[\w.]+@[\w.]+>', '<\w{3,6}:(?:(?://)|(?:\\\\))[^\s]+>');
$matches = array('[email removed]', '[link removed]');
$newString = preg_replace($patterns, $matches, $stringToBeMatched);

Note: you can pass an array of patterns and an array of replacements to preg_replace instead of running it twice.
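
As a quick check of the array form, here is a minimal sketch that runs the two patterns above on a made-up sample string. The input text and the $sample variable are assumptions for illustration, not part of the answer:

```php
<?php
// Patterns and replacements from the answer above; '<' and '>' act as regex delimiters.
$patterns = array('<[\w.]+@[\w.]+>', '<\w{3,6}:(?:(?://)|(?:\\\\))[^\s]+>');
$matches = array('[email removed]', '[link removed]');

// Hypothetical sample input.
$sample = 'Write to john.doe@example.com or visit http://example.com/page now';

// The patterns are applied in order: emails first, then links.
$newString = preg_replace($patterns, $matches, $sample);

echo $newString, "\n"; // Write to [email removed] or visit [link removed] now
```

Each pattern is paired with the replacement at the same position in the second array, so the two arrays have to be kept in the same order.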


The answer I was going to upvote was deleted. It linked to a Linux Journal article, "Validate an E-Mail Address with PHP, the Right Way", that points out what's wrong with almost every email regex anyone proposes.

The range of valid forms of an email address is much broader than most people think.


My answer is a variation of Josiah's /[^@\s]*@[^@\s]*\.[^@\s]*/ for emails, which works fine but also matches any punctuation directly after the email address itself: demo 1

Adapt the regex to /[^@\s]*@[^@\s\.]*\.[^@\s\.,!?]*/ so that a trailing . , ! or ? is left out of the match: demo 2
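
To make the difference concrete, here is a small sketch comparing both patterns on a made-up sentence (the sample text and variable names are assumptions for illustration):

```php
<?php
// Hypothetical sample sentence with a comma right after the address.
$text = 'Mail me at test@mail.com, thanks.';

$original = '/[^@\s]*@[^@\s]*\.[^@\s]*/';        // Josiah's pattern
$adapted  = '/[^@\s]*@[^@\s\.]*\.[^@\s\.,!?]*/'; // adapted pattern

$withOriginal = preg_replace($original, '[removed]', $text);
$withAdapted  = preg_replace($adapted, '[removed]', $text);

echo $withOriginal, "\n"; // Mail me at [removed] thanks.
echo $withAdapted, "\n";  // Mail me at [removed], thanks.
```

One trade-off to be aware of: because the adapted pattern also excludes dots after the first domain label, an address such as a@mail.example.com would only be matched up to a@mail.example, leaving ".com" behind.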


Many characters are valid in the local part of an email address (see "What characters are allowed in an email address?"), so these lines take them into account when replacing email addresses:

<?php
$c='a-zA-Z-_0-9'; // allowed characters in domainpart
$la=preg_quote('!#$%&\'*+-/=?^_`{|}~', "/"); // additional allowed in first localpart
$email="[$c$la][$c$la\.]*[^.]@[$c]+\.[$c]+";
$t = preg_replace("/\b($email)\b/", '[removed]', $t);
// or with a link:
$t = preg_replace("/\b($email)\b/", '<a href="mailto:\1">\1</a>', $t);

# replace urls:
$a='A-Za-z0-9\-_'; // allowed characters in host names
$t = preg_replace("/[htpsftp]+[:\/\/]+[$a]+\.+[$a\.\/%&;+~=\?#]+/i", '[removed]', $t);

This covers most valid email addresses. Be aware that matching all valid email addresses, and only those, is more complex (see "How can I validate an email address using a regular expression?").


Pattern for email (thanks to @bromelio)

"/[^@\s]*@[^@\s\.]*\.[^@\s\.,!?]*/"

Pattern for URL

"#((?:https?|ftp)://\S+[[:alnum:]]/?)#si"

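A minimal sketch of how these two patterns could be used together. The removeContacts() helper name and the sample sentence are hypothetical and only serve as illustration:

```php
<?php
// Hypothetical helper; the two patterns are the ones quoted above.
function removeContacts($text)
{
    $patterns = array(
        "/[^@\s]*@[^@\s\.]*\.[^@\s\.,!?]*/",       // email
        "#((?:https?|ftp)://\S+[[:alnum:]]/?)#si", // URL
    );

    // A single replacement string is applied to every pattern.
    return preg_replace($patterns, '[removed]', $text);
}

echo removeContacts('See https://example.com/path?x=1. Or mail test@mail.com!'), "\n";
// See [removed]. Or mail [removed]!
```

Because the URL pattern has to end on an alphanumeric character (optionally followed by a slash), the full stop after the link stays in the text.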

My answer is a slight improvement on Josiah's code. It combines the two code segments into one, since preg_replace() accepts the pattern either as a string or as an array.

$patterns = array();

$patterns[0] = "/[^@\s]*@[^@\s]*\.[^@\s]*/"; // removes emails

$patterns[1] = "/[a-zA-Z]*[:\/\/]*[A-Za-z0-9\-_]+\.+[A-Za-z0-9\.\/%&=\?\-_]+/i"; // removes any link

$replacement = "[removed]";

$string = "Follow the link below https://stackoverlow.com/testing/preg-match-replace-in-php or email me a sample code in my email test@mail.com";

// preg_replace returns the new string, so assign the result
$string = preg_replace($patterns, $replacement, $string);

If you want a different replacement text for each case, for instance [Email has been removed] for emails and [Link has been removed] for links, you can turn the replacement into an array as shown below:

$replacements = array();
// replacement message for emails
$replacements[0] = "[Email has been removed]";
// replacement message for links
$replacements[1] = "[Link has been removed]";

The rest of the code stays the same, except that $replacements is passed to preg_replace() in place of the single replacement string.
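
Put together, the variant with separate messages might look like the following sketch. It reuses the patterns and sample string from this answer; the $result variable name is an assumption for illustration:

```php
<?php
$patterns = array(
    "/[^@\s]*@[^@\s]*\.[^@\s]*/",                                      // emails
    "/[a-zA-Z]*[:\/\/]*[A-Za-z0-9\-_]+\.+[A-Za-z0-9\.\/%&=\?\-_]+/i", // links
);

// Same order as $patterns: first the email message, then the link message.
$replacements = array(
    "[Email has been removed]",
    "[Link has been removed]",
);

$string = "Follow the link below https://stackoverlow.com/testing/preg-match-replace-in-php or email me a sample code in my email test@mail.com";

$result = preg_replace($patterns, $replacements, $string);

echo $result, "\n";
// Follow the link below [Link has been removed] or email me a sample code in my email [Email has been removed]
```

The order matters here: the email pattern runs first, so the loose link pattern never sees "mail.com" and cannot turn part of the address into a link message.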
