Find and replace keywords with URLs only if they are not already wrapped in a link

So, I can easily do this:

$keywords = array(' jacket ',' sweater ',);
$urls = array(' <a href="#">jacket</a> ',' <a href="#">sweater</a> ',);
$content = str_ireplace($keywords,$urls,$content);

But the problem is that this also replaces a keyword such as jacket or sweater when it is already inside a link tag, which nests one link inside another.

I can't think of any simple solutions to this...


  • step 1) find all instances of each keyword that are within a link and replace them with a GUID
  • step 2) replace all other instances of each keyword with their replacement
  • step 3) replace the GUIDs with their original keywords.

How you choose your GUIDs and how you do your searching is up to you :)
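A minimal sketch of the three steps above, under some assumptions: the function name `linkKeywords`, the `@@LINKn@@` placeholder format, and the keyword => URL map are all made up for illustration. It swaps out each whole existing `<a>...</a>` element rather than each keyword inside it, which has the same effect and needs only one regex. It also assumes links are not nested and that no keyword collides with the placeholder text.

```php
<?php
// Hypothetical helper: link keywords that are not already inside an <a> element.
// $links maps keyword => URL, e.g. ['jacket' => '#'].
function linkKeywords(string $content, array $links): string
{
    $protected = []; // placeholder => original (or newly built) link markup

    // Step 1: swap every existing <a>...</a> element for a unique placeholder.
    $content = preg_replace_callback(
        '~<a\b[^>]*>.*?</a>~is',
        function (array $m) use (&$protected): string {
            $token = '@@LINK' . count($protected) . '@@';
            $protected[$token] = $m[0];
            return $token;
        },
        $content
    );

    // Step 2: replace the remaining whole-word occurrences of each keyword.
    // New links are stored as placeholders too, so a later keyword
    // cannot match inside a link created by an earlier one.
    foreach ($links as $keyword => $url) {
        $pattern = '~\b' . preg_quote($keyword, '~') . '\b~i';
        $content = preg_replace_callback(
            $pattern,
            function (array $m) use ($url, &$protected): string {
                $token = '@@LINK' . count($protected) . '@@';
                $protected[$token] = '<a href="' . htmlspecialchars($url) . '">' . $m[0] . '</a>';
                return $token;
            },
            $content
        );
    }

    // Step 3: put the original and the new links back in place of the placeholders.
    return strtr($content, $protected);
}
```

Called as `linkKeywords($content, ['jacket' => '#', 'sweater' => '#'])`, this matches on word boundaries instead of the surrounding spaces used in the question, so keywords next to punctuation are linked as well.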


You might be able to construct a fancy regex for this, but it would be extremely complicated and nearly impossible to maintain. IMO, you should take a token-based approach to solve this problem. I have a project called Lexentity that we use to replace apostrophes, quotes, etc. with their numerical entities. You could probably adapt that code for your purposes. There's also an HTMLTokenizer (and HTMLTokenset) in Habari that you might find useful.
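To illustrate what a token-based approach can look like, here is a rough sketch that does not use Lexentity or Habari's HTMLTokenizer. It uses a very simple hand-rolled tokenizer built on `preg_split`, and the function name `linkKeywordsByToken` is hypothetical. It assumes reasonably well-formed markup, no `>` inside attribute values or comments, and ASCII keywords.

```php
<?php
// Hypothetical helper: split the HTML into tag tokens and text tokens,
// then link keywords only in text tokens that are outside any <a> element.
function linkKeywordsByToken(string $html, array $links): string
{
    if ($links === []) {
        return $html;
    }

    // One combined pattern, so the text is scanned once and
    // freshly inserted links are never scanned again.
    $quoted  = array_map(fn($k) => preg_quote($k, '~'), array_keys($links));
    $pattern = '~\b(' . implode('|', $quoted) . ')\b~i';
    $urlFor  = array_change_key_case($links, CASE_LOWER);

    // With PREG_SPLIT_DELIM_CAPTURE, even indexes hold text, odd indexes hold tags.
    $tokens = preg_split('~(<[^>]*>)~', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $linkDepth = 0;
    foreach ($tokens as $i => $token) {
        if ($i % 2 === 1) {
            // Tag token: track whether we are inside an <a> element.
            if (preg_match('~^<a\b~i', $token)) {
                $linkDepth++;
            } elseif (preg_match('~^</a\b~i', $token)) {
                $linkDepth = max(0, $linkDepth - 1);
            }
            continue;
        }
        if ($linkDepth === 0 && $token !== '') {
            // Text token outside any link: wrap each keyword in a new link.
            $tokens[$i] = preg_replace_callback(
                $pattern,
                function (array $m) use ($urlFor): string {
                    $url = $urlFor[strtolower($m[0])];
                    return '<a href="' . htmlspecialchars($url) . '">' . $m[0] . '</a>';
                },
                $token
            );
        }
    }

    return implode('', $tokens);
}
```

Because tags are never treated as text, keywords that appear in attribute values (for example inside an `href` or `alt`) are left alone too. A real tokenizer or DOM parser would handle comments, scripts, and malformed markup more robustly than this split.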
