
strpos remove text/html

I'm parsing an XML file, and its creators stuck in a bunch of social media info that is completely useless to me. I'd like to remove it before inserting the data into the db.

The problem is that it's not all the same. Some occurrences are:

Be a Social Butterfly! Connect & Learn More Below: Website • Facebook • Yelp

Some have more social sites listed and some have fewer. I'd really like to remove that entire part. Also, this is the var_dump output after running strip_tags. The original looks like this:

<strong>Be a Social Butterfly! Connect & Learn More Below:</br></strong>
<a target="_blank" href="http://www.kiran-indian.com">Website</a> •<a target="_blank" href="http://www.facebook.com/pages/Kiran-Indian-Cuisine/55785994435"> Facebook</a> • <a target="_blank" href="http://www.yelp.com/biz/kiran-indian-cuisine-new-york">Yelp</a>

I used preg_replace to get rid of the entire sentence "Be a Social Butterfly..." with

$description = strip_tags(preg_replace('/\bBe a Social Butterfly! Connect & Learn More Below\b/', '', $value['redemptionLocations']['description']));

A buddy of mine suggested the use of strpos to find first/last parts and substr to remove everything in between, but sadly I am not advanced enough to figure out how to do that.

Thanks in advance!

description field:

       
Food always does one thing. It helps keep you alive. But it can do more. It can be an experience that educates, transports, and invigorates you. Lunch or dinner at <a target="_blank" href="http://www.kiran-indian.com/home.htmls">Kiran Indian Cuisine</a> a lot more than a chance to keep from starving for another day --- it’s a chance to depart from the norm with delicious homemade dishes using the freshest of ingredients and the most aromatic seasoning available. They are open 7 days a week from 11 a.m. to 11 p.m. and accept all the major credit cards, plus when you order online from the surrounding area, delivery is 100% free of charge.</br></br>

<strong>Be a Social Butterfly! Connect & Learn More Below:</br></strong>
<a target="_blank" href="http://www.kiran-indian.com">Website</a> •<a target="_blank" href="http://www.facebook.com/pages/Kiran-Indian-Cuisine/55785994435"> Facebook</a> •  <a target="_blank" href="http://www.yelp.com/biz/kiran-indian-cuisine-new-york">Yelp</a>

It seems that pasting that code here automatically adjusts the ASCII characters, etc.


First find the position of the opening string in the whole text with strpos, then find the position of the end of the chunk you want to remove, again with strpos. With the start and end points in hand, use substr_replace to replace the chunk with an empty string ''. Note that substr_replace takes the start position as its 3rd parameter but the length of the chunk as its 4th, not an end position, so subtract the 1st position from the 2nd to get the length. Keep in mind that strpos is case-sensitive and returns false when the string is not found.

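$feedtext = '<description> this part is important...  be a social butterfly .. blah blah etc etc whatever whatever </description>';

// strpos() is case-sensitive and returns false when nothing is found
$pos1 = strpos($feedtext, 'be a social butterfly'); // start of the chunk to remove
$pos2 = strpos($feedtext, '</description>');        // end of the chunk to remove

$newtext = $feedtext;
if ($pos1 !== false && $pos2 !== false) {
    $len = $pos2 - $pos1; // substr_replace() wants a length, not an end position
    $newtext = substr_replace($feedtext, '', $pos1, $len);
}

echo $newtext;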

tested: http://www.ideone.com/1X5gI
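
For the description field in the question, where the social block runs to the end of the string, there is no closing marker to search for, so a variation is to cut everything from the marker onward. The following is only a sketch under a few assumptions: the helper name removeSocialBlock is made up for illustration, the marker text "Be a Social Butterfly" is assumed to appear in every feed entry that has the block, the sample string is shortened from the question's data, and the type declarations assume PHP 7 or newer. It uses stripos (the case-insensitive variant of strpos) so that capitalization differences do not matter.

```php
<?php
// Hypothetical helper (not from the original answer): cut the description
// at the start of the social-media block, then strip the remaining tags.
function removeSocialBlock(string $html, string $marker = 'Be a Social Butterfly'): string
{
    // stripos() is the case-insensitive variant of strpos().
    $pos = stripos($html, $marker);

    // Compare with !== false, because 0 is a valid position.
    if ($pos !== false) {
        // Keep only what comes before the marker.
        $html = substr($html, 0, $pos);
    }

    // Remove leftover tags (such as the dangling <strong>) and trim whitespace.
    return trim(strip_tags($html));
}

// Shortened sample based on the description field in the question.
$description = 'Delivery is 100% free of charge.</br></br>' . "\n\n"
    . '<strong>Be a Social Butterfly! Connect & Learn More Below:</br></strong>' . "\n"
    . '<a target="_blank" href="http://www.kiran-indian.com">Website</a> • '
    . '<a target="_blank" href="http://www.yelp.com/biz/kiran-indian-cuisine-new-york">Yelp</a>';

echo removeSocialBlock($description), "\n";
```

Running strip_tags after the cut, rather than before, means the opening <strong> that precedes the marker is cleaned up as well. Entries without the marker pass through with only their tags stripped.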
