
What is the best optimized preg pattern for this data? Regular expression help needed

I need help with a PHP preg pattern to extract content from the following data:

<div class="box">
<div>
<a href="/;s=-w3NKGFjkswdkjbg0B;detail=person;id=937382/me">
<b>Smith, Johnny</b>
</a>
</div>
<div>
<a href="/;s=-w3NKGFjkswdkjbg0B/http%3aservice.myxyz.net/ch/cgi/g.fcgi/me/new?CUSTOMERNO=836327973&amp;t=i373u.1310541179.a1ecb28b&amp;TO=smithjohnny@gmail.com">smithjohnny@gmail.com</a>
</div>
<div>
<a href="/;s=-w3NKGFjkswdkjbg0B;edit=person;id=937382/me"><img src="/;m=is;f=gif89a;h=18;k=sdakjdk12eksack;w=18/it%3amfitmcsfe19/DiEDzr48XbZcjfyGLMKnzw.gif" alt="" width="18" height="18">
</a>
<a href="/;s=-w3NKGFjkswdkjbg0B;delete=person;id=937382/me">
<img src="/;m=is;f=gif89a;h=18;k=Dk3k-kVox-ads9Lopt-yBQ;w=18/it%3amfitmcsfe19/tHJTBPhousrElDf1x5aPvA.gif" alt="" width="18" height="18">
</a>
</div>

<div class="fitMlModuleLinec8fe6cf8">&nbsp;</div>

<div>
<a href="/;s=-w3NKGFjk4jkedkds8g0B;detail=person;id=327843287/me"></a>
</div>
<div>
<a href="/;s=-w3NKGFjk4jkedkds8g0B/http%3aservice.myxyz.net/ch/cgi/g.fcgi/me/new?CUSTOMERNO=98324826438&amp;t=de13929382.1310541179.a1ecb28b&amp;TO=iamtesting@gmail.com">iamtesting@gmail.com</a>
</div>
<div>
<a href="/;s=-w3NKGFjk4jkedkds8g0B;edit=person;id=327843287/me">
<img src="/;m=is;f=gif89a;h=18;k=cBoj9wS5Yp5345435EREg;w=18/it%3amfitmcsfe19/DiEDzr48XbZcjfyGLMKnzw.gif" alt="" width="18" height="18"></a> | 
<a href="/;s=-w3NKGFjk4jkedkds8g0B;delete=person;id=327843287/me">
<img src="/;m=is;f=gif89a;h=18;k=Dk3k-kVox-ads9Lopt-yBQ;w=18/it%3amfitmcsfe19/tHJTBPhousrElDf1x5aPvA.gif" alt="" width="18" height="18"></a>
</div>

<div class="fitMlModuleLinec8fe6cf8">&nbsp;</div>

<div>
<a href="/;s=-w3NKGsndqw21g0B;detail=person;id=83467836/me">
<b>Parker</b>
</a>
</div>
<div>
<a href="/;s=-w3NKGsndqw21g0B;edit=person;id=83467836/me">
<img src="/;m=is;f=gif89a;h=18;k=cBodejksa23KNKvUEREg;w=18/it%3amfitmcsfe19/DiEDzr48XbZcjfyGLMKnzw.gif" alt="" width="18" height="18"></a> | 
<a href="/;s=-w3NKGF6hSNhymOcg6uWbg0B;delete=person;id=83467836/me">
<img src="/;m=is;f=gif89a;h=18;k=Dk3k-kVox-ads9Lopt-yBQ;w=18/it%3amfitmcsfe19/tHJTBPhousrElDf1x5aPvA.gif" alt="" width="18" height="18"></a>
</div>

<div class="fitMlModuleLinec8fe6cf8">&nbsp;</div>
</div>
</div>

The above data looks like this: [image]

The conditions are as follows:

  • I want to extract the email addresses.
  • If an email address is found, check for a name; if a name is found, fetch the name of the person for that email address.
  • If a name is found but no email address is specified for that person, discard the data.

The output array should look like this:

Array(
    'email#1' => array('name' => 'name'),
    'email#2' => array('name' => 'name'),
    ...
    'email#n' => array('name' => 'name')
)

The result from the above data should look like this:

Array(
    'smithjohnny@gmail.com' => array('name' => 'Smith, Johnny'),
    'iamtesting@gmail.com' => array('name' => '')
)

Please suggest the most optimized preg_match pattern for the above problem.


I'm making some assumptions about the quality of your data, but you could try:

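// '#' is used as the delimiter because the pattern itself contains '/' (in </a>).
$html = '<a href="/me/new?TO=smithjohnny@gmail.com">smithjohnny@gmail.com</a>';
preg_match('#<a href=".+?">([^<]+)@([^<]+)</a>#', $html, $matches);
// $matches[1] = 'smithjohnny'
// $matches[2] = 'gmail.com'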

This is a bit crude, but if you can guarantee the href doesn't contain a " (which should be escaped as &quot;), it will extract the email address.
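To get from a single match to the array the question asks for, one possible approach is to split the markup into records first and then look for an email and a name inside each record. The following is only a sketch under assumptions: it assumes every record is terminated by a `<div class="fitMlModuleLine...">` separator, that the email is the visible text of an anchor, and that the name (when present) is wrapped in `<b>`. The function name `extractContacts` and the shortened sample hrefs are made up for illustration.

```php
<?php
// Hypothetical helper: maps each email address to the name found in the same record.
function extractContacts(string $html): array
{
    $contacts = [];

    // Assumption: each record ends with a <div class="fitMlModuleLine..."> separator.
    $records = preg_split('#<div class="fitMlModuleLine[^"]*">.*?</div>#s', $html);

    foreach ($records as $record) {
        // The email is taken from the anchor text; records without one are discarded.
        if (!preg_match('#<a href="[^"]*">\s*([^<>\s@]+@[^<>\s@]+)\s*</a>#', $record, $m)) {
            continue;
        }

        // The name is optional; fall back to an empty string when no <b> is present.
        $name = preg_match('#<b>([^<]*)</b>#', $record, $n) ? trim($n[1]) : '';

        $contacts[$m[1]] = ['name' => $name];
    }

    return $contacts;
}

// Trimmed-down sample in the same shape as the question's data (hrefs shortened).
$sample = <<<'HTML'
<div><a href="/detail/1"><b>Smith, Johnny</b></a></div>
<div><a href="/new?TO=smithjohnny@gmail.com">smithjohnny@gmail.com</a></div>
<div class="fitMlModuleLinec8fe6cf8">&nbsp;</div>
<div><a href="/detail/2"></a></div>
<div><a href="/new?TO=iamtesting@gmail.com">iamtesting@gmail.com</a></div>
<div class="fitMlModuleLinec8fe6cf8">&nbsp;</div>
<div><a href="/detail/3"><b>Parker</b></a></div>
<div class="fitMlModuleLinec8fe6cf8">&nbsp;</div>
HTML;

print_r(extractContacts($sample));
```

Splitting first keeps a name from one record from being paired with an email from the next. If the markup is less regular than assumed here, an HTML parser such as PHP's DOMDocument is likely to be more robust than regular expressions.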
