Use xPath or Regex?

The two methods below each serve the same purpose: scan the content of the post and determine if at least one img tag has an alt attribute which contains the "keyword" which is being tested for.

I'm new to xPath and would prefer to use it depending on how expensive that approach is compared to the regex version...

Method #1 uses preg_match

function image_alt_text_has_keyword($post)
{
    $theKeyword = trim(wpe_getKeyword($post));
    $theContent = $post->post_content;
    $myArrayVar = array();
    preg_match_all('/<img\s[^>]*alt=\"([^\"]*)\"[^>]*>/siU',$theContent,$myArrayVar);
    foreach ($myArrayVar[1] as $theValue)
    {
        if (keyword_in_content($theKeyword,$theValue)) return true;
    }
    return false;
}

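function keyword_in_content($theKeyword, $theContent)
{
    // preg_quote() stops regex metacharacters in the keyword from being read as pattern syntax.
    return preg_match('/\b' . preg_quote($theKeyword, '/') . '\b/i', $theContent);
}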

Method #2 uses xPath

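function keyword_in_img_alt()
{
    global $post;
    $keyword = trim(strtolower(wpe_getKeyword($post)));
    $dom = new DOMDocument;
    // Post content is an HTML fragment; the @ silences libxml's parse warnings.
    @$dom->loadHTML(strtolower($post->post_content));
    $xPath = new DOMXPath($dom);
    // Test every img, not only those inside a link, and return a boolean rather than a count.
    // Unlike Method #1, contains() matches substrings, not whole words.
    return $xPath->evaluate('boolean(//img[contains(@alt, "'.$keyword.'")])');
}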


If you are parsing XML you should use XPath as it was designed exactly for this purpose. XML / XHTML is not a regular language and cannot be parsed correctly by regular expressions. You may be able to write a regular expression which works some of the time but there will be special cases where it will fail.
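One way to act on this is to let XPath pick out the nodes and keep the regex only for testing the text that comes back. The sketch below is not taken from the question: `img_alt_has_keyword` is a hypothetical helper that takes a plain HTML string and a keyword rather than a WordPress `$post`, and it assumes the content is short enough that building a DOM per call is acceptable.

```php
<?php
// Hypothetical helper: takes raw HTML and a keyword instead of a WordPress $post.
function img_alt_has_keyword(string $html, string $keyword): bool
{
    $keyword = trim($keyword);
    if ($keyword === '' || trim($html) === '') {
        return false; // nothing to search for, or nothing to parse
    }

    // Post content is usually an HTML fragment, so collect libxml's
    // parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // XPath selects the nodes: every <img> that carries an alt attribute.
    $xpath = new DOMXPath($dom);

    // The regex only tests the extracted attribute text; preg_quote()
    // keeps regex metacharacters in the keyword literal.
    $pattern = '/\b' . preg_quote($keyword, '/') . '\b/i';
    foreach ($xpath->query('//img[@alt]') as $img) {
        if (preg_match($pattern, $img->getAttribute('alt')) === 1) {
            return true;
        }
    }
    return false;
}

var_dump(img_alt_has_keyword('<p><img src="a.png" alt="A red Apple"></p>', 'apple')); // bool(true)
var_dump(img_alt_has_keyword('<p><img src="a.png" alt="pineapple"></p>', 'apple'));   // bool(false)
```

Because the parser handles attribute order, single versus double quotes, and stray whitespace, the regex no longer has to guess at the markup; it only does the whole-word test.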


Using RegEx for selecting nodes in an XML document is as appropriate as using it for finding if a given number is a prime.

The fact that this is possible doesn't make it even a bit appropriate.

What is more, XPath 2.0 has regex support, while regular expressions have no XPath support. Therefore, if both are needed, it is probably best to use XPath 2.0.
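Note that PHP's DOMXPath is built on libxml2, which implements XPath 1.0 only, so the XPath 2.0 `matches()` function is not available there. A rough equivalent is to expose `preg_match` to the XPath expression through `registerPhpFunctions`. The following is a minimal sketch under that assumption; the helper name `img_alt_matches_in_xpath` is made up for illustration.

```php
<?php
// Hypothetical helper: runs the regex test inside the XPath expression itself.
function img_alt_matches_in_xpath(string $html, string $keyword): bool
{
    $keyword = trim($keyword);
    if ($keyword === '' || trim($html) === '') {
        return false;
    }

    // Collect libxml's parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Expose PHP's preg_match() to XPath under the "php" prefix.
    $xpath->registerNamespace('php', 'http://php.net/xpath');
    $xpath->registerPhpFunctions('preg_match');

    // XPath 1.0 string literals have no escape syntax, so a double quote
    // in the pattern is written as the PCRE escape \x22 instead.
    $pattern = '/\b' . preg_quote($keyword, '/') . '\b/i';
    $pattern = str_replace('"', '\x22', $pattern);

    // preg_match() returns 1 on a match; compare explicitly, because the
    // callback result reaches XPath as a string and "0" would be truthy.
    $expr = 'boolean(//img[php:functionString("preg_match", "' . $pattern . '", @alt) > 0])';
    return $xpath->evaluate($expr);
}
```

This keeps the node selection and the word-boundary test in one expression, at the cost of a PHP callback per `img` node.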
