XSLT replacing url in text with regex

I've got an XML feed coming from Twitter which I want to transform using XSLT. What I want the XSLT to do is replace every URL occurring in a Twitter message with a link. I've already created the following XSLT template using this and this topic here on Stack Overflow. How can I achieve this? If I use the template as below I get an infinite loop, but I don't see where. As soon as I comment out the call to the 'replaceAll' template everything seems to work, but then of course no content of the Twitter message gets replaced. I'm new to XSLT, so every bit of help is welcome.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
    <xsl:output method="text" omit-xml-declaration="yes" indent="yes"  encoding="utf-8" />
    <xsl:param name="html-content-type" />
    <xsl:variable name="urlRegex" select="8"/>
    <xsl:template match="statuses">
        <xsl:for-each select="//status[position() &lt; 2]">
            <xsl:variable name="TwitterMessage" select="text" />
            <xsl:call-template name="replaceAll">
                <xsl:with-param name="text" select="$TwitterMessage"/>
                <xsl:with-param name="replace" select="De"/> <!--This should become an regex to replace urls, maybe something like the rule below?-->
                <xsl:with-param name="by" select="FOOOO"/> <!--Here I want the matching regex value to be replaced with valid html to create an href-->
                <!--<xsl:value-of select="replace(text,'^http://(.*)\.com','#')"/>
                <xsl:value-of select="text"/>-->
            </xsl:call-template>
            <!--<xsl:value-of select="text"/>-->
            <!--<xsl:apply-templates />-->
        </xsl:for-each>
    </xsl:template>

    <xsl:template name="replaceAll">
        <xsl:param name="text"/>
        <xsl:param name="replace"/>
        <xsl:param name="by"/>
        <xsl:choose>
            <xsl:when test="contains($text,$replace)">
                <xsl:value-of select="substring-before($text,$replace)"/>
                <xsl:value-of select="$by"/>
                <xsl:call-template name="replaceAll">
                    <xsl:with-param name="text" select="substring-after($text,$replace)"/>
                    <xsl:with-param name="replace" select="$replace"/>
                    <xsl:with-param name="by" select="$by"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$text"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

EDIT: This is an example of the XML feed.

<?xml version="1.0" encoding="UTF-8"?>
<statuses type="array">
<status>
  <created_at>Mon May 16 14:17:12 +0000 2011</created_at>
  <id>10000000000000000</id>
  <text>This is an message from Twitter http://bit.ly/xxxxx http://yfrog.com/xxxxx</text>
</status>
</statuses>

This is just the basic XML that Twitter outputs for a URL like the one below:

http://twitter.com/statuses/user_timeline.xml?screen_name=yourtwitterusername

This text:

This is an message from Twitter http://bit.ly/xxxxx http://yfrog.com/xxxxx

should be converted to:

This is an message from Twitter <a href="http://bit.ly/xxxxx">http://bit.ly/xxxxx</a> <a href="http://yfrog.com/xxxxx">http://yfrog.com/xxxxx</a>


So, your question isn't really about XSLT. What you want is to find out the best option for manipulating a text string in XPath. If you are using a standalone XSLT engine, you can probably use XPath 2, which just about has the power you need, though with regexes it will get a bit fiddly. If you are running this from an engine with EXSLT support, you will need to look up which functions are available there. If you are running this from PHP, text manipulation is generally best handed over to the PHP code; you do that by writing a PHP function that does what you want and calling it from the XSLT using php:function('f-name', inputs ...) as the XPath expression.
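As a minimal sketch of the XPath 2 route, assuming an XSLT 2.0 processor (Saxon, for example) and the text output method from the question, replace() can wrap each URL in an anchor. The pattern https?://\S+ is a deliberately simple assumption of mine and will swallow any punctuation that directly follows a URL:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Text output, as in the question: the anchor markup is written out literally -->
    <xsl:output method="text" encoding="utf-8"/>

    <xsl:template match="/statuses">
        <!-- Only the first status is processed, as in the question's stylesheet -->
        <xsl:for-each select="status[1]">
            <!-- $1 is the captured URL; the markup characters are escaped because they sit inside an attribute -->
            <xsl:value-of select="replace(text, '(https?://\S+)', '&lt;a href=&quot;$1&quot;&gt;$1&lt;/a&gt;')"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

With method="html" or method="xml" the replacement string would be escaped on output; in that case xsl:analyze-string, which can build real a elements for each match, is probably the better fit.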

As far as regexes go, I guess you are looking for something pretty much along these lines:

replace (https?://.*?)(?=[.,:;)]*($|\s)) with <a href="$1">$1</a>.

If it doesn't match every possible URL, that's fine: you only need to handle incoming data as well as Twitter's own munging does. Checking for punctuation at the end (the [] in the regex) is really the only tricky thing that your users will expect you to do.
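Note that XPath 2 regular expressions have no lookahead, so the pattern above can't be passed to replace() unchanged. One possible adaptation, only a sketch and assuming an XSLT 2.0 processor with text output, captures the trailing punctuation and whitespace in a second group and writes it back after the link:

```xml
<!-- Group 1: the URL (reluctant match).
     Group 2: optional trailing punctuation, then whitespace or end of string. -->
<xsl:value-of select="replace(text,
    '(https?://\S+?)([.,:;)]*(\s|$))',
    '&lt;a href=&quot;$1&quot;&gt;$1&lt;/a&gt;$2')"/>
```

The reluctant \S+? keeps group 1 as short as possible, so a full stop or closing bracket right after the URL ends up in group 2 rather than inside the href.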


Generally, I wouldn't implement a new replace function. I'd use the one provided by EXSLT. If your XSLT processor supports EXSLT, you just need to set up the stylesheet as follows:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:regexp="http://exslt.org/regular-expressions"
                extension-element-prefixes="regexp"
                version="1.0">

Otherwise, download and import the stylesheet from EXSLT.

For a global replace you can use the function as follows:

<xsl:value-of select="regexp:replace(string($TwitterMessage), 'yourpattern', 'g', 'yourreplacement')" />
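Put together, a complete stylesheet might look like the sketch below. It assumes a processor that actually implements the EXSLT regular-expressions module (not all do), and both the pattern and the '#' replacement are placeholders borrowed from the commented-out line in the question:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:regexp="http://exslt.org/regular-expressions"
                extension-element-prefixes="regexp">
    <xsl:output method="text" encoding="utf-8"/>

    <xsl:template match="/statuses">
        <xsl:for-each select="status[1]">
            <!-- Arguments: input string, pattern, flags ('g' = replace every match), replacement -->
            <xsl:value-of select="regexp:replace(string(text), 'https?://\S+', 'g', '#')"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

The namespace prefix used in the function call has to be the same one that is declared on xsl:stylesheet.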

Sorry for the general answer, but I'm not able to test XSLT at the moment.
