XSLT, grab just a portion of a string within a tag
Alright, I have an XSLT stylesheet that does most of what I need now. It looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="//Product/Description">
<title>
<xsl:apply-templates/>
</title>
</xsl:template>
<xsl:template match="//Product/Picture">
<link>
<xsl:apply-templates/>
</link>
</xsl:template>
<xsl:template match="//Product/Caption">
<description>
<xsl:apply-templates/>
</description>
</xsl:template>
<xsl:template match="Picture">
<xsl:param name="text"/>
<xsl:choose>
<xsl:when test="contains($text, '&lt;')">
<xsl:value-of select="substring-before($text, '&lt;')"/>
<xsl:call-template name="strip-tags">
<xsl:with-param name="text" select="substring-after($text, 'src=')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="Caption">
<xsl:param name="text"/>
<xsl:choose>
<xsl:when test="contains($text, '&lt;')">
<xsl:value-of select="substring-before($text, '&lt;')"/>
<xsl:call-template name="strip-tags">
<xsl:with-param name="text" select="substring-after($text,'>')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$text"/>
</xsl:otherwise>
</xsl:choose>
<xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
This is probably a huge kludge, because I am just grabbing the text from the 'raw' output of my XML editor, since it does what I need: it puts the correct tags in the right places. However, the 'strip-tags' call doesn't seem to work. I tried to make another version of it that would strip everything following 'src=' and preceding '>', but obviously that is the opposite of what I am trying to do, which is to keep only that part. Is there something that does the opposite of 'strip-tags'? Then I could just replace 'strip-tags' with 'strip-all-except', or whatever it would be called.
EDIT:
Here is the input XML file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE StoreExport SYSTEM "http://store.yahoo.com/doc/dtd/StoreExport.dtd">
<StoreExport>
<Settings>
<Published timestamp="1297187196"/>
<Locale code="C" name="English" encoding="iso-8859-1"/>
<StoreName>Cl33333</StoreName>
<Currency>USD</Currency>
<ShipMethods>
<ShipMethod></ShipMethod>
</ShipMethods>
<PayMethods>
</PayMethods>
</Settings>
<Products>
<Product Id="agfasu">
<Code>3616a</Code>
<Description>Ageless Fashion Suit</Description>
<Url>http://www.cl333333333d.com/agfasu.html</Url>
<Thumb>&lt;img border=0 width=50 height=70 src=http://ep.y3333333333327706119506618_2144_317652924&gt;</Thumb>
<Picture>&lt;img border=0 width=600 height=845 src=http://ep.yim3333333st-27706119506618_2144_317019111&gt;</Picture>
<Orderable>YES</Orderable>
<Taxable>YES</Taxable>
<Pricing>
<BasePrice>178.00</BasePrice>
</Pricing>
<Path>333333333333333om/wochsu.html">Womens Church Suits</ProductRef>
<ProductRef Id="2454" Url="http://www.cl33333333454.html">Aussie Austine Spring/Summer 2011</ProductRef>
</Path>
<Availability>Usually ships the next business day.</Availability>
<Caption>&lt;head&gt; &lt;meta content="en-us" http-equiv="Content-Language"&gt; &lt;style type="text/css"&gt; .style3 { font-family: arial, helvetica; font-size: medium; font-weight: bold; } .style4 { font-size: small; } &lt;/style&gt; &lt;/head&gt; &lt;p&gt;&lt;strong&gt;Wholesale Women&#39;s</Caption>
<OptionLists>
<OptionList name="Size">
<OptionValue>8</OptionValue>
</OptionList>
<OptionList name="Colors">
<OptionValue>Red</OptionValue>
</OptionList>
<OptionList name="Accessories">
<OptionValue>Suit</OptionValue>
</OptionList>
</OptionLists>
</Product>
The output I would like:
<item>
<title>
<![CDATA['DescriptionTag]]>
</title>
<description>
<![CDATA[CaptionTagStrippedofEscapedCharacters]]>
</description>
<link>'UrlTag'</link>
<g:condition>new</g:condition>
<g:price>'BasePriceTag'</g:price>
<g:product_type>Clothing, Accessories</g:product_type>
<g:image_link>'PictureTagFrom 'src=' to '>' </g:image_link>
<g:payment_accepted>Visa</g:payment_accepted>
<g:payment_accepted>Mastercard</g:payment_accepted>
<g:payment_accepted>Discover</g:payment_accepted>
</item>
Some of the tags don't need to be populated from the source but are always the same, such as 'payment accepted', 'condition', and 'product type'.
One shouldn't use an XML vocabulary, or an XML consumer, that expects parseable markup to be delivered as an unparsed text node.
If you do, you must face the consequences and do proper parsing instead of relying on error-prone regular expressions or string handling.
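To illustrate why plain string handling is fragile, here is a minimal sketch of the kind of recursive `strip-tags` named template that the question calls but never defines. It is an assumption about what such a template might look like, not the asker's actual code, and it presumes the `Caption` content arrives as escaped text, so that its string value contains literal `<` and `>` characters.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: copies $text, dropping everything from each
       '<' up to and including the next '>' that follows it. -->
  <xsl:template name="strip-tags">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '&lt;') and
                      contains(substring-after($text, '&lt;'), '&gt;')">
        <!-- Keep the text in front of the tag. -->
        <xsl:value-of select="substring-before($text, '&lt;')"/>
        <!-- Recurse on whatever follows the tag's closing '>'. -->
        <xsl:call-template name="strip-tags">
          <xsl:with-param name="text"
              select="substring-after(substring-after($text, '&lt;'), '&gt;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No complete tag left: emit the remainder unchanged. -->
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Print the stripped text of every Caption, one per line. -->
  <xsl:template match="/">
    <xsl:for-each select="//Product/Caption">
      <xsl:call-template name="strip-tags">
        <xsl:with-param name="text" select="."/>
      </xsl:call-template>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Note what this does not handle: with the sample `Caption`, the CSS rules inside the `style` element survive, because only the markup itself is removed, and a `>` inside an attribute value would end a tag too early. Those are the kinds of errors a real parser avoids.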
A very basic XSLT parser for properly encoded, well-formed XHTML can be found at https://bug98168.bugzilla.mozilla.org/attachment.cgi?id=434081
So you could parse your unparsed data and then apply a second-phase transformation using the node-set() extension function.
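The two-phase idea can be sketched roughly as follows. This is only an illustration under several assumptions: the processor supports the EXSLT `exsl:node-set()` function (libxslt, Xalan and Saxon 6 do; MSXML uses `msxsl:node-set()` in its own namespace instead), the `Picture` content is escaped text, and its `src=` value is unquoted and is the last attribute before `>`. The first phase here is a deliberately naive stand-in for a real parser, and the `items` wrapper and unprefixed `image_link` element are placeholders, since the question does not show the namespace bound to `g:`.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <xsl:output method="xml" indent="yes"/>

  <!-- Emit one item per Product under a placeholder root element. -->
  <xsl:template match="/">
    <items>
      <xsl:apply-templates select="//Product"/>
    </items>
  </xsl:template>

  <xsl:template match="Product">
    <!-- Phase 1: rebuild the escaped <img ...> text as a real element.
         Naive stand-in for a parser: takes the text between 'src='
         and the next '>'. The result is a result tree fragment. -->
    <xsl:variable name="parsed">
      <img src="{substring-before(substring-after(Picture, 'src='), '&gt;')}"/>
    </xsl:variable>

    <item>
      <title><xsl:value-of select="Description"/></title>
      <link><xsl:value-of select="Url"/></link>
      <!-- Phase 2: convert the fragment to a node-set and query it
           with ordinary XPath, as if it had been XML all along. -->
      <image_link>
        <xsl:value-of select="exsl:node-set($parsed)/img/@src"/>
      </image_link>
    </item>
  </xsl:template>

</xsl:stylesheet>
```

With a real parser in phase 1, the variable would hold the whole reconstructed HTML tree, and phase 2 could use `xsl:apply-templates` on it instead of a single `xsl:value-of`.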