Xsl subtrings with pattern
I would like to retrieve a value of my xml file with a xsl template but I have no idea how to retrieve these value...
I have this XML file
<BlastOutput_iterations>
<Iteration>
<Iteration_iter-num>1</Iteration_iter-num>
<Iteration_query-ID>Query_1</Iteration_query-ID>
<Iteration_query-def>Teg18as_antisens</Iteration_query-def>
<Iteration_query-len>865</Iteration_query-len>
<Iteration_hits>
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gnl|BL_ORD_ID|30</Hit_id>
<Hit_def>gi|150392480|ref|NC_009632.1| Staphylococcus aureus subsp. aureus JH1, complete genome</Hit_def>
<Hit_accession>2233</Hit_accession>
<Hit_len>2906507</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>1561.20031714635</Hsp_bit-score>
<Hsp_score>1730</Hsp_score>
<Hsp_evalue>0</Hsp_evalue>
<Hsp_query-from>1</Hsp_query-from>
<Hsp_query-to>865</Hsp_query-to>
<Hsp_hit-from>355668</Hsp_hit-from>
<Hsp_hit-to>356532</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>1</Hsp_hit-frame>
<Hsp_identity>865</Hsp_identity>
<Hsp_positive>865</Hsp_positive>
<Hsp_gaps>0</Hsp_gaps>
<Hsp_align-len>865</Hsp_align-len>
<Hsp_qseq>CAACTCGTTAGGACAATCACGATGATTGTCTACAGTTGCAGGTGGATTTGAATATACTACTAGTTATTTGTTGTCTAGGATAATAGATTTAGTATGTTGATAAGTTTGACTCAGATTTGTATTTTCTAATAAATGATAACTCACGATATCGATTAAAAAGAGTGTCGCAATTTGTGTGTTGATAAATTGATGGTCGGTATTACGCGATTGATCCGTTGTTAAAAGTACTAAATCTGCACAATCTGTAAGTTTACTACCTTCGAAATTTGTGATGGCAACGACATATGCACCATGAGATTTGGCGACTTCCGCTGCTGAAATTAATTCCGAAGTATTACCACTATTTGACATAGCAATAAACATGTCCGAATGAGATAGTAGGGATGCCGATATTTTCATTAAATGTGAATCGGTAGTAACATTACCTTTTAGCCCCATACGAATCATACGATAATAAAATTCAGTCGCTGATAAACCAGAGCTACCTAGTCCAGCAAAGAGTATATGTCGACTTGATTGGAGTTTGTCGATAAAGGTTTGGATAATGTCGTTATCAATAAATTCACCAGTTTGTTGAATGATTTGTTGATGATATTTATGAATTCTTTGAATAATTGGGCTATTTTCAATAACTGTCTCTGTCATTTCTTGTTGAATATTAAATTTTAAATCTTGGAAATTCTCATAATCTAGCTTATGACTAAAGCGTGTCATCGTTGCTGGTGATGTACCAATCGCATGGGCTAAGGAGTTAATCGTTGAAAAGGCATCGCTATAACCATTTTGTCTTATATAATTGACGATGCGTTTATCAGTTTTTGTAAATAAATGTTGATAACGTTGAACACGATTCTCAAATTTCATT</Hsp_qseq>
<Hsp_hseq>CAACTCGTTAGGACAATCACGATGATTGTCTACAGTTGCAGGTGGATTTGAATATACTACTAGTTATTTGTTGTCTAGGATAATAGATTTAGTATGTTGATAAGTTTGACTCAGATTTGTATTTTCTAATAAATGATAACTCACGATATCGATTAAAAAGAGTGTCGCAATTTGTGTGTTGATAAATTGATGGTCGGTATTACGCGATTGATCCGTTGTTAAAAGTACTAAATCTGCACAATCTGTAAGTTTACTACCTTCGAAATTTGTGATGGCAACGACATATGCACCATGAGATTTGGCGACTTCCGCTGCTGAAATTAATTCCGAAGTATTACCACTATTTGACATAGCAATAAACATGTCCGAATGAGATAGTAGGGATGCCGATATTTTCATTAAATGTGAATCGGTAGTAACATTACCTTTTAGCCCCATACGAATCATACGATAATAAAATTCAGTCGCTGATAAACCAGAGCTACCTAGTCCAGCAAAGAGTATATGTCGACTTGATTGGAGTTTGTCGATAAAGGTTTGGATAATGTCGTTATCAATAAATTCACCAGTTTGTTGAATGATTTGTTGATGATATTTATGAATTCTTTGAATAATTGGGCTATTTTCAATAACTGTCTCTGTCATTTCTTGTTGAATATTAAATTTTAAATCTTGGAAATTCTCATAATCTAGCTTATGACTAAAGCGTGTCATCGTTGCTGGTGATGTACCAATCGCATGGGCTAAGGAGTTAATCGTTGAAAAGGCATCGCTATAACCATTTTGTCTTATATAATTGACGATGCGTTTATCAGTTTTTGTAAATAAATGTTGATAACGTTGAACACGATTCTCAAATTTCATT</Hsp_hseq>
<Hsp_midline>|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||开发者_C百科||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
</Iteration_hits>
</Iteration>
and a would like to retrieve NC_009632.1 in gi|150392480|ref|NC_009632.1| Staphylococcus aureus subsp. aureus JH1, complete genome, the structure of these line is always like "xx|number|xxx|value_to_retrieve| "
thank for help
<xsl:template match="Hit_def">
<xsl:value-of select="substring-before(
substring-after(
substring-after(
substring-after(., '|'),
'|'
),
'|'
),
'|'
)"/>
</xsl:template>
with XSLT 1.0 should do, with 2.0
<xsl:template match="Hit_def">
<xsl:value-of select="tokenize(., '|')[4]"/>
</xsl:template>
is easier.
Here is a simple tokenization. One can use the result to access any token by position:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="text()" name="tokenize">
<xsl:param name="pText" select="concat(.,'|')"/>
<xsl:if test="string-length($pText)">
<t>
<xsl:value-of select="substring-before($pText, '|')"/>
</t>
<xsl:call-template name="tokenize">
<xsl:with-param name="pText" select=
"substring-after($pText, '|')"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied on:
<t>a|b|c|d|e|f|g|h</t>
produces:
<t>a</t>
<t>b</t>
<t>c</t>
<t>d</t>
<t>e</t>
<t>f</t>
<t>g</t>
<t>h</t>
Finally, we are using this tokenisation technique to solve the original problem:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common" >
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vrtfTokens">
<xsl:call-template name="tokenize">
<xsl:with-param name="pText" select=
"/*/*/*/*/Hit_def"/>
</xsl:call-template>
</xsl:variable>
<xsl:value-of select="ext:node-set($vrtfTokens)/*[4]"/>
</xsl:template>
<xsl:template match="text()" name="tokenize">
<xsl:param name="pText" select="concat(.,'|')"/>
<xsl:if test="string-length($pText)">
<t>
<xsl:value-of select="substring-before($pText, '|')"/>
</t>
<xsl:call-template name="tokenize">
<xsl:with-param name="pText" select=
"substring-after($pText, '|')"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when this transformation is applied on the provided XML document:
<BlastOutput_iterations>
<Iteration>
<Iteration_iter-num>1</Iteration_iter-num>
<Iteration_query-ID>Query_1</Iteration_query-ID>
<Iteration_query-def>Teg18as_antisens</Iteration_query-def>
<Iteration_query-len>865</Iteration_query-len>
<Iteration_hits>
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gnl|BL_ORD_ID|30</Hit_id>
<Hit_def>gi|150392480|ref|NC_009632.1| Staphylococcus aureus subsp. aureus JH1, complete genome</Hit_def>
<Hit_accession>2233</Hit_accession>
<Hit_len>2906507</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>1561.20031714635</Hsp_bit-score>
<Hsp_score>1730</Hsp_score>
<Hsp_evalue>0</Hsp_evalue>
<Hsp_query-from>1</Hsp_query-from>
<Hsp_query-to>865</Hsp_query-to>
<Hsp_hit-from>355668</Hsp_hit-from>
<Hsp_hit-to>356532</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>1</Hsp_hit-frame>
<Hsp_identity>865</Hsp_identity>
<Hsp_positive>865</Hsp_positive>
<Hsp_gaps>0</Hsp_gaps>
<Hsp_align-len>865</Hsp_align-len>
<Hsp_qseq>CAACTCGTTAGGACAATCACGATGATTGTCTACAGTTGCAGGTGGATTTGAATATACTACTAGTTATTTGTTGTCTAGGATAATAGATTTAGTATGTTGATAAGTTTGACTCAGATTTGTATTTTCTAATAAATGATAACTCACGATATCGATTAAAAAGAGTGTCGCAATTTGTGTGTTGATAAATTGATGGTCGGTATTACGCGATTGATCCGTTGTTAAAAGTACTAAATCTGCACAATCTGTAAGTTTACTACCTTCGAAATTTGTGATGGCAACGACATATGCACCATGAGATTTGGCGACTTCCGCTGCTGAAATTAATTCCGAAGTATTACCACTATTTGACATAGCAATAAACATGTCCGAATGAGATAGTAGGGATGCCGATATTTTCATTAAATGTGAATCGGTAGTAACATTACCTTTTAGCCCCATACGAATCATACGATAATAAAATTCAGTCGCTGATAAACCAGAGCTACCTAGTCCAGCAAAGAGTATATGTCGACTTGATTGGAGTTTGTCGATAAAGGTTTGGATAATGTCGTTATCAATAAATTCACCAGTTTGTTGAATGATTTGTTGATGATATTTATGAATTCTTTGAATAATTGGGCTATTTTCAATAACTGTCTCTGTCATTTCTTGTTGAATATTAAATTTTAAATCTTGGAAATTCTCATAATCTAGCTTATGACTAAAGCGTGTCATCGTTGCTGGTGATGTACCAATCGCATGGGCTAAGGAGTTAATCGTTGAAAAGGCATCGCTATAACCATTTTGTCTTATATAATTGACGATGCGTTTATCAGTTTTTGTAAATAAATGTTGATAACGTTGAACACGATTCTCAAATTTCATT</Hsp_qseq>
<Hsp_hseq>CAACTCGTTAGGACAATCACGATGATTGTCTACAGTTGCAGGTGGATTTGAATATACTACTAGTTATTTGTTGTCTAGGATAATAGATTTAGTATGTTGATAAGTTTGACTCAGATTTGTATTTTCTAATAAATGATAACTCACGATATCGATTAAAAAGAGTGTCGCAATTTGTGTGTTGATAAATTGATGGTCGGTATTACGCGATTGATCCGTTGTTAAAAGTACTAAATCTGCACAATCTGTAAGTTTACTACCTTCGAAATTTGTGATGGCAACGACATATGCACCATGAGATTTGGCGACTTCCGCTGCTGAAATTAATTCCGAAGTATTACCACTATTTGACATAGCAATAAACATGTCCGAATGAGATAGTAGGGATGCCGATATTTTCATTAAATGTGAATCGGTAGTAACATTACCTTTTAGCCCCATACGAATCATACGATAATAAAATTCAGTCGCTGATAAACCAGAGCTACCTAGTCCAGCAAAGAGTATATGTCGACTTGATTGGAGTTTGTCGATAAAGGTTTGGATAATGTCGTTATCAATAAATTCACCAGTTTGTTGAATGATTTGTTGATGATATTTATGAATTCTTTGAATAATTGGGCTATTTTCAATAACTGTCTCTGTCATTTCTTGTTGAATATTAAATTTTAAATCTTGGAAATTCTCATAATCTAGCTTATGACTAAAGCGTGTCATCGTTGCTGGTGATGTACCAATCGCATGGGCTAAGGAGTTAATCGTTGAAAAGGCATCGCTATAACCATTTTGTCTTATATAATTGACGATGCGTTTATCAGTTTTTGTAAATAAATGTTGATAACGTTGAACACGATTCTCAAATTTCATT</Hsp_hseq>
<Hsp_midline>|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
</Iteration_hits>
</Iteration>
</BlastOutput_iterations>
the wanted, correct result is produced:
NC_009632.1
精彩评论