Transform one XML structure into another XML structure
I'm working with PHP 5, and I need to transform XML of the following form:
<item>
<string isNewLine="1" lineNumber="32">some text in new line</string>
<string>, more text</string>
<item>
<string isNewLine="1" lineNumber="33">some text in new line</string>
<string isNewLine="1" lineNumber="34">some text</string>
<string> in the same line</string>
<string isNewLine="1" lineNumber="35">some text in new line</string>
</item>
</item>
into something like this:
<item>
<line lineNumber="32">some text in new line, more text</line>
<item>
<line lineNumber="33">some text in new line</line>
<line lineNumber="34">some text in the same line</line>
<line lineNumber="35">some text in new line</line>
</item>
</item>
As you can see, the text contained in multiple 'string' nodes has been joined into a single 'line' node. Also note that the 'string' nodes can be nested within other nodes at any level.
What are possible solutions for transforming source xml to the target xml?
Thanks,
Here is an efficient and correct solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="knextStrings"
match="string[not(@isNewLine)]"
use="generate-id(preceding-sibling::string
[@isNewLine][1]
)"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="string[@isNewLine]">
<line>
<xsl:copy-of select="@*[not(name()='isNewLine')]"/>
<xsl:copy-of select="text()
|
key('knextStrings',
generate-id()
)
/text()"/>
</line>
</xsl:template>
<xsl:template match="string[not(@isNewLine)]"/>
</xsl:stylesheet>
When this transformation is applied to the originally provided XML document:
<item>
<string isNewLine="1" lineNumber="32">some text in new line</string>
<string>, more text</string>
<item>
<string isNewLine="1" lineNumber="33">some text in new line</string>
<string isNewLine="1" lineNumber="34">some text</string>
<string> in the same line</string>
<string isNewLine="1" lineNumber="35">some text in new line</string>
</item>
</item>
the wanted, correct result is produced:
<item>
<line lineNumber="32">some text in new line, more text</line>
<item>
<line lineNumber="33">some text in new line</line>
<line lineNumber="34">some text in the same line</line>
<line lineNumber="35">some text in new line</line>
</item>
</item>
This stylesheet produces the output you are looking for:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output indent="yes" />
<!--Identity template simply copies content forward by default -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="string[@isNewLine and @lineNumber]">
<line>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates select="text()" />
<!-- Include the text() from the string elements that come after this element,
do not have both @isNewLine and @lineNumber,
and whose nearest preceding new-line string is this particular element -->
<xsl:apply-templates select="following-sibling::string[not(@isNewLine and @lineNumber) and generate-id(preceding-sibling::string[@isNewLine and @lineNumber][1]) = generate-id(current())]/text()" />
</line>
</xsl:template>
<!--Suppress the string elements that do not contain isNewLine or lineNumber attributes in normal processing-->
<xsl:template match="string[not(@isNewLine and @lineNumber)]" />
<!--Empty template to prevent attribute from being copied to output-->
<xsl:template match="@isNewLine" />
</xsl:stylesheet>
You should look into an XML parser for this. You could use either a SAX-based or a DOM-based parser.
- PHP SAX parser
- SimpleXML, PHP's libxml-based DOM-style parser
SAX is more memory-efficient, but a DOM parser may suit your needs better because it is easier to work with.
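If you go the DOM route, a rough sketch using PHP's DOMDocument and DOMXPath classes might look like the following. The function name mergeStringsIntoLines is made up for illustration, and the sketch assumes that continuation 'string' nodes directly follow the 'string' node that starts the line, with nothing but whitespace between them.

```php
<?php
// Hypothetical helper (not part of any library): replaces each
// <string isNewLine="..."> with a <line> element and appends the text of the
// attribute-less <string> siblings that directly follow it.
function mergeStringsIntoLines($xmlSource)
{
    $doc = new DOMDocument();
    // Assumption: whitespace-only text nodes between elements can be dropped,
    // so that nextSibling lands on the next element.
    $doc->preserveWhiteSpace = false;
    $doc->formatOutput = true;
    $doc->loadXML($xmlSource);

    $xpath = new DOMXPath($doc);
    // Take a snapshot of the matches, because the tree is modified in the loop.
    $starts = iterator_to_array($xpath->query('//string[@isNewLine]'));

    foreach ($starts as $start) {
        $line = $doc->createElement('line');
        // Copy every attribute except isNewLine.
        foreach ($start->attributes as $attr) {
            if ($attr->nodeName !== 'isNewLine') {
                $line->setAttribute($attr->nodeName, $attr->nodeValue);
            }
        }
        $text = $start->textContent;
        // Absorb continuation <string> siblings until the next line starts.
        $next = $start->nextSibling;
        while ($next instanceof DOMElement
            && $next->nodeName === 'string'
            && !$next->hasAttribute('isNewLine')) {
            $text .= $next->textContent;
            $following = $next->nextSibling;
            $next->parentNode->removeChild($next);
            $next = $following;
        }
        $line->appendChild($doc->createTextNode($text));
        $start->parentNode->replaceChild($line, $start);
    }
    return $doc->saveXML($doc->documentElement);
}

// Usage with a shortened version of the XML from the question.
$input = '<item>'
    . '<string isNewLine="1" lineNumber="32">some text in new line</string>'
    . '<string>, more text</string>'
    . '</item>';
echo mergeStringsIntoLines($input), "\n";
```

Because the XPath query uses '//', nested 'item' elements are handled at any depth. For large documents the XSLT approach is likely the better fit, since this version keeps the whole tree in memory.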
Use an XSL Transformation.
From the PHP documentation:
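<?php
// Load the source document
$xml = new DOMDocument;
$xml->load('data.xml');
// Load the stylesheet
$xsl = new DOMDocument;
$xsl->load('trans.xsl');
// Attach the stylesheet to the processor and run the transformation
$proc = new XSLTProcessor;
$proc->importStylesheet($xsl);
echo $proc->transformToXML($xml);
?>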
Use the stylesheet from Dimitri's answer as trans.xsl.