Transform RSS-Feed into another "standard" XML-Format with PHP
Quick question: I need to transform a standard RSS structure into another XML format.
The RSS file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Name des RSS Feed</title>
<description>Feed Beschreibung</description>
<language>de</language>
<link>http://xml-rss.de</link>
<lastBuildDate>Sat, 1 Jan 2000 00:00:00 GMT</lastBuildDate>
<item>
<title>Titel der Nachricht</title>
<description>Die Nachricht an sich</description>
<link>http://xml-rss.de/link-zur-nachricht.htm</link>
<pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
<guid>01012000-000000</guid>
</item>
<item>
<title>Titel der Nachricht</title>
<description>Die Nachricht an sich</description>
<link>http://xml-rss.de/link-zur-nachricht.htm</link>
<pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
<guid>01012000-000000</guid>
</item>
<item>
<title>Titel der Nachricht</title>
<description>Die Nachricht an sich</description>
<link>http://xml-rss.de/link-zur-nachricht.htm</link>
<pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
<guid>01012000-000000</guid>
</item>
</channel>
</rss>
...and I want to extract only the item elements (with their children and attributes), producing XML like:
<?xml version="1.0" encoding="ISO-8859-1"?>
<item>
<title>Titel der Nachricht</title>
<description>Die Nachricht an sich</description>
<link>http://xml-rss.de/link-zur-nachricht.htm</link>
<pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
<guid>01012000-000000</guid>
</item>
...
It doesn't have to be stored in a file. I just need the output.
Edit: One more thing you need to know: the RSS file can contain any number of items. This is just a sample, so the items have to be processed in a loop (while, for, foreach, ...).
I tried different approaches with DOMNode, SimpleXML, XPath, ... but without success.
Thanks chris
A different approach would be to use an XSLT:
$xsl = <<< XSL
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<items>
<xsl:copy-of select="//item"/>
</items>
</xsl:template>
</xsl:stylesheet>
XSL;
The above stylesheet has just one rule: deep-copy all <item> elements from the source XML into the result and ignore everything else in the source. The copied nodes are placed inside an <items> element, which serves as the root node. To process this, you'd do
$xslDoc = new DOMDocument(); // create Doc for XSLT
$xslDoc->loadXML($xsl); // load stylesheet into it
$xmlDoc = new DOMDocument(); // create Doc for RSS
$xmlDoc->loadXML($xml); // load your XML/RSS into it
$proc = new XSLTProcessor(); // init XSLT engine
$proc->importStylesheet($xslDoc); // load stylesheet into engine
echo $proc->transformToXML($xmlDoc); // output transformed XML
Instead of outputting, you could just write the return value to file.
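As a minimal end-to-end sketch of the write-to-file variant: the sample feed below is a shortened stand-in for the one in the question, and the target path is a hypothetical location chosen only for illustration. It assumes the PHP xsl extension is enabled.

```php
<?php
// Shortened sample feed (illustrative data); any RSS 2.0 string works here.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Name des RSS Feed</title>
<item><title>Titel der Nachricht</title><guid>01012000-000000</guid></item>
<item><title>Titel der Nachricht</title><guid>01012000-000001</guid></item>
</channel>
</rss>
XML;

// Stylesheet that deep-copies every <item> into a new <items> root.
$xsl = <<<XSL
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<items>
<xsl:copy-of select="//item"/>
</items>
</xsl:template>
</xsl:stylesheet>
XSL;

$xslDoc = new DOMDocument();
$xslDoc->loadXML($xsl);
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($xml);

$proc = new XSLTProcessor();
$proc->importStylesheet($xslDoc);

// transformToXML() returns the result as a string, or false on failure.
$result = $proc->transformToXML($xmlDoc);
if ($result === false) {
    throw new RuntimeException('XSLT transformation failed');
}

// Hypothetical target path; adjust to your setup.
$target = sys_get_temp_dir() . '/items.xml';
file_put_contents($target, $result);
```

Checking the return value before writing avoids creating an empty file when the stylesheet or the feed is malformed.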
Further reading:
- http://de3.php.net/manual/en/class.xsltprocessor.php
- http://www.w3.org/TR/xslt#copy-of
What you ask for is hardly a transformation. You are basically just extracting the <item> elements as they are. Also, the result you give is not well-formed XML, as it lacks a single root node.
Apart from that, you can simply do it like this:
$dom = new DOMDocument; // init new DOMDocument
$dom->loadXML($xml); // load some XML into it
$xpath = new DOMXPath($dom); // create a new XPath
$nodes = $xpath->query('//item'); // Find all item elements
foreach($nodes as $node) { // Iterate over found item elements
echo $dom->saveXML($node); // output the item node's outer XML
}
The above would echo the <item> nodes. Instead of echoing, you could buffer the output, concatenate it into a string, or collect it in an array and implode it, and then write the result to a file.
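The collect-and-implode variant could look roughly like this. It is a sketch only: the shortened sample feed and the variable names $fragments and $output are assumptions made for illustration.

```php
<?php
// Shortened sample feed (illustrative data).
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Name des RSS Feed</title>
<item><title>Titel der Nachricht</title><guid>01012000-000000</guid></item>
<item><title>Titel der Nachricht</title><guid>01012000-000001</guid></item>
</channel>
</rss>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Collect each serialized <item> instead of echoing it right away.
$fragments = array();
foreach ($xpath->query('//item') as $node) {
    $fragments[] = $dom->saveXML($node); // node only, no XML declaration
}

// Join the fragments; from here the string can be echoed or written to a file.
$output = implode("\n", $fragments);
echo $output;
```

Because saveXML() is called with a node argument, each fragment contains only that element and its children, so the joined string has no root node and no XML declaration.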
If you want to do it properly with DOM (and a root node), the full code would be:
$dom = new DOMDocument; // init DOMDocument for RSS
$dom->preserveWhiteSpace = FALSE; // dump whitespace (must be set before loading)
$dom->loadXML($xml); // load some XML into it
$items = new DOMDocument; // init DOMDocument for new file
$items->formatOutput = TRUE; // make output pretty
$items->loadXML('<items/>'); // create root node
$xpath = new DOMXPath($dom); // create a new XPath
$nodes = $xpath->query('//item'); // Find all item elements
foreach($nodes as $node) { // iterate over found item nodes
$copy = $items->importNode($node, TRUE); // deep copy of item node
$items->documentElement->appendChild($copy); // append item nodes
}
echo $items->saveXML(); // outputs the new document
Instead of saveXML(), you'd use save('filename.xml') to write it to a file.
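The question asks for output declared as ISO-8859-1. One way to get that declaration, sketched here under the assumption that passing the encoding to the DOMDocument constructor is acceptable for your setup, is to build the target document explicitly rather than loading '<items/>'. The sample feed is shortened and illustrative.

```php
<?php
// Shortened sample feed (illustrative data).
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Name des RSS Feed</title>
<item><title>Titel der Nachricht</title><guid>01012000-000000</guid></item>
<item><title>Titel der Nachricht</title><guid>01012000-000001</guid></item>
</channel>
</rss>
XML;

$dom = new DOMDocument;
$dom->preserveWhiteSpace = false; // drop whitespace-only text nodes on load
$dom->loadXML($xml);

// Target document with the desired version and encoding declaration.
$items = new DOMDocument('1.0', 'ISO-8859-1');
$items->formatOutput = true;
$root = $items->appendChild($items->createElement('items'));

// Works for any number of <item> elements in the feed.
foreach ($dom->getElementsByTagName('item') as $node) {
    $root->appendChild($items->importNode($node, true)); // deep copy
}

$out = $items->saveXML();
echo $out;
```

The output still has an <items> root element, since a document consisting only of sibling <item> elements would not be well-formed.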
Try:
<?php
$xmlFile = new DOMDocument(); // Instantiate new DOMDocument
$xmlFile->load("URL TO RSS/XML FILE"); // Load in XML/RSS file
$title = $description = $link = $pubDate = $guid = array();
// Loop over every <item>, however many the feed contains
foreach ($xmlFile->getElementsByTagName("item") as $item)
{
    // Look up each child inside the current <item>, not in the whole document
    $title[] = $item->getElementsByTagName("title")->item(0)->nodeValue;
    $description[] = $item->getElementsByTagName("description")->item(0)->nodeValue;
    $link[] = $item->getElementsByTagName("link")->item(0)->nodeValue;
    $pubDate[] = $item->getElementsByTagName("pubDate")->item(0)->nodeValue;
    $guid[] = $item->getElementsByTagName("guid")->item(0)->nodeValue;
}
?>
Untested, but the arrays $title, $description, $link, $pubDate and $guid should be populated with all of the data that you need!
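EDIT: OK, so another approach:
<?php
$xmlString = file_get_contents("URL TO RSS/XML FILE");
// "#" is the delimiter so "</" needs no escaping; (.*?) captures lazily, "s" lets "." span newlines
preg_match_all("#<title>(.*?)</title>#s", $xmlString, $m);
$titles = $m[1];
preg_match_all("#<description>(.*?)</description>#s", $xmlString, $m);
$descriptions = $m[1];
preg_match_all("#<link>(.*?)</link>#s", $xmlString, $m);
$links = $m[1];
preg_match_all("#<pubDate>(.*?)</pubDate>#s", $xmlString, $m);
$pubDates = $m[1];
preg_match_all("#<guid>(.*?)</guid>#s", $xmlString, $m);
$guids = $m[1];
?>
In this example each variable is filled with an array of the matched values. Note that $titles, $descriptions and $links also contain the channel-level values, since the patterns match every occurrence in the file, and that a regex will not cope with attributes or CDATA sections the way an XML parser does.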