Transform RSS-Feed into another "standard" XML-Format with PHP

Quick question: I need to transform a default RSS structure into another XML format.

The RSS file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
    <rss version="2.0">
        <channel>
            <title>Name des RSS Feed</title>
            <description>Feed Beschreibung</description>
            <language>de</language>
            <link>http://xml-rss.de</link>
            <lastBuildDate>Sat, 1 Jan 2000 00:00:00 GMT</lastBuildDate>
            <item>
                <title>Titel der Nachricht</title>
                <description>Die Nachricht an sich</description>
                <link>http://xml-rss.de/link-zur-nachricht.htm</link>
                <pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
                <guid>01012000-000000</guid>
            </item>
            <item>
                <title>Titel der Nachricht</title>
                <description>Die Nachricht an sich</description>
                <link>http://xml-rss.de/link-zur-nachricht.htm</link>
                <pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
                <guid>01012000-000000</guid>
            </item>
            <item>
                <title>Titel der Nachricht</title>
                <description>Die Nachricht an sich</description>
                <link>http://xml-rss.de/link-zur-nachricht.htm</link>
                <pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
                <guid>01012000-000000</guid>
            </item>
        </channel>
    </rss>

...and I want to extract only the item elements (with their children and attributes) as XML like:

<?xml version="1.0" encoding="ISO-8859-1"?>
<item>
    <title>Titel der Nachricht</title>
    <description>Die Nachricht an sich</description>
   <link>http://xml-rss.de/link-zur-nachricht.htm</link>
   <pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
   <guid>01012000-000000</guid>
</item>
...

It doesn't have to be stored in a file. I just need the output.

Edit: The RSS file can contain any number of items; this is just a sample. So the solution has to loop over them (while, for, foreach, ...).

I tried different approaches with DOMNode, SimpleXML, XPath, ... but without success.

Thanks chris


A different approach would be to use an XSLT:

$xsl = <<<XSL
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<items>
  <!-- xsl:copy-of must be an empty element; it deep-copies every item by itself -->
  <xsl:copy-of select="//item"/>
</items>
</xsl:template>
</xsl:stylesheet>
XSL;

The stylesheet above has a single rule: it deep-copies all <item> elements from the source XML into the result and ignores everything else. The copies are placed inside an <items> element, which serves as the root node. To process it, you would do:

$xslDoc = new DOMDocument();           // create Doc for XSLT
$xslDoc->loadXML($xsl);                // load stylesheet into it
$xmlDoc = new DOMDocument();           // create Doc for RSS
$xmlDoc->loadXML($xml);                // load your XML/RSS into it
$proc = new XSLTProcessor();           // init XSLT engine
$proc->importStylesheet($xslDoc);      // load stylesheet into engine
echo $proc->transformToXML($xmlDoc);   // output transformed XML

Instead of outputting, you could just write the return value to file.
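
As a rough, self-contained sketch of writing the result to a file instead of echoing it: the shortened feed in $xml and the temporary target path in $target are made up for illustration, and the snippet assumes PHP's xsl extension is enabled.

```php
<?php
// Hypothetical, shortened sample feed standing in for your real RSS string.
$xml = '<rss version="2.0"><channel><title>Feed</title>'
     . '<item><title>Erste Nachricht</title></item>'
     . '<item><title>Zweite Nachricht</title></item>'
     . '</channel></rss>';

// Stylesheet that copies every <item> into a new <items> root element.
$xsl = <<<XSL
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<items>
  <xsl:copy-of select="//item"/>
</items>
</xsl:template>
</xsl:stylesheet>
XSL;

$xslDoc = new DOMDocument();
$xslDoc->loadXML($xsl);
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($xml);

$proc = new XSLTProcessor();
$proc->importStylesheet($xslDoc);

$result = $proc->transformToXML($xmlDoc);          // string on success
$target = tempnam(sys_get_temp_dir(), 'items_');   // illustrative target path
if (is_string($result)) {
    file_put_contents($target, $result);           // write the transformed XML to disk
}
```

XSLTProcessor also offers transformToUri(), which writes the result straight to a target URI, if you prefer to skip the intermediate string.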

Further reading:

  • http://de3.php.net/manual/en/class.xsltprocessor.php
  • http://www.w3.org/TR/xslt#copy-of


What you ask for is hardly a transformation. You are basically just extracting the <item> elements as they are. Also, the result you give is not well-formed XML, as it lacks a single root node.

Apart from that, you can simply do it like this:

$dom = new DOMDocument;           // init new DOMDocument
$dom->loadXML($xml);              // load some XML into it

$xpath = new DOMXPath($dom);      // create a new XPath
$nodes = $xpath->query('//item'); // Find all item elements
foreach($nodes as $node) {        // Iterate over found item elements
    echo $dom->saveXML($node);    // output the item node's outer XML
}

The above would echo the <item> nodes. Instead of echoing, you could buffer the output, concatenate it into a string, or collect it in an array and implode it, and then write the result to a file.
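
A minimal sketch of the collect-and-implode variant, assuming the feed is already available as a string. The shortened feed in $xml is a made-up sample, and $chunks and $output are just illustrative names.

```php
<?php
// Hypothetical, shortened sample feed; in practice $xml holds your RSS string.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Name des RSS Feed</title>
    <item><title>Erste Nachricht</title><guid>1</guid></item>
    <item><title>Zweite Nachricht</title><guid>2</guid></item>
  </channel>
</rss>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);

$xpath  = new DOMXPath($dom);
$chunks = array();                       // one serialized <item> per entry
foreach ($xpath->query('//item') as $node) {
    $chunks[] = $dom->saveXML($node);    // serialize only this node
}

$output = implode("\n", $chunks);        // join the fragments into one string
echo $output;
```

Keeping the fragments in an array until the end makes it easy to count, filter, or reorder the items before anything is sent to the client.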

If you want to do it properly with DOM (and a root node), the full code would be:

$dom = new DOMDocument;                          // init DOMDocument for RSS
$dom->loadXML($xml);                             // load some XML into it

$items = new DOMDocument;                        // init DOMDocument for new file
$items->preserveWhiteSpace = FALSE;              // dump whitespace
$items->formatOutput = TRUE;                     // make output pretty
$items->loadXML('<items/>');                     // create root node

$xpath = new DOMXPath($dom);                     // create a new XPath
$nodes = $xpath->query('//item');                // Find all item elements
foreach($nodes as $node) {                       // iterate over found item nodes
    $copy = $items->importNode($node, TRUE);     // deep copy of item node
    $items->documentElement->appendChild($copy); // append item nodes
}
echo $items->saveXML();                          // outputs the new document

Instead of saveXML(), you'd use save('filename.xml') to write it to a file.
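
The output in the question declares ISO-8859-1, so here is a hedged sketch of one way to get that declaration with DOM: pass the version and encoding to the DOMDocument constructor of the target document and build the root node with createElement(). The shortened feed in $xml is a made-up sample.

```php
<?php
// Hypothetical, shortened sample feed; in practice $xml holds your RSS string.
$xml = '<rss version="2.0"><channel><title>Feed</title>'
     . '<item><title>Erste Nachricht</title></item>'
     . '<item><title>Zweite Nachricht</title></item>'
     . '</channel></rss>';

$dom = new DOMDocument;
$dom->loadXML($xml);

// Target document: the constructor arguments set the XML version and the
// encoding used in the declaration and when serializing.
$items = new DOMDocument('1.0', 'ISO-8859-1');
$items->formatOutput = true;
$root = $items->appendChild($items->createElement('items'));

foreach ($dom->getElementsByTagName('item') as $node) {
    $root->appendChild($items->importNode($node, true)); // deep copy into target
}

$out = $items->saveXML();
echo $out;
```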


Try:

<?php
$xmlFile = new DOMDocument();          // Instantiate new DOMDocument
$xmlFile->load("URL TO RSS/XML FILE"); // Load in XML/RSS file

$title = array();
$description = array();
$link = array();
$pubDate = array();
$guid = array();

// Loop over every <item>. Querying the item itself (not the whole document)
// skips the channel-level <title>, <description> and <link>.
foreach ($xmlFile->getElementsByTagName("item") as $item)
{
    $title[] = $item->getElementsByTagName("title")->item(0)->nodeValue; // Value of this item's <title>
    $description[] = $item->getElementsByTagName("description")->item(0)->nodeValue;
    $link[] = $item->getElementsByTagName("link")->item(0)->nodeValue;
    $pubDate[] = $item->getElementsByTagName("pubDate")->item(0)->nodeValue;
    $guid[] = $item->getElementsByTagName("guid")->item(0)->nodeValue;
}
?>

This is untested, but the arrays

$title, $description, $link, $pubDate, $guid

should then hold all of the data that you need!

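EDIT: OK, so another approach:

<?php
$xmlString = file_get_contents("URL TO RSS/XML FILE");
// Collect every value of one tag. "#" is the delimiter, so "/" needs no escaping;
// (.*?) with the "s" flag matches any content up to the nearest closing tag.
function tagValues($tag, $xmlString)
{
    preg_match_all("#<$tag>(.*?)</$tag>#s", $xmlString, $matches);
    return $matches[1];
}
$titles = tagValues("title", $xmlString);
$descriptions = tagValues("description", $xmlString);
$links = tagValues("link", $xmlString);
$pubDates = tagValues("pubDate", $xmlString);
$guids = tagValues("guid", $xmlString);
?>

In this example each variable holds an array of the matched values. Note that the patterns also match the channel-level <title>, <description> and <link>, and that regular expressions are fragile for XML (CDATA sections or attributes on the tags would break them), so the DOM-based approaches are more robust.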
