PHP, XML, Accessing Attributes

I'm having some trouble accessing attributes in my XML. My code is below. Initially I had two loops, and that worked with no problems.

I would first get the image names and then use the second loop to get the story heading and story details, then insert everything into the database. I want to tidy up the code and use only one loop. My image name is stored in the Href attribute.

Sample XML layout (http://pastie.org/1850682). The XML layout is a bit messy, which is why I used two loops.

$xml = new SimpleXMLElement('entertainment/Showbiz.xml', null, true);

    // Get story images
    //$i=0;
    //$image = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent/NewsComponent/NewsComponent/ContentItem');
  //  foreach($image as $imageNode){
    //  $attributeArray = $imageNode->attributes(); 
    //  if ($attributeArray != ""){
    //      $imageArray[$i] = $attributeArray;
    //      $i++;
    //  }
    //}

// Get story header & detail
$i=0;
$story = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent');
foreach($story as $contentItem){
    //$dbImage = $imageArray[$i]['Href'];
    foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.head/hedline/hl1') as $headline){
        $strDetail = "";
        foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.content/p') as $detail){
            $strDetail .= '<p>'.$detail.'</p>';
            foreach($contentItem->xpath('NewsComponent/NewsComponent/ContentItem') as $imageNode){
                $dbImage = $imageNode->attributes();    
            }
        }

        $link = getUnique($headline);

        $sql = "INSERT INTO tablename (headline, detail, image, link) VALUES ('".mysql_real_escape_string($headline)."', '".mysql_real_escape_string($strDetail)."', '".mysql_real_escape_string($dbImage)."', '".$link."')";
        if (mysql_query($sql, $db) or die(mysql_error())){
            echo "Loaded ";
        }else{
            echo "Not Loaded "; 
        }

    }
    $i++;
}

I think I'm close to getting it. I tried putting a few echo statements in the fourth nested foreach loop, but nothing was printed, so it's not executing that loop. I've been at this for a few hours and have googled as well, but I just can't manage to get it.

If all else fails, I'll just go back to using two loops.

Regards, Stephen


This was pretty difficult to follow. I've simplified the structure so we can see the parts of the hierarchy we care about.

[Image: simplified view of the XML hierarchy]

It appears that the NewsComponent that has a Duid attribute is what defines/contains one complete news piece. Of its two children, the first child NewsComponent contains the summary and text, while the second child NewsComponent contains the image.

Your initial XPath query is for 'NewsItem/NewsComponent/NewsComponent/NewsComponent', which is the first NewsComponent child (the one with the body text). You can't find the image from that point because the image isn't within that NewsComponent; you've gone one level too deep. (I was tipped off by the fact I got a PHP Notice: Undefined variable: dbImage.) Thus, drop your initial XPath query back a level, and add that extra level to your subsequent XPath queries where needed.

From this:

$story = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent');
foreach($story as $contentItem){
  foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.head/hedline/hl1') as $headline){
    foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.content/p') as $detail){
      foreach($contentItem->xpath('NewsComponent/NewsComponent/ContentItem') as $imageNode){ /* ... */ }}}}

to this:

$story = $xml->xpath('NewsItem/NewsComponent/NewsComponent');
foreach($story as $contentItem){
  foreach($contentItem->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.head/hedline/hl1') as $headline){
    foreach($contentItem->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.content/p') as $detail){
      foreach($contentItem->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem') as $imageNode){ /* ... */ }}}}

However, the image still doesn't work after that. Because you're using loops (sometimes unnecessarily), $dbImage gets reassigned to an empty string. The first ContentItem has the Href attribute, which gets assigned to $dbImage. But then it loops to the next ContentItem, which has no attributes and therefore overwrites $dbImage with an empty value. I'd recommend modifying that XPath query to find only ContentItems that have an Href attribute, like this:

->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem[@Href]')

That should do it.
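To see the overwrite and the predicate side by side, here is a minimal sketch. The XML string is a made-up two-item fragment rather than the real Showbiz.xml, and the file name photo.jpg is an assumption for illustration only.

```php
<?php
// Hypothetical fragment: one ContentItem with an Href, one without.
$xml = new SimpleXMLElement(
    '<NewsComponent>'
    . '<ContentItem Href="photo.jpg"/>'
    . '<ContentItem><DataContent/></ContentItem>'
    . '</NewsComponent>'
);

// Without the predicate, every ContentItem is visited and the last one wins.
// The last one has no Href, so the value ends up empty.
$last = '';
foreach ($xml->xpath('ContentItem') as $node) {
    $last = (string) $node['Href'];
}

// With the predicate, only nodes that carry an Href attribute match.
$matches = $xml->xpath('ContentItem[@Href]');
$dbImage = $matches ? (string) $matches[0]['Href'] : '';

var_dump($last);    // empty string
var_dump($dbImage); // "photo.jpg"
```

Casting to string matters here: without it you are holding a SimpleXMLElement object rather than the attribute's text.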


Other thoughts

Refactor to clean up this code, if/where possible.

As I mentioned, sometimes you are looping and nesting when you don't need to, and it just ends up being harder to follow and potentially introducing logical bugs (like the image one). It seems that the structure of this file will always be consistent. If so, you can forgo some looping and go straight for the pieces of data you're looking for. You could do something like this:

// Get story header & detail
$stories = $xml->xpath('/NewsML/NewsItem/NewsComponent/NewsComponent');
foreach ($stories as $story) {
    $headlineItem = $story->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.head/hedline/hl1');
    $headline = $headlineItem[0];

    $detailItems = $story->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.content/p');
    $strDetail = '<p>' . implode('</p><p>', $detailItems) . '</p>';

    $imageItem = $story->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem[@Href]');
    $imageAtts = $imageItem[0]->attributes();
    $dbImage = $imageAtts['Href'];

    $link = getUnique($headline);

    $sql = "INSERT INTO tablename (headline, detail, image, link) VALUES ('".mysql_real_escape_string($headline)."', '".mysql_real_escape_string($strDetail)."', '".mysql_real_escape_string($dbImage)."', '".$link."')";
    // die() ends the script on failure, so reaching the echo means the insert succeeded.
    mysql_query($sql, $db) or die(mysql_error());
    echo "Loaded ";
}
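One caveat with the loop above: it indexes [0] on each XPath result, which fails if a story happens to lack a headline or an image. A small guard can cover that. This is only a sketch: firstText is a hypothetical helper name (not part of SimpleXML), and the inline XML is a cut-down stand-in that mirrors the structure described earlier, not the real feed.

```php
<?php
/**
 * Return the first XPath match as a string, or $default when nothing matches.
 * Hypothetical helper for illustration.
 */
function firstText(SimpleXMLElement $node, string $path, string $default = ''): string
{
    $found = $node->xpath($path);
    return $found ? (string) $found[0] : $default;
}

// Made-up story: first child holds the text, second child holds the image.
$story = new SimpleXMLElement(
    '<NewsComponent Duid="story1">'
    . '<NewsComponent><ContentItem><DataContent><nitf><body>'
    . '<body.head><hedline><hl1>Sample headline</hl1></hedline></body.head>'
    . '<body.content><p>First.</p><p>Second.</p></body.content>'
    . '</body></nitf></DataContent></ContentItem></NewsComponent>'
    . '<NewsComponent><NewsComponent><NewsComponent>'
    . '<ContentItem Href="photo.jpg"/><ContentItem/>'
    . '</NewsComponent></NewsComponent></NewsComponent>'
    . '</NewsComponent>'
);

$headline = firstText($story, 'NewsComponent/ContentItem/DataContent/nitf/body/body.head/hedline/hl1');

// Selecting the attribute node directly (@Href) skips items without one.
$dbImage = firstText($story, 'NewsComponent/NewsComponent/NewsComponent/ContentItem/@Href');

// Convert each <p> node to plain text before joining.
$paragraphs = array_map('strval', $story->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.content/p'));
$strDetail = $paragraphs ? '<p>' . implode('</p><p>', $paragraphs) . '</p>' : '';
```

With the values already plain strings, they can be passed to the escaping and insert code unchanged, and a missing node yields an empty string instead of an error.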