PHP Table into array

I'm trying to read a table from an HTML file into an array, but I'm stuck. Any help would be appreciated.

The contents of each table element should be stored in one array value.

Example: $arr[1] = DER HE1 ges 1

PHP:

<?php
      libxml_use_internal_errors(true);
      $i=0;
      // new dom object  
      $dom = new DOMDocument();  

      //load the html  
      $html = $dom->loadHTMLFile("106642new.html");  

      //discard white space   
      $dom->preserveWhiteSpace = false;   

      //the table by its tag name  
      $tables = $dom->getElementsByTagName('table');   

      //get all rows from the table  
      $rows = $tables->item(0)->getElementsByTagName('tr');   
      // $test = $tables->item(0)->getElementsByTagName('td');   

      // loop over the table rows  
      foreach ($rows as $row) {
          // get each column by tag name  
          $cols = $row->getElementsByTagName('td');  
          $i= $i + 1 ;
          $value = "Nummer: ".$i.":  ".$cols->item(0)->nodeValue.PHP_EOL;
          // $value = "test: ".$i.":  ".$cols->item(0)->nodeValue.PHP_EOL;
          $cols = array(1, 2, 3, 4, 5);
          echo $value;
          //  $cols[$i] = $row; 
          // echo the values    
          //echo $cols->item(0)->nodeValue ; 
      }   
?>

HTML:

<body bgcolor="#FFFFFF" topmargin="0" leftmargin="0" marginwidth="0" marginheight="0">

          <div align=left>

          <table BORDER=0 CELLSPACING=0 CELLPADDING=0 WIDTH="100%" height="100%">

          <tr><td valign="top">&nbsp</td></tr>

          <tr><td valign="top">

          <p font class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</font></p>
          <br><div font class="lNameHeader"> </font> </div><table border=1>
          <tr class="AccentDark">
           <td align="left" width="65" class="tableHeader"></td>
           <td align="center" width="auto" class="tableHeader">Maandag</td>
           <td align="center" width="auto" class="tableHeader">Dinsdag</td>
           <td align="center" width="auto" class="tableHeader">Woensdag</td>
           <td align="center" width="auto" class="tableHeader">Donderdag</td>
           <td align="center" width="auto" class="tableHeader">Vrijdag</td>
          </tr><tr>
           <td align="left" width="50" class="tableHeader">1e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell"></td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">WAS</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE09</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">econ</td>
           <td align="left" width="9" class="tableCell">5</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">WIK</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC17</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">biol</td>
           <td align="left" width="9" class="tableCell">4</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">OTT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC01</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">dutl</td>
           <td align="left" width="9" class="tableCell">6</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell"></td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">2e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">KEJ</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">wisA</td>
           <td align="left" width="9" class="tableCell">3</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">BRT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE05</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">netl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">OTT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC01</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">dutl</td>
           <td align="left" width="9" class="tableCell">6</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">BAU</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HG01</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">lo</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">MET</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HD02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">entl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">3e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">WAS</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE07</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">econ</td>
           <td align="left" width="9" class="tableCell">5</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">MET</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HD02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">entl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">WAS</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE05</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">econ</td>
           <td align="left" width="9" class="tableCell">5</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">BAU</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HG01</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">lo</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">KEJ</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">wisA</td>
           <td align="left" width="9" class="tableCell">3</td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">4e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell"></td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">DER</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE08</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">ges</td>
           <td align="left" width="9" class="tableCell">1</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">KEJ</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC06</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">wisA</td>
           <td align="left" width="9" class="tableCell">3</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">DER</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE10</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">ges</td>
           <td align="left" width="9" class="tableCell">1</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">CHR</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HB15</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">ckv</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">5e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">DOC</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE09</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">m&o</td>
           <td align="left" width="9" class="tableCell">2</td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell"></td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell"></td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">MET</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HD02</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">entl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">BRT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HE05</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">netl</td>
           <td align="left" width="9" class="tableCell"></td>
          </tr>
          </table>
          </td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">OTT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC03</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">dutl</td>
           <td align="left" width="9" class="tableCell">6</td>
          </tr>
          </table>
          </td>
          </tr>
          <tr>
           <td align="left" width="50" class="tableHeader">6e uur</td>
           <td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
          <tr>
           <td align="left" width="41" class="tableCell">OTT</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="75" class="tableCell">HC03</td>
           <td align="left" width="3" class="tableCell">&nbsp</td>
           <td align="left" width="73" class="tableCell">dutl</td>
           <td align="left" width="9" class="tableCell">6</td>
          </tr>
          </table>
          </td>


I think the problem is that your first table is a container for other tables. If you want the contents of all the tables, then you should also iterate through the list of tables.

If you just want the contents of an inner table, then first locate it in the DOM. I suggest finding the first table, then getting all the table elements inside it and iterating through them.
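A minimal sketch of that idea, assuming the goal is one array value per innermost table. The inline $html sample is a hypothetical stand-in for 106642new.html, and using DOMXPath to pick tables that contain no further table is my assumption about how to locate the inner tables, not something from the question:

```php
<?php
// Hypothetical inline sample standing in for 106642new.html.
$html = <<<HTML
<table><tr><td>
  <table border="1">
    <tr><td>1e uur</td>
      <td><table><tr><td>WAS</td><td>&nbsp;</td><td>HE09</td><td>&nbsp;</td><td>econ</td><td>5</td></tr></table></td>
      <td><table><tr><td></td><td>&nbsp;</td><td></td><td>&nbsp;</td><td></td><td></td></tr></table></td>
    </tr>
    <tr><td>4e uur</td>
      <td><table><tr><td>DER</td><td>&nbsp;</td><td>HE08</td><td>&nbsp;</td><td>ges</td><td>1</td></tr></table></td>
    </tr>
  </table>
</td></tr></table>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$arr = [];
// Innermost tables only: tables that do not contain another table.
foreach ($xpath->query('//table[not(.//table)]') as $table) {
    $parts = [];
    foreach ($xpath->query('.//td', $table) as $td) {
        // Drop non-breaking spaces (U+00A0). An unterminated "&nbsp" may
        // survive as literal text with some libxml versions, so drop that too.
        $text = trim(str_replace(["\u{00A0}", '&nbsp'], ' ', $td->textContent));
        if ($text !== '') {
            $parts[] = $text;
        }
    }
    // An empty slot stays in the array as '' so the positions are preserved.
    $arr[] = implode(' ', $parts);
}

print_r($arr);
```

Empty timetable slots are kept as empty strings here so that the position in $arr still maps to a day and hour; filter them out if you only want the filled cells.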

var_dump is a good starting point for debugging. You don't need anything beyond what you already have; just debug and test more :)
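As a rough illustration of that kind of debugging, the sketch below dumps how many tables the parser found and how many rows and cells each one reports. The one-line $html string is a hypothetical sample, not the real file. The point it shows is that getElementsByTagName() searches all descendants, so the outer table also reports the rows and cells of the tables nested inside it:

```php
<?php
// Hypothetical sample: one outer table wrapping one inner table.
$html = '<table><tr><td><table><tr><td>WAS</td><td>HE09</td></tr></table></td></tr></table>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

$tables = $dom->getElementsByTagName('table');
var_dump($tables->length); // how many tables the parser found

$report = [];
foreach ($tables as $table) {
    // Counts include rows and cells of nested tables, because
    // getElementsByTagName() looks at every descendant.
    $report[] = [
        'path'  => $table->getNodePath(),
        'rows'  => $table->getElementsByTagName('tr')->length,
        'cells' => $table->getElementsByTagName('td')->length,
    ];
}

var_dump($report);
```

If the outer table reports more rows than you can see at its own level, the loop over $rows is walking into the nested tables as well.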


I'm guessing that the fact that it's invalid HTML/XML is screwing you over.

You're using loadHTMLFile(), which, unlike the XML loaders, does not need well-formed input, but the way it recovers from broken markup may not give you the tree you expect.

If the parser were treating the file as strict XML, the "<br>" would not be read as a stand-alone node but as the start of one, meaning that everything after it would become sub-nodes of "<br>".

Furthermore this line here doesn't make any sense:

<p font class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</font></p>

The <font> tag has been obsolete for years and should never be used, but more importantly this is not a font tag at all: it is a p tag that is then closed as if it were a font tag. Just do:

<p class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</p>

So the cause may simply be that your HTML/XML is invalid.
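One way to check this rather than guess is to ask libxml what it complained about while parsing. The sketch below is only an illustration: the $html fragment is a hypothetical sample that mimics the markup in the question, and which messages appear (if any) depends on the libxml version PHP was built against:

```php
<?php
// Hypothetical fragment; the stray "font" and the missing ';' after &nbsp
// mirror the markup in the question.
$html = '<p font class="Header">Basisrooster</font></p>'
      . '<table><tr><td>&nbsp</td><td>WAS</td></tr></table>';

libxml_use_internal_errors(true); // collect parser messages instead of emitting warnings
$dom = new DOMDocument();
$loaded = $dom->loadHTML($html);

$messages = [];
foreach (libxml_get_errors() as $error) {
    // Each entry is a LibXMLError with level, code, line and message properties.
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}
libxml_clear_errors();

var_dump($loaded);   // whether the parser produced a document at all
print_r($messages);  // what it complained about, if anything
echo $dom->getElementsByTagName('td')->length, PHP_EOL; // cells that survived recovery
```

If the document loads and the expected cells are present, the malformed markup is probably not what is blocking you, and the nesting of the tables is the more likely cause.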

(Dan Bizdadea also has a good point.)
