
Parsing Complex XML returned from the OPS Patent Database using PHP & SimpleXML

I have been going crazy with SimpleXML trying to get values into usable PHP variables and it's driving me insane.

I am genuinely hoping some of you more talented coders out there can help me out... I will be as thorough as I can...

I am using the Open Patent Service API. Using the following URL I can easily generate a formatted XML file with all the data I need.

<?php

// Patent Reference Number
$ref = "EP2359415";

// URL for XML response
$url =  "http://ops.epo.org/2.6.2/rest-services/published-data/publication/epodoc/".$ref."/biblio";

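// Reading the XML Response
// The third argument (true) tells SimpleXMLElement that $url is a URL to fetch, not an XML string
$sitemap = new SimpleXMLElement($url, 0, true);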

// Echo out values from the XML Data
foreach($needhelp as $here) {
   echo "Need Help Here!";
   // Will be taking data and placing into a database here...
 } ?>

If you open the URL... http://ops.epo.org/2.6.2/rest-services/published-data/publication/epodoc/EP2359415/biblio

You will see how complicated the returned XML is. Basically, I cannot get any values out of the data with PHP loops...

Any help would be greatly appreciated... Dean


I know this is an old question, but I could never get SimpleXML to do anything. Given that this is the only thing that comes up in a Google search about using the European Patent Office API with PHP, I thought I'd document what worked for me...

Here's how I solved it:

# build query url
$patent_url = 'http://ops.epo.org/3.0/rest-services/published-data/search/full-cycle/?q='.urlencode($your_query);

# grab the contents of $patent_url
$patent_raw = file_get_contents($patent_url);

# create an XML parser
$resource = xml_parser_create();

# parse XML into array 
xml_parse_into_struct($resource, $patent_raw, $patent_array);

# close the parser - you want to do this...    
xml_parser_free($resource);

Now you have a standard PHP array ($patent_array) you can iterate through. Note that this is similar to my code, but not exactly the same, so you may have to tweak it if you cut/paste... Of course, you still have to figure out what to do with the ridiculously complex designed-by-committee data structure, but at least it's in mungeable form.
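To give an idea of what iterating through that array looks like, here is a minimal sketch. The XML is a made-up, heavily trimmed sample shaped like what I'd expect an OPS response to look like (the element names and namespace URIs are assumptions, so check them against a real response), and it is parsed from a string so the example runs without network access. Each entry produced by xml_parse_into_struct has a 'tag' and 'type', plus 'attributes' and 'value' when present.

```php
<?php
// Hypothetical, trimmed sample; NOT real OPS output.
$patent_raw = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<ops:world-patent-data xmlns:ops="http://ops.epo.org" xmlns="http://www.epo.org/exchange">
  <exchange-documents>
    <exchange-document country="EP" doc-number="0000001" kind="A1">
      <bibliographic-data>
        <invention-title lang="en">SAMPLE TITLE</invention-title>
        <invention-title lang="de">BEISPIELTITEL</invention-title>
      </bibliographic-data>
    </exchange-document>
  </exchange-documents>
</ops:world-patent-data>
XML;

$parser = xml_parser_create();
// Keep tag and attribute names as written (the default upper-cases them).
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
// Skip values that consist only of whitespace.
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $patent_raw, $patent_array);

// Collect every invention title, keyed by its lang attribute.
$titles = [];
foreach ($patent_array as $node) {
    if ($node['tag'] === 'invention-title' && $node['type'] === 'complete') {
        $lang = $node['attributes']['lang'] ?? 'unknown';
        $titles[$lang] = $node['value'] ?? '';
    }
}

print_r($titles);
```

The result is a flat list in document order, so any nesting has to be rebuilt from the 'level' and 'type' keys if you need it.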

Edit:

While trying to get more complex results, it became clear that EPO data is not strict XML. SimpleXML & the above code both do nothing when trying to parse results. The solution was to use a DOM XML parser, which is fault tolerant. The code I used is described here: http://set.cooki.me/archives/225/php-dom-xml-to-array
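This is not the code from the linked article, just a rough sketch of the DOM approach using PHP's built-in DOMDocument and DOMXPath. The sample XML is made up, and the namespace URI http://www.epo.org/exchange is an assumption about what the response declares, so verify it against the root element of a real response.

```php
<?php
// Hypothetical, trimmed sample; NOT real OPS output.
$patent_raw = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<ops:world-patent-data xmlns:ops="http://ops.epo.org" xmlns="http://www.epo.org/exchange">
  <exchange-documents>
    <exchange-document country="EP" doc-number="0000001" kind="A1">
      <bibliographic-data>
        <invention-title lang="en">SAMPLE TITLE</invention-title>
      </bibliographic-data>
    </exchange-document>
  </exchange-documents>
</ops:world-patent-data>
XML;

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Ask libxml to try to keep parsing after recoverable errors.
$doc->recover = true;
$doc->loadXML($patent_raw);

$xpath = new DOMXPath($doc);
// Bind a prefix of our choosing to the (assumed) default namespace URI.
$xpath->registerNamespace('ex', 'http://www.epo.org/exchange');

$rows = [];
foreach ($xpath->query('//ex:exchange-document') as $docNode) {
    // Query relative to the current exchange-document element.
    $titleNodes = $xpath->query('.//ex:invention-title[@lang="en"]', $docNode);
    $rows[] = [
        'country' => $docNode->getAttribute('country'),
        'number'  => $docNode->getAttribute('doc-number'),
        'kind'    => $docNode->getAttribute('kind'),
        'title'   => $titleNodes->length ? trim($titleNodes->item(0)->textContent) : null,
    ];
}

print_r($rows);
```

Each entry in $rows is a plain associative array, which should be easy to hand to a database insert.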


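// Load the XML straight from the URL
$xml = simplexml_load_file($url);
// Registers the prefix 'os' for later XPath queries; for those queries to match,
// the second argument must be a namespace URI declared in the response, not the request URL
$xml->registerXPathNamespace('os', $url);
// Loop over the direct children of the root element
foreach ($xml->children() as $child)
{
  // your insertion into database
}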
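For what it's worth, here is a small self-contained sketch of how namespaced XPath queries can work with SimpleXML. The XML is a made-up sample and the namespace URI is an assumption; getNamespaces(true) shows which URIs a document really declares, and the second argument of registerXPathNamespace has to be one of those.

```php
<?php
// Hypothetical, trimmed sample; NOT real OPS output.
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<ops:world-patent-data xmlns:ops="http://ops.epo.org" xmlns="http://www.epo.org/exchange">
  <exchange-documents>
    <exchange-document country="EP" doc-number="0000001" kind="A1">
      <bibliographic-data>
        <invention-title lang="en">SAMPLE TITLE</invention-title>
      </bibliographic-data>
    </exchange-document>
  </exchange-documents>
</ops:world-patent-data>
XML;

$xml = simplexml_load_string($sample);

// Namespace URIs declared anywhere in the document, keyed by prefix.
$namespaces = $xml->getNamespaces(true);

// Assumed URI of the default namespace; 'ex' is just a prefix chosen here.
$exchangeNs = 'http://www.epo.org/exchange';
$xml->registerXPathNamespace('ex', $exchangeNs);

$records = [];
foreach ($xml->xpath('//ex:exchange-document') as $document) {
    // Prefixes must be registered again on each element returned by xpath().
    $document->registerXPathNamespace('ex', $exchangeNs);
    $titles = $document->xpath('.//ex:invention-title[@lang="en"]');
    $records[] = [
        'doc_number' => (string) $document['doc-number'],
        'title'      => $titles ? (string) $titles[0] : null,
    ];
}

print_r($records);
```

Casting to (string) matters here: without it you are storing SimpleXMLElement objects rather than their text.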