Extending DOMElement in PHP with registerNodeClass

registerNodeClass is great for extending the various DOMNode-based DOM classes in PHP, but I need to go one level deeper.

I've created an extDOMElement that extends DOMElement. This works great with registerNodeClass, but I would like to have something that works more like this: registerNodeClass("DOMElement->nodeName='XYZ'", 'extDOMXYZElement')

Consider the following XML document, animals.xml:

<animals>
    <dog name="fido" />
    <dog name="lucky" />
    <cat name="scratchy" />
    <horse name="flicka" />
</animals>

Consider the following code:

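class extDOMDocument extends DOMDocument {
    public function processAnimals() {
        $animals = $this->documentElement->childNodes;
        foreach ($animals as $animal) {
            // childNodes also contains whitespace text nodes; skip them
            if ($animal instanceof extDOMElement) {
                $animal->process();
            }
        }
    }
}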


class extDOMElement extends DOMElement {
    public function process() {
        if ($this->nodeName == 'dog') {
            $this->bark();
        } elseif ($this->nodeName == 'cat') {
            $this->meow();
        } elseif ($this->nodeName == 'horse') {
            $this->whinny();
        }
        $this->setAttribute('processed', 'true');
    }
    private function bark() {
        // method calls inside a double-quoted string need the {$...} syntax
        echo "{$this->getAttribute('name')} the {$this->nodeName} barks! ";
    }
    private function meow() {
        echo "{$this->getAttribute('name')} the {$this->nodeName} meows! ";
    }
    private function whinny() {
        echo "{$this->getAttribute('name')} the {$this->nodeName} whinnies! ";
    }
}

$doc = new extDOMDocument();
$doc->registerNodeClass('DOMElement', 'extDOMElement');
$doc->load('animals.xml');
$doc->processAnimals();
$doc->save('animals_processed_' . time() . '.xml');

Output: fido the dog barks! lucky the dog barks! scratchy the cat meows! flicka the horse whinnies!

I don't want to have to put bark(), meow() and whinny() into extDOMElement - I want to put them into extDOMDogElement, extDOMCatElement and extDOMHorseElement, respectively.

I've looked at the Decorator and Strategy patterns here, but I'm not exactly sure how to proceed. The current setup works OK, but I'd prefer to have shared properties and methods in extDOMElement with separate classes for each ElementName, so that I can separate methods and properties specific to each Element out of the main classes.


I had the same problem. My solution was to write my own parser based on the XMLReader extension. The resulting AdvancedParser class works very well for me. A separate element class can be registered for each element name. By extending AdvancedParser and overriding the getClassForElement() method, it is also possible to compute the name of the desired class dynamically from the element name.

/**
 * Specialized Xml parser with element based class registry.
 * 
 * This class uses the XMLReader extension for document parsing and creates
 * a DOM tree with individual DOMElement subclasses for each element type.
 * 
 * @author       Andreas Traber < a.traber (at) rivo-systems (dot) com >
 * 
 * @since        April 21, 2012
 * @package      XML
 */
class AdvancedParser
{
    /**
     * Map with registered classes.
     * @var array
     */
    protected $_elementClasses = array();

    /**
     * Default class for unknown elements.
     * @var string
     */
    protected $_defaultElementClass = 'DOMElement';

    /**
     * The reader for Xml parsing.
     * @var XMLReader
     */
    protected $_reader;

    /**
     * The document object.
     * @var DOMDocument
     */
    protected $_document;

    /**
     * The current parsing element.
     * @var DOMElement
     */
    protected $_currentElement;

    /**
     * Gets the fallback class for unknown elements.
     * 
     * @return string
     */
    public function getDefaultElementClass()
    {
        return $this->_defaultElementClass;
    }

    /**
     * Sets the fallback class for unknown elements.
     * 
     * @param string $class
     * @return void
     * @throws Exception $class is not a subclass of DOMElement.
     */
    public function setDefaultElementClass($class)
    {
        switch (true) {
            case $class === null:
                $this->_defaultElementClass = 'DOMElement';
                break;

            // $class is a class name (string), so instanceof would always fail here
            case !is_a($class, 'DOMElement', true):
                throw new Exception($class.' must be a subclass of DOMElement');

            default:
                $this->_defaultElementClass = $class;
        }
    }

    /**
     * Registers the class for a specified element name.
     * 
     * @param string $elementName
     * @param string $class
     * @return void
     * @throws Exception $class is not a subclass of DOMElement.
     */
    public function registerElementClass($elementName, $class)
    {
        switch (true) {
            case $class === null:
                unset($this->_elementClasses[$elementName]);
                break;

            // $class is a class name (string), so instanceof would always fail here
            case !is_a($class, 'DOMElement', true):
                throw new Exception($class.' must be a subclass of DOMElement');

            default:
                $this->_elementClasses[$elementName] = $class;
        }
    }

    /**
     * Gets the class for a given element name.
     * 
     * @param string $elementName
     * @return string
     */
    public function getClassForElement($elementName)
    {
        // isset() avoids an undefined-index notice for unregistered element names
        return isset($this->_elementClasses[$elementName])
            ? $this->_elementClasses[$elementName]
            : $this->_defaultElementClass;
    }

    /**
     * Parse Xml Data from string.
     * 
     * @see XMLReader::XML()
     * 
     * @param string $source String containing the XML to be parsed.
     * @param string $encoding The document encoding or NULL.
     * @param int $options A bitmask of the LIBXML_* constants.
     * @return DOMDocument The created DOM tree.
     */
    public function parseString($source, $encoding = null, $options = 0)
    {
        $this->_reader = new XMLReader();
        $this->_reader->XML($source, $encoding, $options);
        return $this->_parse();
    }

    /**
     * Parse Xml Data from file.
     * 
     * @see XMLReader::open()
     * 
     * @param string $uri URI pointing to the document.
     * @param string $encoding The document encoding or NULL.
     * @param int $options A bitmask of the LIBXML_* constants.
     * @return DOMDocument The created DOM tree.
     */
    public function parseFile($uri, $encoding = null, $options = 0)
    {
        $this->_reader = new XMLReader();
        $this->_reader->open($uri, $encoding, $options);
        return $this->_parse();
    }

    /**
     * The parser.
     * 
     * @return DOMDocument The created DOM tree.
     */
    protected function _parse()
    {
        $this->_document = new DOMDocument('1.0', 'utf-8');
        $this->_document->_elements = array(); // keep references to elements
        $this->_currentElement = $this->_document;
        while ($this->_reader->read()) {
            switch ($this->_reader->nodeType) {
                case XMLReader::ELEMENT:
                    // only descend into elements that can have children
                    if ($this->_reader->isEmptyElement) {
                        $this->_addElement();
                    } else {
                        $this->_currentElement = $this->_addElement();
                    }
                    break;

                case XMLReader::END_ELEMENT:
                    $this->_currentElement = $this->_currentElement->parentNode;
                    break;

                case XMLReader::CDATA:
                    $this->_currentElement->appendChild(
                        $this->_document->createCDATASection($this->_reader->value)
                    );
                    break;

                case XMLReader::TEXT:
                case XMLReader::SIGNIFICANT_WHITESPACE:
                    $this->_currentElement->appendChild(
                        $this->_document->createTextNode($this->_reader->value)
                    );
                    break;

                case XMLReader::COMMENT:
                    $this->_currentElement->appendChild(
                        $this->_document->createComment($this->_reader->value)
                    );
                    break;
            }
        }
        $this->_reader->close();
        return $this->_document;
    }

    /**
     * Adds the current element into the DOM tree.
     * 
     * @return DOMElement The added element.
     */
    protected function _addElement()
    {
        $element = $this->_createElement();

        // It's important to keep a reference to each element.
        // An element object without any reference is destroyed by the
        // garbage collector, and the node then loses its custom type.
        $this->_document->_elements[] = $element;

        $this->_currentElement->appendChild($element);
        $this->_addAttributes($element);
        return $element;
    }

    /**
     * Creates a new element.
     * 
     * @return DOMElement The created element.
     */
    protected function _createElement()
    {
        $class = $this->getClassForElement($this->_reader->localName);
        return new $class(
            $this->_reader->name,
            $this->_reader->value,
            $this->_reader->namespaceURI
        );
    }

    /**
     * Adds the current attributes to an $element.
     * 
     * @param DOMElement $element
     * @return void
     */
    protected function _addAttributes(DOMElement $element)
    {
        while ($this->_reader->moveToNextAttribute()) {
            $this->_reader->prefix && ($uri = $this->_reader->lookupNamespace($this->_reader->prefix))
                ? $element->setAttributeNS($uri, $this->_reader->name, $this->_reader->value)
                : $element->setAttribute($this->_reader->name, $this->_reader->value);
        }
    }
}
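
The class above is posted without a usage example, so here is a condensed, self-contained sketch of the same idea: read the XML with XMLReader, look up a class per element name, and keep a reference to every created element. The names DogElement, CatElement, buildTypedTree() and $keep are made up for this illustration and are not part of AdvancedParser. The sketch also skips text nodes, comments and namespaces, which the full class handles.

```php
<?php
// Hypothetical per-tag element classes (illustration only).
class DogElement extends DOMElement {
    public function speak() { return $this->getAttribute('name') . ' barks'; }
}
class CatElement extends DOMElement {
    public function speak() { return $this->getAttribute('name') . ' meows'; }
}

// Builds a DOM tree from $xml, choosing the element class from $classMap.
// $keep collects every element so the PHP objects (and their types) stay alive.
function buildTypedTree(string $xml, array $classMap, array &$keep): DOMDocument
{
    $reader = new XMLReader();
    $reader->XML($xml);
    $doc = new DOMDocument('1.0', 'utf-8');
    $current = $doc;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            $class = $classMap[$reader->localName] ?? 'DOMElement';
            $element = new $class($reader->name);
            $keep[] = $element;
            $current->appendChild($element);
            // read this flag before moving the reader onto the attributes
            $isEmpty = $reader->isEmptyElement;
            while ($reader->moveToNextAttribute()) {
                $element->setAttribute($reader->name, $reader->value);
            }
            if (!$isEmpty) {
                $current = $element;
            }
        } elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
            $current = $current->parentNode;
        }
    }
    $reader->close();
    return $doc;
}

$keep = array();
$doc = buildTypedTree(
    '<animals><dog name="fido"/><cat name="scratchy"/></animals>',
    array('dog' => 'DogElement', 'cat' => 'CatElement'),
    $keep
);

$lines = array();
foreach ($doc->documentElement->childNodes as $node) {
    if (method_exists($node, 'speak')) {
        $lines[] = $node->speak();
    }
}
echo implode("\n", $lines), "\n";
```

With the real class, the equivalent would presumably be a registerElementClass() call per tag name followed by parseString() or parseFile().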


I can't really pinpoint it, but what you're trying has a certain smell, as Gordon already pointed out.
Anyway... you could use __call() to expose different methods on your extDOMElement object depending on the actual node (type/contents/...). For that purpose, your extDOMElement object could store a helper object that is instantiated according to the "type" of the element, and then delegate method calls to that helper. Personally I don't like this too much, as it doesn't exactly make documentation, testing and debugging any easier. If that sounds feasible to you, I can write down a self-contained example.


This certainly needs comments/documentation ...work in progress since I don't have the time right now...

<?php
$doc = new MyDOMDocument('1.0', 'iso-8859-1');
$doc->loadxml('<animals>
  <Foo name="fido" />
  <Bar name="lucky" />
  <Foo name="scratchy" />
  <Ham name="flicka" />
  <Egg name="donald" />
</animals>');
$xpath = new DOMXPath($doc);
foreach( $xpath->query('//Foo') as $e ) {
  echo $e->name(), ': ', $e->foo(), "\n";
}
echo "----\n";
foreach( $xpath->query('//Bar') as $e ) {
  echo $e->name(), ': ', $e->bar(), "\n";
}
echo "====\n";
echo $doc->savexml();


class MyDOMElement extends DOMElement {
  protected $helper;

  public function getHelper() {
    // lazy loading and caching the helper object
    // since lookup/instantiation can be costly
    if ( is_null($this->helper) ) {
      $this->helper = $this->resolveHelper();
    }
    return $this->helper;
  }

  public function isType($t) {
    return $this->getHelper() instanceof $t;
  }

  public function __call($name, $args) {
    // forward unknown method calls to the helper for this element
    $helper = $this->getHelper();
    if ( !method_exists($helper, $name) ) {
      throw new BadMethodCallException(get_class($helper) . ' has no method ' . $name . '()');
    }
    return call_user_func_array( array($helper, $name), $args);
  }

  public function releaseHelper() {
    // you might want to consider something like this
    // to help php with the circular references
    // ...or maybe not, haven't tested the impact circular references have on php's gc
    $this->helper = null;
  }

  protected function resolveHelper() {
    // this is hardcoded only for brevity's sake
    // add any kind of lookup/factory/... you like
    switch( $this->tagName ) {
      case 'Foo':
      case 'Bar':
        $cn = "DOMHelper".$this->tagName;
        return new $cn($this);
      default:
        return new DOMHelper($this);
    }
  }
}

class MyDOMDocument extends DOMDocument {
  public function __construct($version=null,$encoding=null) {
    parent::__construct($version,$encoding);
    $this->registerNodeClass('DOMElement', 'MyDOMElement');
  }
}

class DOMHelper {
  protected $node;
  public function __construct(DOMNode $node) {
    $this->node = $node;
  }
  public function name() { return $this->node->getAttribute("name"); }
}

class DOMHelperFoo extends DOMHelper {
  public function foo() {
    echo 'foo';
    $this->node->appendChild(  $this->node->ownerDocument->createElement('action', 'something'));
  }
}

class DOMHelperBar extends DOMHelper {
  public function bar() {
    echo 'bar';
    $this->node->setAttribute('done', '1');
  }
}

prints

fido: foo
scratchy: foo
----
lucky: bar
====
<?xml version="1.0"?>
<animals>
  <Foo name="fido"><action>something</action></Foo>
  <Bar name="lucky" done="1"/>
  <Foo name="scratchy"><action>something</action></Foo>
  <Ham name="flicka"/>
  <Egg name="donald"/>
</animals>


EDIT: For the code you show, wouldn't it be easier not to extend DOMElement at all? Just pass the regular DOMElements to your processing Strategies, e.g.

class AnimalProcessor
{
    public function processAnimals(DOMDocument $dom) {
        foreach($dom->documentElement->childNodes as $animal) {
            // skip whitespace text nodes and comments between the elements
            if (!$animal instanceof DOMElement) {
                continue;
            }
            $strategy = $animal->tagName . 'Strategy';
            $strategy = new $strategy($animal);
            $strategy->process();
        }
    }
}

$dom = new DOMDocument;
$dom->load('animals.xml');
$processor = new AnimalProcessor;
$processor->processAnimals($dom);
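
The strategy classes themselves are not shown above, so here is a minimal sketch of what they might look like, assuming a "<TagName>Strategy" naming convention. AnimalStrategy, DogStrategy, CatStrategy and HorseStrategy are hypothetical names chosen for this example, and the XML is inlined to keep it self-contained.

```php
<?php
// Hypothetical base strategy: shared behaviour lives here.
abstract class AnimalStrategy
{
    protected $animal;

    public function __construct(DOMElement $animal)
    {
        $this->animal = $animal;
    }

    // Builds the message and marks the element as processed.
    public function process()
    {
        $line = $this->animal->getAttribute('name')
            . ' the ' . $this->animal->tagName
            . ' ' . $this->sound();
        $this->animal->setAttribute('processed', 'true');
        return $line;
    }

    abstract protected function sound();
}

// One small class per element name holds the element-specific part.
class DogStrategy extends AnimalStrategy
{
    protected function sound() { return 'barks!'; }
}
class CatStrategy extends AnimalStrategy
{
    protected function sound() { return 'meows!'; }
}
class HorseStrategy extends AnimalStrategy
{
    protected function sound() { return 'whinnies!'; }
}

$dom = new DOMDocument;
$dom->loadXML('<animals><dog name="fido"/><cat name="scratchy"/><horse name="flicka"/></animals>');

$lines = array();
foreach ($dom->documentElement->childNodes as $animal) {
    if (!$animal instanceof DOMElement) {
        continue; // ignore text nodes and comments
    }
    $class = ucfirst($animal->tagName) . 'Strategy';
    if (!class_exists($class)) {
        continue; // no strategy registered for this tag name
    }
    $strategy = new $class($animal);
    $lines[] = $strategy->process();
}
echo implode("\n", $lines), "\n";
```

The elements stay plain DOMElement instances, so nothing depends on registerNodeClass or on keeping PHP references to nodes alive.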

Original answer before question update

Not sure if this is what you are looking for, but if you want specialized DOMElements, you can simply create them and use them directly, i.e. bypass createElement, so you don't have to use registerNodeClass at all.

class DogElement extends DOMElement
{
    public function __construct($value) 
    {
        parent::__construct('dog', $value);
    }
}
class CatElement extends DOMElement
{
    public function __construct($value) 
    {
        parent::__construct('cat', $value);
    }
}
$dom = new DOMDocument;
$dom->loadXML('<animals/>');
$dom->documentElement->appendChild(new DogElement('Sparky'));
$dom->documentElement->appendChild(new CatElement('Tinky'));
echo $dom->saveXml();

I don't think you can easily use registerNodeClass to instantiate elements based on the tag name or influence parsing that much. But you can override DOMDocument's createElement method, e.g.

class MyDOMDocument extends DOMDocument
{
    public function createElement($nodeType, $value = NULL)
    {
        $nodeType = $nodeType . 'Element';
        return new $nodeType($value);
    }
}
$dom = new MyDOMDocument;
$dom->loadXML('<animals/>');
var_dump( $dom->createElement('Dog', 'Sparky') ); // DogElement
var_dump( $dom->createElement('Cat', 'Tinky') ); // CatElement
echo $dom->saveXml();
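
To make the limitation concrete, here is a small sketch showing that an overridden createElement() only affects explicit calls, while elements created by the parser keep the default class. SketchDocument is a made-up name for this example, and the DogElement constructor is a slightly adjusted variant of the one above (default value of '' instead of a required argument).

```php
<?php
class DogElement extends DOMElement
{
    public function __construct($value = '')
    {
        parent::__construct('Dog', $value);
    }
}

// Hypothetical document class, same idea as MyDOMDocument above.
class SketchDocument extends DOMDocument
{
    #[\ReturnTypeWillChange]
    public function createElement($name, $value = '')
    {
        $class = $name . 'Element';
        // fall back to the built-in behaviour when no specialized class exists
        return class_exists($class)
            ? new $class($value)
            : parent::createElement($name, $value);
    }
}

$dom = new SketchDocument;
$dom->loadXML('<animals><Dog name="fido"/></animals>');

$created = $dom->createElement('Dog', 'Sparky');  // goes through the override
$parsed  = $dom->documentElement->firstChild;     // created by the parser

echo get_class($created), "\n"; // DogElement
echo get_class($parsed), "\n";  // DOMElement
```

So this approach helps when you build a document yourself, but not when you load existing XML such as animals.xml.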