Any HTML/CSS parsing library for Ruby & PHP?
I am about to finish my script that parses/scrapes a website using Mechanize and Ruby.
I need to port my script to PHP in the future.
My question is
- if there is any library available for both ruby and php or
- if anybody can recommend any other approach to this?
There's no direct PHP equivalent of Ruby's Mechanize.
However, Zend Framework offers some great scraping-related components, including
- Zend_Uri and Zend_Http_Client
- Zend_Dom
As standard, PHP comes with several tools for parsing XML (and the DOM extension can cope with a lot of badly formed HTML).
See
http://uk3.php.net/manual/en/refs.xml.php
C.
For DOM manipulation in PHP, use the DOMDocument class.
Simple and easy :)
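A minimal sketch of what that could look like, assuming the page body is already in a string. The markup in `$html` and the variable names are made up for illustration; only the built-in DOM and libxml functions are used.

```php
<?php
// Hypothetical sample markup (note the unclosed <p>); in a real scraper
// this would be the body of the fetched page.
$html = '<html><body><p class="intro">Hello<p>'
      . '<a href="/a">First</a> <a href="/b">Second</a></body></html>';

// Collect parser warnings internally instead of emitting them,
// since real-world HTML is rarely well formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Walk every <a> element and map its href attribute to its link text.
$links = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $links[$a->getAttribute('href')] = trim($a->textContent);
}

// DOMXPath allows more selective queries than getElementsByTagName.
$xpath = new DOMXPath($doc);
$intro = $xpath->query('//p[@class="intro"]')->item(0);

print_r($links);
echo $intro->textContent, "\n";
```

Fetching the page itself is a separate step (for example with Zend_Http_Client or cURL); DOMDocument only handles the parsing side, so it covers less than Mechanize does in Ruby.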
Another DOM manipulation tool for PHP is phpQuery.