Parse HTML without xpath
I'm trying to create a simple tool to parse html files.
Specifically, I need it to get all the name attributes out of all the div tags.
My HTML string varies and I don't have any control over it, so when I try to use XPath I tend to get errors because the HTML is not always well-formed.
Any ideas?
Thanks,
There is also a great class called PHP Simple HTML DOM Parser, available at http://simplehtmldom.sourceforge.net/
It works fine with invalid HTML, but it needs a lot of memory when parsing long HTML files.
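For the question's case (the name attribute of every div), a minimal sketch with that library could look like the following. It assumes simple_html_dom.php has been downloaded from the project page into the same directory as the script. The helper divNames() and the sample markup are made up for illustration; str_get_html(), find(), getAttribute() and clear() are the library's own calls.

```php
<?php
// Assumption: simple_html_dom.php from the project page sits next to this script.
require_once __DIR__ . '/simple_html_dom.php';

/**
 * Hypothetical helper (not part of the library): collect the name
 * attribute of every div that has one.
 */
function divNames(string $markup): array
{
    $dom = str_get_html($markup);
    if ($dom === false) {
        // str_get_html() returns false for empty or oversized input.
        return [];
    }

    $names = [];
    // 'div[name]' matches only div elements that carry a name attribute.
    foreach ($dom->find('div[name]') as $div) {
        $names[] = $div->getAttribute('name');
    }

    // Release the parsed tree; this matters for the memory use on long files.
    $dom->clear();

    return $names;
}

// Deliberately sloppy markup: an unquoted attribute and unclosed tags.
$markup = '<div name="header"><p>Title<div name=body>Text<div>no name</div>';
print_r(divNames($markup));
```

Calling clear() once you are done with each document is the usual way to keep memory in check when processing many or long files in one run.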