Measuring HTML equivalence?
Does anyone know of a good Java library for measuring HTML equivalence?
For example, <td class="one two three" name="goat"> would be equivalent to <td name="goat" class="three two one">.
I would like to compare entire multi-line strings of HTML in this manner using Java.
Any suggestions?
UPDATE:
I tried XMLUnit's Diff.similar() and found that it reported these two as similar:
<html three="3" two="2" one="1"></html>
and <html one="one" two="two"></html>
This is undesired behavior... Are there any other options?
You could use an HTML parser such as NekoHTML or JTidy to turn each input into well-formed XML, and then use XMLUnit's Diff class to compare the two resulting documents.
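If XMLUnit's notion of "similar" turns out to be too loose for you, a hand-rolled comparison is not much code. Below is a minimal sketch using only the JDK's DOM parser, so it is not the XMLUnit or NekoHTML API. It assumes the markup is already well-formed XML (for example after a cleanup pass through one of the parsers above) and has no DOCTYPE. The class name HtmlEquivalence and its rules are my own assumptions: attribute order is ignored, the class attribute is compared as a set of tokens, all other attribute values must match exactly, and whitespace-only text and comments are skipped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library mentioned in this thread.
public class HtmlEquivalence {

    // Returns true if both well-formed markup strings describe the same tree.
    public static boolean equivalent(String a, String b) throws Exception {
        return sameElement(parse(a), parse(b));
    }

    private static Element parse(String markup) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(markup))).getDocumentElement();
    }

    private static boolean sameElement(Element x, Element y) {
        if (!x.getTagName().equals(y.getTagName())) return false;

        // Attributes: same count, same names, same values, in any order.
        NamedNodeMap ax = x.getAttributes();
        if (ax.getLength() != y.getAttributes().getLength()) return false;
        for (int i = 0; i < ax.getLength(); i++) {
            String name = ax.item(i).getNodeName();
            if (!y.hasAttribute(name)) return false;
            String vx = x.getAttribute(name);
            String vy = y.getAttribute(name);
            // "class" is a whitespace-separated token set; everything else is exact.
            boolean same = name.equals("class") ? tokens(vx).equals(tokens(vy)) : vx.equals(vy);
            if (!same) return false;
        }

        // Children: compared in document order, since element order matters in HTML.
        List<Node> cx = significantChildren(x);
        List<Node> cy = significantChildren(y);
        if (cx.size() != cy.size()) return false;
        for (int i = 0; i < cx.size(); i++) {
            Node nx = cx.get(i);
            Node ny = cy.get(i);
            if (nx instanceof Element && ny instanceof Element) {
                if (!sameElement((Element) nx, (Element) ny)) return false;
            } else if (nx instanceof Element || ny instanceof Element) {
                return false;
            } else if (!nx.getTextContent().trim().equals(ny.getTextContent().trim())) {
                return false;
            }
        }
        return true;
    }

    private static Set<String> tokens(String value) {
        return new HashSet<>(Arrays.asList(value.trim().split("\\s+")));
    }

    // Keeps elements and non-blank text; drops comments and whitespace-only text.
    private static List<Node> significantChildren(Element parent) {
        List<Node> kept = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            short type = n.getNodeType();
            boolean text = type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE;
            if (type == Node.ELEMENT_NODE || (text && !n.getTextContent().trim().isEmpty())) {
                kept.add(n);
            }
        }
        return kept;
    }
}
```

With this, HtmlEquivalence.equivalent(...) should accept the two td elements from the question (once each has a closing tag) and reject the two html elements from the update, because their attribute counts differ. Whether whitespace inside text should be trimmed like this depends on your use case, so treat that rule as a placeholder.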