How to remove attributes from HTML tags in Java
I want to remove particular attributes from an anchor tag:
<a id="nav-askquestion" style="cursor:default" href="/questions">
Output:
<a href="/questions">
using a Java program.
We use htmlparser for this kind of job.
You can parse and modify nodes with this untested snippet:
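import org.htmlparser.Parser;
import org.htmlparser.Tag;
import org.htmlparser.visitors.NodeVisitor;

// Visitor that strips the unwanted attributes from every tag it visits.
NodeVisitor visitor = new NodeVisitor() {
    @Override
    public void visitTag(Tag tag) {
        tag.removeAttribute("id");
        tag.removeAttribute("style");
    }
};
Parser parser = new Parser(...);
parser.visitAllNodesWith(visitor); // may throw ParserException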
This little snippet will do the trick.
Ask me if you have any questions about the regex.
public class Test {
    public static void main(String[] args) {
        String htmlFragment = "<a id=\"nav-askquestion\" style=\"cursor:default\" href=\"/questions\">";
        // Attribute names to strip, joined as a regex alternation.
        String attributesToRemove = "id|style";
        System.out.println(htmlFragment);
        System.out.println(cleanHtmlFragment(htmlFragment, attributesToRemove));
    }

    // Removes every name="value" pair whose name is listed in attributesToRemove.
    // Only double-quoted attribute values are matched.
    private static String cleanHtmlFragment(String htmlFragment, String attributesToRemove) {
        return htmlFragment.replaceAll("\\s+(?:" + attributesToRemove + ")\\s*=\\s*\"[^\"]*\"", "");
    }
}
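The regex above only matches double-quoted values and lowercase attribute names. If your input may also contain single-quoted or unquoted values, or names like ID, a slightly wider pattern could look like the sketch below. The class name AttributeStripper and the method stripAttributes are made up for illustration, and this is still a regex, so it will not cope with every piece of HTML out there.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the snippet above.
public class AttributeStripper {

    // Removes the named attributes whether the value is double-quoted,
    // single-quoted, or unquoted; attribute names match case-insensitively.
    static String stripAttributes(String html, String... names) {
        // Quote each name so it is treated as a literal inside the pattern.
        String alternation = Arrays.stream(names)
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern pattern = Pattern.compile(
                "\\s+(?:" + alternation + ")\\s*=\\s*(?:\"[^\"]*\"|'[^']*'|[^\\s>]+)",
                Pattern.CASE_INSENSITIVE);
        return pattern.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<a ID='nav-askquestion' style=\"cursor:default\" href=\"/questions\">";
        // prints <a href="/questions">
        System.out.println(stripAttributes(html, "id", "style"));
    }
}
```

Passing the names as separate arguments instead of one "id|style" string means a name containing a regex metacharacter cannot accidentally change the pattern.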
People might suggest using a regex, but beware: regexes are fragile on markup. You can use an XML parser instead.
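Here is a rough sketch of the XML parser route using only the JDK's built-in DOM and Transformer classes. It assumes the markup is well-formed XML (XHTML), so the anchor from the question would need its closing </a> tag; plain tag-soup HTML will make the parser throw. The class name RemoveAttributes and the method stripAttributes are made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; assumes the input is well-formed XML.
public class RemoveAttributes {

    static String stripAttributes(String xml, String... names) throws Exception {
        // Parse the markup into a DOM tree.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // "*" selects every element in the document, including the root.
        NodeList elements = doc.getElementsByTagName("*");
        for (int i = 0; i < elements.getLength(); i++) {
            Element element = (Element) elements.item(i);
            for (String name : names) {
                element.removeAttribute(name); // no-op if the attribute is absent
            }
        }

        // Serialize the tree back to a string without the <?xml ...?> header.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a id=\"nav-askquestion\" style=\"cursor:default\" href=\"/questions\">Ask</a>";
        System.out.println(stripAttributes(xml, "id", "style"));
    }
}
```

The upside compared with a regex is that quoting style, whitespace and attribute order no longer matter, because the parser has already dealt with them by the time you touch the attributes.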