
Pretty print ("indentation-only") HTML documents in Java (no JTidy)

We're generating HTML files with Apache Velocity, a generic template engine. The generated HTML is rather ugly and not correctly indented.

In my case the HTML is stored in a String, and I want to transform it so that it looks pretty-printed.

I've already given JTidy a try, but it changes the HTML source code when I pipe the raw HTML through it. Sometimes it adds or removes HTML tags.

My question:

Is there a Java library or something else out there which (only!) pretty prints my HTML code without adding or removing tags in my HTML document? It should only do the indentation, so that it looks pretty-printed! Nothing more, nothing less. Any ideas? :-)

Code suggestions, hints or tips are also welcome.

Best regards


Maybe a little too late, but I found a solution to this with Jsoup.

You can get the "pretty" version of the HTML by using only the parser and, if needed, avoid the generation of extra HTML elements by passing a "custom parser" (here, the XML parser).

I got the answer from this Jsoup question

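Here it is:

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.parser.Parser;

    public static String formatHTML(String html) throws Exception {
        // The XML parser keeps the markup as given instead of wrapping it in html/head/body.
        Document doc = Jsoup.parse(html, "", Parser.xmlParser());
        return doc.toString();
    }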

I hope this helps.

Regards


Find any SAX parser example in Java: indent++ for opening tags, indent-- for closing tags, and write the content at the current indentation.
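A minimal sketch of that idea, using only the SAX parser bundled with the JDK. The class name SaxIndenter and its indent method are made up for illustration. It assumes the input is well-formed (XHTML-like) markup without a DOCTYPE, because a SAX parser rejects ordinary "tag soup" HTML. It also drops comments and re-serializes attributes and entities, so it is not strictly indentation-only.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: re-emits well-formed markup, one tag or text run per line.
public class SaxIndenter {

    public static String indent(String xhtml) {
        StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;                          // current nesting level
            private final StringBuilder text = new StringBuilder();

            private void line(String s) {
                for (int i = 0; i < depth; i++) out.append("  ");
                out.append(s).append('\n');
            }

            // SAX may deliver text in several chunks, so buffer it until the next tag.
            private void flushText() {
                String t = text.toString().trim();
                if (!t.isEmpty()) line(escape(t));
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                flushText();
                StringBuilder tag = new StringBuilder("<").append(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    tag.append(' ').append(atts.getQName(i)).append("=\"")
                       .append(escape(atts.getValue(i)).replace("\"", "&quot;")).append('"');
                }
                line(tag.append('>').toString());
                depth++;                                    // indent++ for opening tags
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                flushText();
                depth--;                                    // indent-- for closing tags
                line("</" + qName + ">");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed markup", e);
        }
        return out.toString();
    }

    // The parser hands over decoded text, so re-escape the characters that matter.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;");
    }
}
```

Calling SaxIndenter.indent("<div><p>Hi</p></div>") puts each tag and the text on its own line, indented by two spaces per nesting level.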


Why don't you write a simple Java parser to pretty print HTML yourself? Here is a sketch:

  1. Track opening and closing tags, and
  2. have a counter to figure out the current indentation level.
  3. Perhaps use a stack to push and pop the indentation level.
  4. Just iterate through the HTML string and push the current indentation level onto the stack when you see a tag.
  5. If you see a nested tag, increment the indentation level and keep going.
  6. When you see a closing tag, pop the stack to go back to the previous indentation level.

I wanted to give you a rough idea here; you can use this as a starting point. I have written many Perl-based pretty printers. You could use Perl to script a parser fairly quickly.
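The steps above could be sketched roughly as follows in plain Java. The class name HtmlIndenter, the regular expression and the list of void elements are illustrative assumptions, not part of any library. A single depth counter stands in for the stack, since every level indents by the same amount. The sketch never adds or removes tags, but it does change whitespace between tags, so content inside pre, script or textarea would need special handling, and unclosed tags such as a bare <p> will make the depth drift.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: indentation-only pretty printer based on a depth counter.
public class HtmlIndenter {

    // Elements without a closing tag; they must not increase the depth.
    private static final Set<String> VOID = Set.of("area", "base", "br", "col", "embed",
            "hr", "img", "input", "link", "meta", "source", "track", "wbr");

    // A token is a comment, a tag or doctype, a run of text, or a stray '<'.
    private static final Pattern TOKEN =
            Pattern.compile("<!--.*?-->|<[^>]*>|[^<]+|<", Pattern.DOTALL);

    public static String indent(String html) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            String tok = m.group().trim();
            if (tok.isEmpty()) continue;                 // whitespace between tags
            if (tok.startsWith("</")) {
                depth = Math.max(0, depth - 1);          // closing tag: go back one level
                append(out, depth, tok);
            } else if (isOpeningTag(tok)) {
                append(out, depth, tok);
                if (!tok.endsWith("/>") && !VOID.contains(tagName(tok))) depth++;
            } else {
                append(out, depth, tok);                 // text, comment, doctype
            }
        }
        return out.toString();
    }

    private static boolean isOpeningTag(String tok) {
        return tok.length() > 2 && tok.startsWith("<") && tok.endsWith(">")
                && !tok.startsWith("<!") && !tok.startsWith("<?");
    }

    // Lower-cased element name, e.g. "br" for "<BR class='x'>".
    private static String tagName(String tag) {
        int end = 1;
        while (end < tag.length() && Character.isLetterOrDigit(tag.charAt(end))) end++;
        return tag.substring(1, end).toLowerCase();
    }

    private static void append(StringBuilder out, int depth, String tok) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(tok).append('\n');
    }
}
```

For example, HtmlIndenter.indent("<ul><li>a<br>b</li></ul>") returns the same tags and text, one per line, with the contents of ul and li indented by two and four spaces.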
