开发者

batch html file editing

I have a collection of one thousand HTML files and need to somewhat trim them. I need to delete all the tags inside <body></body> area of those except for one, <div.pg>, to make them clean to be printed. the excess are nav开发者_StackOverflowigation links which make the prints messy and make the pages occupy more paper. the contents are not the same so I can't find and replace the code excerpt but the tags are the same foe example there are 3 <table> tags to be deleted each with specific class. manipulate specific tags inside batch HTML files?

Any batch processing technique or software to do this job? What an easy solution on windows?


I would use an xslt transform on each html page you have. Batch is not the tool to manipulate html files. You can use batch as a "manager" to pass the required file to the xsl transform. Also windows have a rudimentary msxml utility which you can download and install to your machine : http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=21714

That's how I would do it. I am sure there are more options.


If it is XHTML you could use XSLT to transform your HTML to "another" format. Look for example here: http://www.w3schools.com/xsl/ or here: http://help.hannonhill.com/discussions/how-do-i/269-strip-specific-html-tag-in-xslt

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜