
removing <br/> from GET request

I'm using a GET request to fetch some page data, but I need to strip the break tags from the finished file. Basically, I'm taking the output of the GET request and saving it to a file, but it has hundreds of break tags in it that I need removed. I'm fine with running a batch or VB script after the file is saved to remove the tags, but I'm not sure how to do that either. So far the only solutions I have seen remove entire lines.

EDIT: This will be deployed to multiple Windows servers, so I would like to keep the requirements as minimal as possible, i.e. commands/software that Windows has by default.


If you're au fait with Python, you could use Beautiful Soup to remove <br /> elements in a fairly robust manner. See here for how to remove elements from the tree.


Unless I have misunderstood, you could replace the break tags using the Replace function in VBScript (assumed from the tag). For example:

cleanedText = Replace(rawText, "<br/>", "")

More information on usage can be found here:

http://www.w3schools.com/Vbscript/func_replace.asp

It is worth mentioning, though, that the function matches its search string verbatim, so you might have to run it a few times to catch all the common ways the tag is written:

cleanedText = Replace(rawText, "<br/>", "")        ' no space
cleanedText = Replace(cleanedText, "<br />", "")   ' with a space
cleanedText = Replace(cleanedText, "<br>", "")     ' unterminated
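
If you would rather not chain several Replace calls, a single regular expression can cover all three spellings (plus upper-case variants) in one pass. The following is only a sketch of a complete post-processing script, not a tested drop-in: the file path is a hypothetical placeholder, and it assumes the saved response is a non-empty ANSI/ASCII text file that fits in memory. It uses only the RegExp and Scripting.FileSystemObject objects that ship with Windows Script Host.

```vbscript
Option Explicit

' Hypothetical path: point this at wherever the GET response was saved.
Const FILE_PATH = "C:\temp\page.txt"
Const ForReading = 1, ForWriting = 2

Dim fso, inFile, outFile, re, rawText, cleanedText

Set fso = CreateObject("Scripting.FileSystemObject")

' Read the whole saved response into memory.
' (ReadAll raises an error on an empty file.)
Set inFile = fso.OpenTextFile(FILE_PATH, ForReading)
rawText = inFile.ReadAll
inFile.Close

' One pattern matches <br>, <br/> and <br />.
Set re = New RegExp
re.Pattern = "<br\s*/?>"
re.IgnoreCase = True      ' also match <BR>, <Br />, and so on
re.Global = True          ' replace every match, not just the first
cleanedText = re.Replace(rawText, "")

' Overwrite the original file with the cleaned text.
Set outFile = fso.OpenTextFile(FILE_PATH, ForWriting, True)
outFile.Write cleanedText
outFile.Close
```

Saved as, say, strip_br.vbs (the name is arbitrary), it can be run from a batch file with `cscript //nologo strip_br.vbs`. If the file was saved as UTF-16, both OpenTextFile calls would need the format argument -1 (TristateTrue) as the last parameter. The pattern deliberately ignores break tags that carry attributes, such as `<br class="x">`; widening it to handle those is possible, but at that point a real HTML parser is the safer tool.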
