Remove certain JavaScript from HTML

I want to remove the following JavaScript from an HTML file.

<script type="text/javascript">
alert('hello');

})();

</script>

and

<script type="text/javascript">
alert('hello');
} catch(err) {}</script>

According to http://www.cyberciti.biz/faq/sed-howto-remove-lines-paragraphs/ I can use:

sed '/<script type="text\/javascript"/,/<\/script>/d'

but that removes every JavaScript block.

My specific requirement is to remove only two kinds of JavaScript blocks: one ending with })(); (new line)</script> and the other ending with } catch(err) {}</script>

I want to use sed; if that is not possible, then any program similar to sed that I can run from a script.

Thank you for taking the time.


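Use awk or a programming language of your choice. The command below uses RT, which is a GNU awk extension, so it needs gawk:

gawk -v RS='</script>' -v ORS='' '{ t = RT } /}\)\(\);[[:space:]]*$|} catch\(err\) [{]}$/ { sub(/<script.*/, ""); t = "" } { print $0 t }' file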
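
For the "programming language of your choice" route, the same idea might be sketched in Python roughly as follows. This is only a sketch under some assumptions: the function name strip_unwanted_scripts and the sample input are made up for illustration, the regular expression treats everything up to the first </script> as one block, and it is not a real HTML parser.

```python
import re

# One <script ...>...</script> block; the lazy body stops at the first closing tag.
SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>(.*?)</script>", re.DOTALL | re.IGNORECASE)

# The two endings from the question: "})();" followed only by whitespace,
# or "} catch(err) {}" directly before the closing tag.
UNWANTED_ENDING = re.compile(r"(\}\)\(\);\s*|\} catch\(err\) \{\})\Z")


def strip_unwanted_scripts(html):
    """Drop script blocks whose body has an unwanted ending; keep all others."""
    def replace(match):
        body = match.group(1)
        return "" if UNWANTED_ENDING.search(body) else match.group(0)
    return SCRIPT_BLOCK.sub(replace, html)


# Hypothetical sample input: two blocks to remove, one to keep.
sample = """<p>intro</p>
<script type="text/javascript">
alert('hello');

})();

</script>
<script type="text/javascript">
alert('hello');
} catch(err) {}</script>
<script type="text/javascript">
alert('keep me');
</script>
"""

print(strip_unwanted_scripts(sample))
```

Saved to a file, it could be adapted to read the HTML from a path given on the command line, which would make it usable from a shell script in the same way as sed or awk.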


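sed '/text\/javascript/{:a;N;/<\/script>/!ba;s/.*})();\n\n<\/script>\|.*} catch(err) {}<\/script>//}'

It removes every JavaScript block that ends with })(); (new line)</script> or } catch(err) {}</script>, and leaves other script blocks alone. Note that \| (alternation) and \n inside the pattern are GNU sed extensions.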

A brief explanation:

  • /text\/javascript/: the block begins at a line that contains text/javascript
  • :a: define a label named a
  • N: append the next input line to the pattern space
  • /<\/script>/!ba: if the pattern space does not yet contain the closing tag, branch back to label a
  • s/pattern//: delete the matched text from the pattern space if the block ends in one of the two ways
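
To make the control flow of that loop easier to follow, here is a rough Python imitation of it. It is an illustration rather than an exact port, and it makes some simplifying assumptions: the function name sed_like_filter is invented, each script block spans several lines and closes at the end of a line, and a removed block is dropped entirely instead of leaving an empty line behind as sed would.

```python
def sed_like_filter(lines):
    """Buffer each script block line by line, then decide whether to drop it."""
    output = []
    buffer = None  # plays the role of sed's pattern space while inside a block
    for line in lines:
        if buffer is None:
            # Address match: a line containing text/javascript starts a block.
            if "text/javascript" in line and "</script>" not in line:
                buffer = [line]
            else:
                output.append(line)
            continue
        buffer.append(line)  # like N: append the next line
        if "</script>" not in line:
            continue  # like /<\/script>/!ba: closing tag not seen, keep looping
        block = "\n".join(buffer)
        unwanted = block.endswith("})();\n\n</script>") or block.endswith(
            "} catch(err) {}</script>"
        )
        if not unwanted:
            output.extend(buffer)  # other script blocks pass through unchanged
        buffer = None
    if buffer is not None:
        output.extend(buffer)  # unterminated block: emit it as-is
    return output


# Hypothetical sample input: the first block is removed, the second is kept.
lines = [
    '<script type="text/javascript">',
    "alert('hello');",
    "",
    "})();",
    "",
    "</script>",
    '<script type="text/javascript">',
    "alert('keep me');",
    "</script>",
]
print("\n".join(sed_like_filter(lines)))
```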