Remove certain JavaScript blocks from an HTML file
I want to remove the following JavaScript blocks from an HTML file.
<script type="text/javascript">
alert('hello');
})();
</script>
and
<script type="text/javascript">
alert('hello');
} catch(err) {}</script>
From http://www.cyberciti.biz/faq/sed-howto-remove-lines-paragraphs/ I learned that I can use:
sed '/<script type="text\/javascript"/,/<\/script>/d'
but that removes every JavaScript block.
I only want to remove two kinds of block: one ending with })(); (new line)</script>
and one ending with } catch(err) {}</script>
I would prefer sed; if that is not possible, any similar program that I can run from a script will do.
Thank you for your time.
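Use awk (GNU awk in this case, since it relies on a multi-character RS and on RT),
or a programming language of your choice.
Each record ends at a closing </script> tag; a matching block is cut from its opening tag onward, and everything else is printed together with its terminator:
gawk -v RS='</script>' '/<script/ && /([}]\)\(\);|catch\(err\) [{][}])[[:space:]]*$/ { sub(/<script.*/, ""); printf "%s", $0; next } { printf "%s%s", $0, RT }' file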
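If GNU awk is not available, a line-buffering variant can do the same job without a multi-character RS. The following is only a sketch under a few assumptions: the function name strip_scripts_awk and the sample HTML are made up for illustration, each opening <script tag and its closing </script> start on separate lines, and the two endings look exactly as described in the question.

```shell
#!/bin/sh
# Hypothetical helper: buffer every <script>...</script> block line by line,
# then print it only if it does NOT end in one of the two unwanted ways.
# Uses plain awk features only (no multi-character RS, no RT).
strip_scripts_awk() {
  awk '
    # Start buffering at a line that opens a script block.
    !inblock && /<script/ { inblock = 1; buf = "" }
    inblock {
      buf = buf $0 "\n"
      if (/<\/script>/) {
        inblock = 0
        # Drop blocks ending with "})();" + newline + closing tag,
        # or with "} catch(err) {}" directly before the closing tag.
        if (buf !~ /([}]\)\(\);\n|[}] catch\(err\) [{][}])<\/script>/)
          printf "%s", buf
      }
      next
    }
    # Everything outside a script block is passed through unchanged.
    { print }
  ' "$@"
}

# Sample run: two blocks to remove, one block to keep.
printf '%s\n' \
  '<p>a</p>' \
  '<script type="text/javascript">' "alert('hello');" '})();' '</script>' \
  '<script type="text/javascript">' 'try {' "alert('hello');" '} catch(err) {}</script>' \
  '<script type="text/javascript">' 'keep();' '</script>' \
  '<p>b</p>' | strip_scripts_awk
```

With this sample, only the two paragraphs and the keep(); block should survive. Buffering whole blocks costs a little memory, but it lets the decision be made on the block's ending rather than on a single line.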
With GNU sed (needed for the \| alternation and the one-line label syntax):
sed '/text\/javascript/{:a;N;/<\/script>/!ba;s/.*})();\n<\/script>\|.*} catch(err) {}<\/script>//}' file
It removes every JavaScript block that ends with })(); (new line)</script> or with } catch(err) {}</script>.
A little explanation:
- /text\/javascript/: the block begins at a tag that contains text/javascript
- :a: create a label named a
- N: append the next line to the pattern space
- /<\/script>/!ba: if the pattern space does not contain the closing tag yet, branch back to label a
- s/pattern//: once the whole block is in the pattern space, delete it if it ends in one of the two ways
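To see the loop at work, here is a small sketch that wraps it in a shell function and runs it on sample input. The function name strip_scripts_sed and the sample HTML are invented for illustration. It also differs from the one-liner in two ways: it uses two d commands instead of s/pattern//, so no empty line is left behind, and it assumes the opening and closing tags sit on different lines.

```shell
#!/bin/sh
# Hypothetical helper built on the label/N/branch loop described above.
strip_scripts_sed() {
  sed '/text\/javascript/{
    # Collect lines until the closing tag is in the pattern space.
    :a
    N
    /<\/script>/!ba
    # Delete the collected block if it ends in one of the two ways.
    /})();\n<\/script>$/d
    /} catch(err) {}<\/script>$/d
  }' "$@"
}

# Sample run: two blocks to remove, one block to keep.
printf '%s\n' \
  '<p>a</p>' \
  '<script type="text/javascript">' "alert('hello');" '})();' '</script>' \
  '<script type="text/javascript">' 'try {' "alert('hello');" '} catch(err) {}</script>' \
  '<script type="text/javascript">' 'keep();' '</script>' \
  '<p>b</p>' | strip_scripts_sed
```

With this sample, the two paragraphs and the keep(); block should remain. Splitting the alternation into two separate commands avoids \|, which is a GNU extension in basic regular expressions.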