Bash, perl regexp help
I have a text file (utf8):
http://d.pr/1d6T+
Please help me with regexp. I want to replace
<p>
TERRANO...
</p>
with: empty space. :)
And:
<td width="20%" align="left" class="thead">Rám:</td>
With:
<td width="20%" align="left" class="thead">Something else:</td>
Just word "Rám" is also OK to replace.
I found this line, but I don't know how to use it:
find . -type f -exec perl -p -i -e "s/SEARCH_REGEX/REPLACEMENT/g" {} \;
Assuming you want to replace text in HTML files:
cd /path/to/my/project
find . -iname '*.html' -exec perl -p -i -e "s/Rám:/Something else:/g" {} \;
find . -iname '*.html' -exec perl -p -i -e "s/TERRANO...//g" {} \;
If you do not mind converting your regular .txt files into .(x)html files, and you have HTML Tidy and xmlstarlet available, you can do this without regex!
tidy -v # HTML Tidy for Mac OS X released on 25 March 2009
xmlstarlet --version # 1.0.6
curl -L -o utf8file 'http://d.pr/1d6T+'
# convert HTML to XHTML with tidy
tidy -h
tidy -i -q -c -wrap 0 -numeric -asxml -utf8 --merge-divs yes --merge-spans yes utf8file > utf8file.xhtml
xmlstarlet el -a utf8file.xhtml
xmlstarlet el -v utf8file.xhtml
xmlstarlet edit --help
# edit file in-place
xmlstarlet edit -L -u "//*[local-name()='p']" -v 'EMPTY SPACE IS HERE' utf8file.xhtml
# remove <p> ... </p> completely
xmlstarlet edit -L -d "//*[local-name()='p']" utf8file.xhtml
xmlstarlet edit -L -u "//*[local-name()='td'][@width='20%' and @align='left' and @class='thead' and .='Rám:']" -v 'SOMETHING ELSE:' utf8file.xhtml
open -a Safari utf8file.xhtml
# convert XHTML to HTML with tidy
tidy -i -q -c -wrap 0 -numeric -ashtml -utf8 --merge-divs yes --merge-spans yes utf8file.xhtml > utf8file.html
open -a Safari utf8file.html
To extract just the table from utf8file.xhtml after the in-place editing steps you may use the "print copy of XPATH expression" feature of xmlstarlet:
xmlstarlet sel --help
# test
xmlstarlet sel -I -t -c "//*[local-name()='table'][@id='model-table-specifikacia']" utf8file.xhtml
xmlstarlet sel -I -t -c "//*[local-name()='table'][@id='model-table-specifikacia']" utf8file.xhtml > utf8file
Old topic, but useful: for mass search-and-replace, I tend to use a Perl "peewee" one-liner (the name is based on the arguments used: -p -i -w -e) rather than relying on find to execute Perl for each file.
That is, I use the following:
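# -0777 makes perl read each file as one string, so the \n in the pattern can match across lines
perl -0777 -pi -w -e 's/<p>\nTERRANO.+?\n<\/p>/<p>\n\n<\/p>/g;' ./*.html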
and
perl -pi -w -e 's/<td (.+?) class=\"thead\">Rám:<\/td>/<td $1 class="thead">Something else:<\/td>/g;' ./*.html
Hope that helps somebody!