How to remove line breaks from a file?

2023-02-25 15:56

How to remove:

<p> (break line!!!)
text...
</p> (break line!!!)

from a file with regex?

I tried:

find . -type f -exec perl -p -i -e "s/SEARCH_REGEX/REPLACEMENT/g" {} \;

This stuff can really blow up in your face so be careful; try it with test data in a test dir etc.

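The -0 switch sets the input record separator ($/) to the null character, so a file that contains no NUL bytes is read as a single record and the regex can span multiple lines. The /s modifier lets . match newlines, and the +? makes the match lazy, so it consumes as little as possible up to "TERRANO". Try this test on one of your files.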

perl -0 -p -e 's/<p>.+?TERRANO[^<]*<\/p>//gs'

If that works, you can add it to your original.

find . -type f -exec perl -0 -pi -e "s/<p>.+?TERRANO[^<]*<\/p>//gs" {} \;

As mentioned in a comment, if the content is HTML, you should probably be using an HTML parser.

Several ways to do it.

First is to undef $/ (the input record separator), so the whole file is read as one string. Then you match something like

/\<p\>\nTERRANO.*\n\<\/p\>/

which may depend upon whether you are using CR/LFs or just LFs.

Second is to use a loop to concatenate the lines (plus whatever is in $/) and match that in one regex, including matching whatever is in $/.

Third would be to use File::Slurp.

Fourth is to use several regexes and a loop to match each line, and if all three are satisfied, do your substitution.

You may also use the Unix text editor ed to remove a range of lines with regex:

str='
BEFORE MULTILINE PATTERN 1
<p> (break line!!!)
text...
</p> (break line!!!)
AFTER MULTILINE PATTERN 1
BEFORE MULTILINE PATTERN 2 
<p> (break line!!!)
text...
</p> (break line!!!)
AFTER MULTILINE PATTERN 2
'

# for in-place file editing use "ed -s file" and replace ",p" with "w"
# cf. http://wiki.bash-hackers.org/howto/edit-ed

cat <<-'EOF' | sed -e 's/^ *//' -e 's/ *$//' -e '/^ *#/d' | ed -s <(echo "$str")
  H
  # only remove the first match
  #/<p>/,/<\/p>/d
  # remove all matches
  g/<p>/+0,/<\/p>/+0d
  ,p
  q
EOF

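You may also want the /m (multi-line) modifier, which makes ^ and $ match at the start and end of every line. On its own it does not let . cross newlines (that is what /s is for):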

s/regexp/replacement/m

See perldoc perlre.
