How do I remove part of a line in a multi-line chunk using sed or Perl?

2023-02-07 12:50

I have some data that looks like this. It comes in chunks of four lines. Each chunk starts with an @ character.

@SRR037212.1 FC30L5TAA_102708:7:1:741:1355 length=27
AAAAAAAAAAAAAAAAAAAAAAAAAAA
+SRR037212.1 FC30L5TAA_102708:7:1:741:1355 length=27
::::::::::::::::::::::::;;8
@SRR037212.2 FC30L5TAA_102708:7:1:1045:1765 length=27
TATAACCAGAAAGTTACAAGTAAACAC
+SRR037212.2 FC30L5TAA_102708:7:1:1045:1765 length=27
888888888888888888888888888

On the third line of each chunk, I want to remove the text that comes after the + character, resulting in:

@SRR037212.1 FC30L5TAA_102708:7:1:741:1355 length=27
AAAAAAAAAAAAAAAAAAAAAAAAAAA
+
::::::::::::::::::::::::;;8
@SRR037212.2 FC30L5TAA_102708:7:1:1045:1765 length=27
TATAACCAGAAAGTTACAAGTAAACAC
+
888888888888888888888888888

Is there a compact way to do that in sed or Perl?

Assuming you don't want to blindly remove the rest of every line that starts with a +, you can do this:

sed '/^@/{N;N;s/\n+.*/\n+/}' infile

Output

$ sed '/^@/{N;N;s/\n+.*/\n+/}' infile
@SRR037212.1 FC30L5TAA_102708:7:1:741:1355 length=27
AAAAAAAAAAAAAAAAAAAAAAAAAAA
+
::::::::::::::::::::::::;;8
@SRR037212.2 FC30L5TAA_102708:7:1:1045:1765 length=27
TATAACCAGAAAGTTACAAGTAAACAC
+
888888888888888888888888888
+Dont remove me

*Note: Although the above command keys on the @ to determine if a line with a + should be altered, it will still alter the 2nd line if it happens to also start with a +. It doesn't sound like this is the case, but if you want to exclude this corner case as well, the following minor alteration will protect against that:

sed '/^@/{N;N;s/\(.*\)\n+.*/\1\n+/}' infile

Output

$ sed '/^@/{N;N;s/\(.*\)\n+.*/\1\n+/}' ./infile
@SRR037212.1 FC30L5TAA_102708:7:1:741:1355 length=27
+AAAAAAAAAAAAAAAAAAAAAAAAAAA
+
::::::::::::::::::::::::;;8
@SRR037212.2 FC30L5TAA_102708:7:1:1045:1765 length=27
TATAACCAGAAAGTTACAAGTAAACAC
+
888888888888888888888888888
+Dont remove me

If there is never a + on the first or second lines and always one on the third line:

perl -0100pi -e's/\+.*/+/' datafile

Otherwise:

perl -0100pi -e's/^((?:.*\n){2}.*?\+).*/$1/' datafile

or on 5.10+:

perl -0100pi -e's/^(?:.*\n){2}.*?\+\K.*//' datafile

All of those assume that @ only appears at the start of a chunk. If it may appear elsewhere, then go by line number instead:

perl -pi -e's/\+.*/+/ if $. % 4 == 3' datafile
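
For anyone who wants the line-number rule spelled out, here is a minimal Python sketch of the same idea. It is not from the original answers, and the function name strip_plus_headers is made up for illustration. It assumes the input is strictly four lines per chunk and that the lines have already had their trailing newlines removed.

```python
def strip_plus_headers(lines):
    """Cut the 3rd line of every 4-line chunk back to its first '+'.

    Assumes strict 4-line chunks and lines without trailing newlines.
    """
    out = []
    for lineno, line in enumerate(lines, start=1):
        if lineno % 4 == 3 and "+" in line:
            # Keep everything up to and including the first '+', drop the rest.
            line = line[: line.index("+") + 1]
        out.append(line)
    return out


sample = [
    "@SRR037212.1 FC30L5TAA_102708:7:1:741:1355 length=27",
    "AAAAAAAAAAAAAAAAAAAAAAAAAAA",
    "+SRR037212.1 FC30L5TAA_102708:7:1:741:1355 length=27",
    "::::::::::::::::::::::::;;8",
]
print("\n".join(strip_plus_headers(sample)))
```

Because only the position within the chunk is checked, an @ or + inside the second or fourth line is left alone.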

If you can use awk, you can do:

 gawk '{if ($0 ~ /^@/ ) { print ; getline ; print ; getline ; print "+"}}' INPUTFILE

So if gawk sees an @ at the start of a line, it prints that line, then reads and prints the next line, and finally reads the third line (counting from the @) and prints only the + in its place.

If the + is not at the start of the line, you can use gensub(/\+.*/, "+", 1) instead of the "+" in the last print. (The third argument of gensub says which match to replace; the target defaults to $0.)

(If you have perl installed, there may be an a2p executable that can convert the above awk script to Perl, if you want to. Note that a2p was removed from the Perl core in 5.22 and is now distributed separately on CPAN.)

HTH

UPDATE (on missing 4th line):

 gawk '{if ($0 ~ /^@/ ) { print ; getline ; print ; getline ; print "+"; getline; print }}' INPUTFILE

This should print the 4th line as well.
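
The same "key on the @ line, then consume the next three lines" logic can be sketched in Python. This is only an illustration and not part of the original answer; the function name strip_after_header is hypothetical. Unlike the gawk command, it passes through any lines that fall outside a chunk instead of dropping them, and it truncates at the first + the way the gensub variant does.

```python
from itertools import islice


def strip_after_header(lines):
    """For each line starting with '@', cut the line two below it back to '+'.

    Lines are expected without trailing newlines.
    """
    it = iter(lines)
    out = []
    for line in it:
        out.append(line)
        if not line.startswith("@"):
            continue  # not a chunk header: pass the line through unchanged
        # Consume the remaining lines of the chunk (fewer at end of input),
        # so a 4th line that happens to start with '@' is not seen as a header.
        rest = list(islice(it, 3))
        if len(rest) >= 2 and "+" in rest[1]:
            rest[1] = rest[1][: rest[1].index("+") + 1]
        out.extend(rest)
    return out
```

Reading all three following lines in one step is what keeps a 4th line beginning with @ from starting a new chunk by mistake.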

Maybe just sed '/^@/+2 s/+.*/+/'

Edit: this will not work in sed, but as a vim command it should work:

vim file -c ':g/^@/+2s/+.*/+/' -c 'wq'

This might work for you:

sed '/^@/{$!N;$!N;$!N;s/\n+[^\n]*/\n+/g}' file

or with GNU sed:

sed '/^@/,+3s/^+.*/+/' file
