Removing Parts of String With Sed

2023-01-04 08:50

I have lines of data that look like this:

sp_A0A342_ATPB_COFAR_6_+_contigs_full.fasta
sp_A0A342_ATPB_COFAR_9_-_contigs_full.fasta
sp_A0A373_RK16_COFAR_10_-_contigs_full.fasta
sp_A0A373_RK16_COFAR_8_+_contigs_full.fasta
sp_A0A4W3_SPEA_GEOSL_15_-_contigs_full.fasta

How can I use sed to delete the part of each line after the 4th column (_ separated), finally yielding:

sp_A0A342_ATPB_COFAR
sp_A0A342_ATPB_COFAR
sp_A0A373_RK16_COFAR
sp_A0A373_RK16_COFAR
sp_A0A4W3_SPEA_GEOSL

cut is a better fit.

cut -d_ -f 1-4 old_file

This simply means use _ as delimiter, and keep fields 1-4.
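As a quick check, here is a small sketch that feeds two of the sample lines to the same cut invocation on stdin. The function name first_four_fields is made up for illustration; only the cut options come from the answer above.

```shell
# Hypothetical wrapper around the cut command from the answer.
first_four_fields() {
  # -d_ sets the delimiter to "_"; -f 1-4 keeps fields 1 through 4.
  cut -d_ -f 1-4
}

# Two sample lines from the question, passed on stdin.
printf '%s\n' \
  'sp_A0A342_ATPB_COFAR_6_+_contigs_full.fasta' \
  'sp_A0A4W3_SPEA_GEOSL_15_-_contigs_full.fasta' |
  first_four_fields
```

Lines with fewer than four fields are passed through unchanged, so cut does not need every line to have the same shape.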

If you insist on sed:

sed 's/\(_[^_]*\)\{4\}$//'

The left-hand side matches exactly four repetitions of a group consisting of an underscore followed by zero or more non-underscores, after which we must be at the end of the line. The whole match is replaced by nothing.
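Because this expression counts from the end of the line, it keeps the first four fields only when exactly four fields follow them. The sketch below illustrates that; strip_last_four is a made-up name and the nine-field line is invented input, not data from the question.

```shell
# Hypothetical wrapper around the end-anchored sed expression.
strip_last_four() {
  sed 's/\(_[^_]*\)\{4\}$//'
}

# Eight fields, as in the question: the last four go, the first four stay.
printf '%s\n' 'sp_A0A342_ATPB_COFAR_6_+_contigs_full.fasta' | strip_last_four

# Nine fields (invented input): five fields remain, not four.
printf '%s\n' 'a_b_c_d_e_f_g_h_i' | strip_last_four
```

If the number of trailing fields can vary, anchoring at the start of the line (or using cut) is the safer choice.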

sed -e 's/\([^_]*\)_\([^_]*\)_\([^_]*\)_\([^_]*\)_.*/\1_\2_\3_\4/' infile > outfile

Match "any number of not '_'", saving what was matched between \( and \), followed by '_'. Do this 4 times, then match anything for the rest of the line (to be ignored). Substitute with each of the matches separated by '_'.

Here's another possibility:

sed -E -e 's|^([^_]+(_[^_]+){3}).*$|\1|'

where -E, like -r in GNU sed, turns on extended regular expressions for readability.

Just because you can do it in sed, though, doesn't mean you should. I like cut much much better for this.

AWK likes to play in the fields:

awk 'BEGIN{FS=OFS="_"}{print $1,$2,$3,$4}' inputfile

or, more generally:

awk -v count=4 'BEGIN{FS="_"}{sep="";for(i=1;i<=count;i++){printf "%s%s",sep,$i;sep=FS};printf "\n"}'
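One way to package the general form is a small shell function that takes the field count as an argument. This is only a sketch: first_n_fields is a hypothetical name, and it builds each output line from scratch so nothing carries over between lines, and stops at the last field when a line is short.

```shell
# Hypothetical helper: keep the first N "_"-separated fields of each stdin line.
first_n_fields() {
  # $1: number of leading fields to keep.
  awk -v count="$1" 'BEGIN { FS = "_" }
  {
    n = (count < NF) ? count : NF   # do not run past the last field
    out = $1
    for (i = 2; i <= n; i++) out = out FS $i
    print out
  }'
}

# Sample lines from the question, keeping four fields.
printf '%s\n' \
  'sp_A0A373_RK16_COFAR_10_-_contigs_full.fasta' \
  'sp_A0A373_RK16_COFAR_8_+_contigs_full.fasta' |
  first_n_fields 4
```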

sed -e 's/_[0-9][0-9]*_[+-]_contigs_full\.fasta$//'

Still, the cut answer is probably faster and just generally better.

Yes, cut is way better, and yes matching the back of each is easier.

I finally got a match using the beginning of each line:

 sed -r 's/(([^_]*_){3}([^_]*)).*/\1/' oldFile > newFile
