Using sed to copy lines and delete characters from the duplicates

2023-04-04 09:04 Asked by:

I have a file that looks like this:

@"Afghanistan.png",
@"Albania.png",
@"Algeria.png",
@"American_Samoa.png",

I want it to look like this:

@"Afghanistan.png",
@"Afghanistan",
@"Albania.png",
@"Albania",
@"Algeria.png",
@"Algeria",
@"American_Samoa.png",
@"American_Samoa",

I thought I could use sed to do this, but I can't figure out how to store something in a buffer and then modify it.

Am I even using the right tool?

Thanks

You don't have to get tricky with regular expressions and replacement strings: use sed's p command to print the line intact, then modify the line and let it print implicitly:

sed 'p; s/\.png//'

Glenn Jackman's response is OK, but it also doubles the lines that do not match the expression.

This one, instead, doubles only the lines that match the expression:

sed -n 'p; s/\.png//p'

Here, -n stands for "print nothing unless explicitly printed", and the p flag in s/\.png//p prints the line if a substitution was made and does nothing otherwise.
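To see the difference between the two commands, here is a small sketch. The sample input, including the "// flags" line that contains no ".png", is made up for illustration and is not from the question.

```shell
# Hypothetical sample: two matching lines and one line without ".png".
input='@"Afghanistan.png",
// flags
@"Albania.png",'

# Every line is printed once by p and once more by the automatic print,
# so the non-matching "// flags" line is duplicated as well.
plain=$(printf '%s\n' "$input" | sed 'p; s/\.png//')

# -n turns off the automatic print; a line appears a second time only
# when the substitution succeeds and its p flag fires.
quiet=$(printf '%s\n' "$input" | sed -n 'p; s/\.png//p')

printf '%s\n---\n%s\n' "$plain" "$quiet"
```

With this input, the first command emits "// flags" twice and the second emits it once.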

That is pretty easy to do with sed, and you do not even need to use the hold space (sed's auxiliary buffer). Given the input file below:

$ cat input 
@"Afghanistan.png",
@"Albania.png",
@"Algeria.png",
@"American_Samoa.png",

you should use this command:

sed 's/@"\([^.]*\)\.png",/&\
@"\1",/' input

The result:

$ sed 's/@"\([^.]*\)\.png",/&\
@"\1",/' input 
@"Afghanistan.png",
@"Afghanistan",
@"Albania.png",
@"Albania",
@"Algeria.png",
@"Algeria",
@"American_Samoa.png",
@"American_Samoa",

This command is just a substitution command (s///). It matches @" followed by any run of non-period characters ([^.]*) and then .png",. The non-period characters before .png", are enclosed in the group delimiters \( and \), so we can later refer to whatever this group matched. So, this is the regular expression to be replaced:

@"\([^.]*\)\.png",

Next comes the replacement part of the command. The & character inserts everything that was matched by @"\([^.]*\)\.png", into the result. If it were the only element of the replacement, the output would be unchanged. However, the & is followed by a newline, written as a backslash \ followed by an actual line break, and on the new line we add the string @" followed by the content of the first group (\1) and then the string ",.
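As a small sketch of the two pieces in isolation, the commands below use & alone and \1 alone on a single line from the question. The variable names are only for illustration.

```shell
line='@"Algeria.png",'

# & alone re-inserts the whole match, so the line comes out unchanged.
same=$(printf '%s\n' "$line" | sed 's/@"\([^.]*\)\.png",/&/')

# \1 alone keeps only the captured group: the text between @" and .png
name=$(printf '%s\n' "$line" | sed 's/@"\([^.]*\)\.png",/\1/')

printf '%s\n%s\n' "$same" "$name"
```

The first result is the original line and the second is just Algeria, which is why the full command can rebuild the second line as @" plus \1 plus ",.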

This is just a brief explanation of the command. Hope this helps. Also, note that some versions of sed (such as GNU sed) let you write \n in the replacement to represent a newline, which gives a more concise and readable command:

sed 's/@"\([^.]*\)\.png",/&\n@"\1",/' input

I prefer this over Carles Sala's and Glenn Jackman's:

sed '/\.png/p;s/\.png//'

It may just be personal preference.

Or one can combine both versions and apply the duplication only to lines matching the required pattern:

sed -e '/^@".*\.png",/{p;s/\.png//;}' input
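For completeness, since the question asks about storing something in a buffer: the same result can be sketched with the hold space. This is only one possible arrangement of the h, x and G commands, not something any of the answers above propose.

```shell
# On lines containing ".png":
#   h  copy the pattern space (the original line) into the hold space
#   s  strip ".png" from the pattern space
#   x  swap, so the original is in the pattern space and the stripped copy is held
#   G  append a newline plus the held (stripped) copy to the pattern space
out=$(printf '%s\n' '@"Albania.png",' '@"Algeria.png",' |
  sed '/\.png/{h; s/\.png//; x; G;}')

printf '%s\n' "$out"
```

Each matching line is printed followed by its copy without .png; lines without .png pass through once.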
