Using sed to copy lines and delete characters from the duplicates

I have a file that looks like this:

@"Afghanistan.png",
@"Albania.png",
@"Algeria.png",
@"American_Samoa.png",

I want it to look like this:

@"Afghanistan.png",
@"Afghanistan",
@"Albania.png",
@"Albania",
@"Algeria.png",
@"Algeria",
@"American_Samoa.png",
@"American_Samoa",

I thought I could use sed to do this, but I can't figure out how to store something in a buffer and then modify it.

Am I even using the right tool?

Thanks


You don't have to get tricky with regular expressions and replacement strings: use sed's p command to print the line intact, then modify the line and let it print implicitly:

sed 'p; s/\.png//'


Glenn Jackman's answer works, but it also duplicates the lines that do not match the expression.

This one, instead, duplicates only the lines that match the expression:

sed -n 'p; s/\.png//p'

Here, -n means "print nothing unless explicitly told to", and the p flag in s/\.png//p prints the line only if the substitution was actually made.
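The difference between the two commands only shows up on lines without ".png". A minimal sketch, assuming a made-up second input line (@"README",) that is not part of the original file:

```shell
# The second input line is a hypothetical entry with no ".png" in it.
sample='@"Albania.png",
@"README",'

# Plain p: every line is printed twice, whether or not the substitution fires.
plain=$(printf '%s\n' "$sample" | sed 'p; s/\.png//')

# -n plus the p flag: the second copy appears only when the substitution fires.
quiet=$(printf '%s\n' "$sample" | sed -n 'p; s/\.png//p')

printf 'plain:\n%s\n\nquiet:\n%s\n' "$plain" "$quiet"
```

With this input, the first command should print @"README", twice, while the second prints it once.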


That is pretty easy to do with sed, and you do not even need to use the hold space (sed's auxiliary buffer). Given the input file below:

$ cat input 
@"Afghanistan.png",
@"Albania.png",
@"Algeria.png",
@"American_Samoa.png",

you should use this command:

sed 's/@"\([^.]*\)\.png",/&\
@"\1",/' input 

The result:

$ sed 's/@"\([^.]*\)\.png",/&\
@"\1",/' input 
@"Afghanistan.png",
@"Afghanistan",
@"Albania.png",
@"Albania",
@"Algeria.png",
@"Algeria",
@"American_Samoa.png",
@"American_Samoa",

This command is a single substitution (s///). The pattern matches @" followed by a run of non-period characters ([^.]*) and then .png",. The run of non-period characters is enclosed in the grouping brackets \( and \), so the text it matched can be reused later. This is the regular expression to be replaced:

@"\([^.]*\)\.png",

Next comes the replacement part of the command. The & is not a command but a special character in the replacement: it inserts everything that was matched by @"\([^.]*\)\.png", into the new content. If it were the only element of the replacement, the output would be unchanged. However, the & is followed by a newline (written as a backslash \ followed by an actual newline), and on the new line we add the string @", then the content of the first group (\1), and then the string ",.
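To see what & and \1 expand to on their own, here is a small sketch using the same pattern. The whole=[...] and group=[...] labels are only illustrative markers, not part of the original command:

```shell
# "&" expands to the entire match; "\1" expands to the text captured
# by the first \( \) group.
demo=$(printf '%s\n' '@"Algeria.png",' |
  sed 's/@"\([^.]*\)\.png",/whole=[&] group=[\1]/')

printf '%s\n' "$demo"
# prints: whole=[@"Algeria.png",] group=[Algeria]
```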

This is just a brief explanation of the command; I hope it helps. Also note that some versions of sed (such as GNU sed) accept \n in the replacement to represent a newline, which makes the command more concise and readable:

sed 's/@"\([^.]*\)\.png",/&\n@"\1",/' input 
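Since the question asked about storing a line in a buffer, here is how the hold space mentioned at the start of this answer could be used. This is only an alternative sketch, not a better way than the substitution above:

```shell
# For lines containing ".png":
#   h  copies the pattern space (the current line) into the hold space
#   s  strips ".png" from the pattern space
#   x  swaps the two buffers, so the original line is current again
#   G  appends a newline plus the hold space (the stripped copy)
hold_out=$(printf '%s\n' '@"Afghanistan.png",' '@"Albania.png",' |
  sed '/\.png/{h;s/\.png//;x;G;}')

printf '%s\n' "$hold_out"
```

Each matching line should come out followed by its stripped copy, and lines without ".png" pass through once, unchanged.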


I prefer this over Carles Sala and Glenn Jackman's:

sed '/\.png/p;s/\.png//'

Call it personal preference.
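Whichever variant you pick, it is probably worth checking that the dot is escaped, because an unescaped . matches any character. A small sketch, assuming a made-up file name in which "png" also appears in the middle:

```shell
# Hypothetical file name in which "png" also appears mid-name.
line='@"Sapngo.png",'

# Unescaped: "." matches any character, so the first hit is "apng".
loose=$(printf '%s\n' "$line" | sed 's/.png//')

# Escaped: only the literal ".png" extension is removed.
strict=$(printf '%s\n' "$line" | sed 's/\.png//')

printf '%s\n%s\n' "$loose" "$strict"
```

Here the unescaped form should yield @"So.png", while the escaped form yields @"Sapngo",.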


Or one can combine both versions and apply the duplication only to lines matching the required pattern:

sed -e '/^@".*\.png",/{p;s/\.png//;}' input