Vim Regex Duplicate Lines Grouping
I have a log file like this:
12 adsflljl
12 hgfahld
12 ash;al
13 a;jfda
13 asldfj
15 ;aljdf
16 a;dlfj
19 adads
19 adfasf
20 aaaadsf
And I would like to "group" them like one of these two:
12 adsflljl, 12 hgfahld, 12 ash;al
13 a;jfda, 13 asldfj
15 ;aljdf
16 a;dlfj
19 adads, 19 adfasf
20 aaaadsf
Or
12 adsflljl, hgfahld, ash;al
13 a;jfda, asldfj
15 ;aljdf
16 a;dlfj
19 adads, adfasf
20 aaaadsf
I am totally stuck. If Vim can't do it, I have sed, awk, and bash too. I just don't really want to write a bash script; I want to increase my regex-fu.
In Vim you can use:
:%s/\(\(\d\+\) .*\)\n\2/\1, \2/g
which means: when a line starting with a number is followed by a newline and the same number, replace the newline with a comma and a space. If you are not familiar with them, \1 and \2 are backreferences.
Unfortunately this only merges two occurrences at a time, so you'll have to run it multiple times before achieving your goal.
EDIT: one way to do it in a single go is to loop and exploit the fact that, as soon as the file no longer matches, an error is issued and the loop stops. The error is a bit annoying, but I couldn't do better with a one-liner:
:while 1 | :%s/\(\(\d\+\) .*\)\n\2/\1, \2/g | :endwhile
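Since the question mentions sed as an option, the same "keep merging until nothing matches" idea can be sketched there as well. This is only a sketch under assumptions: the function name merge_adjacent is made up for illustration, lines with the same ID are assumed to be adjacent, and each ID is assumed to be digits followed by a single space. The t command branches back to the label after a successful substitution, which takes the place of the while loop.

```shell
# Hypothetical helper: merge adjacent lines that share the same leading number.
merge_adjacent() {
  # :a  label marking the start of the loop
  # $!N append the next input line unless we are already on the last one
  # s   if both lines start with the same number, join them with ", "
  # ta  after a successful substitution, loop and try the following line too
  # P;D print the finished first line and restart with whatever is left
  sed -e ':a' \
      -e '$!N' \
      -e 's/^\([0-9][0-9]*\) \(.*\)\n\1 /\1 \2, /' \
      -e 'ta' \
      -e 'P' \
      -e 'D' "$@"
}
```

Calling merge_adjacent log.file should produce the second format from the question (the ID once, followed by the comma-separated texts).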
I'd just use awk:
awk '
    {
        sep = val[$1] ? ", " : ""
        val[$1] = val[$1] sep $2
    }
    END {for (v in val) print v, val[v]}
' log.file | sort > new.file
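A possible variation on the awk approach, in case the texts can contain spaces or the original order of IDs should be kept without piping through sort. This is a sketch, not a drop-in replacement: the function name group_log is hypothetical, and it assumes the first whitespace-separated field is the ID and everything after it is the text.

```shell
# Hypothetical helper: group lines by their first field, keeping first-seen order.
group_log() {
  awk '
    NF == 0 { next }                    # skip blank lines
    {
      key = $1
      msg = $0
      # Strip the key and the whitespace around it; the rest is the text.
      sub(/^[ \t]*[^ \t]+[ \t]*/, "", msg)
      if (key in val) {
        val[key] = val[key] ", " msg
      } else {
        order[++n] = key                # remember the order keys first appear in
        val[key] = msg
      }
    }
    END { for (i = 1; i <= n; i++) print order[i], val[order[i]] }
  ' "$@"
}
```

Usage would be along the lines of group_log log.file > new.file. Unlike the sed and regex approaches, this does not require lines with the same ID to be adjacent.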
In Vim, I would use the command
:g/^\d\+/y|if+@"==getline(line('.')-1)|s//,/|-j!
if the first column is guaranteed to always contain numeric IDs.
Otherwise, I would change the if-condition as follows:
:g/^\S\+/y|if matchstr(@",@/)==matchstr(getline(line('.')-1),@/)|s//,/|-j!
Another way to do this, with a macro this time (I advise you to use another solution, this one just shows that there are plenty of ways to do it):
gg:%s/$/,<Enter>qa0V?<C-r><C-w>\>\&^<Enter>Jjq100@a:%s/.$//<Enter>
explanation:
gg => go to the start of the file
:%s/$/, => append a comma to every line
qa => start recording a macro into register a
0V => go to the first column and start a linewise selection
? => search backwards (you must have set wrapscan)
    <C-r><C-w> (ctrl-r ctrl-w) inserts the word under the cursor.
    \> ensures end of word.
    \&^ ensures the pattern matches at the start of a line. You cannot put ^ at the beginning of the pattern, because if incsearch is set, then as soon as you have typed ^, ctrl-r ctrl-w will insert the word under the cursor, which will have moved to the previous line.
J => join all lines from the visual selection with spaces
j => go to the next line
q => stop recording the macro
100@a => play the macro 100 times
:%s/.$// => remove the trailing commas
I do not think it is a good idea to use regular expressions here. The same idea as in @glenn jackman's awk solution, written in Vimscript, looks like this:
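function! JoinLog()
    " Collect the texts for each key (first column) into a dictionary of lists.
    let d={}
    g/\v^\S+\s/let [ds, k, t; dl]=matchlist(getline('.'), '\v^(\S+)\s+(.*)') |
                \let d[k]=get(d, k, [])+[t]
    " Replace the buffer with one line per key, sorted by key.
    %delete _
    call setline(1, map(sort(keys(d)), 'v:val." ".join(d[v:val], ", ")'))
endfunction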
You can keep the order instead of sorting:
function! JoinLog()
    let d={}
    " Keys in the order they first appear in the buffer.
    let ordered=[]
    g/\v^\S+\s/let [ds, k, t; dl]=matchlist(getline('.'), '\v^(\S+)\s+(.*)') |
                \if has_key(d, k) | let d[k]+=[t] |
                \else | let ordered+=[k] | let d[k]=[t] |
                \endif
    " Replace the buffer with one line per key, in first-seen order.
    %delete _
    call setline(1, map(copy(ordered), 'v:val." ".join(d[v:val], ", ")'))
endfunction