Combining multiple lines into one line

2022-12-24 09:45

I have a use case involving an XML file with input like this:

Input:
<abc a="1">
   <val>0.25</val>
</abc> 
<abc a="2">
    <val>0.25</val>
</abc> 
<abc a="3">
   <val>0.35</val>
</abc> 
 ...

Output:
<abc a="1"><val>0.25</val></abc> 
<abc a="2"><val>0.25</val></abc>
<abc a="3"><val>0.35</val></abc>

I have around 200K lines in a file in the Input format. How can I quickly convert it into the Output format?

In vim you could do this with

:g/<abc/ .,/<\/abc/ join!

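Normally :join replaces each line break (and the next line's leading whitespace) with a single space. The ! suppresses that and joins the lines exactly as they are, so any indentation in front of <val> is kept.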

In general I would recommend using a proper XML parsing library in a language like Python, Ruby or Perl for manipulating XML files (I recommend Python+ElementTree), but in this case it is simple enough to get away with using a regex solution.
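For comparison, here is a minimal sketch of the parser route with Python's ElementTree. It assumes the file is just a sequence of <abc> elements with no single enclosing element, so it wraps the text in a synthetic <root> before parsing. The function name flatten_records is made up for illustration.

```python
import xml.etree.ElementTree as ET


def flatten_records(xml_fragment: str) -> list[str]:
    """Return each top-level element serialized on a single line."""
    # Assumption: the input has many top-level elements, so wrap it in a
    # synthetic root to make it parseable as one document.
    root = ET.fromstring(f"<root>{xml_fragment}</root>")
    lines = []
    for record in root:
        # Drop whitespace-only text and tails (indentation, line breaks).
        for node in record.iter():
            if node.text is not None and not node.text.strip():
                node.text = None
            if node.tail is not None and not node.tail.strip():
                node.tail = None
        lines.append(ET.tostring(record, encoding="unicode"))
    return lines
```

This reads the whole file into memory, which should be fine for around 200K lines. Real values such as 0.25 are kept as they are, because only whitespace-only text is dropped.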

In Vim:

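Position the cursor on the first line, then:
qq: start recording a macro into register q
gJgJ: join the next two lines onto the current one without adding spaces
j: go down one line
q: stop recording
N@q: replay the macro N times, where N is the number of remaining records (about a third of the original line count, since the lines are condensed as you go)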

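$ awk '
    # start a new output line at each opening tag (except the first)
    /<abc/ && NR > 1 {print ""}
    # squeeze runs of spaces and print without a trailing newline
    {gsub(" +"," "); printf "%s",$0}
    # terminate the last record
    END {print ""}
' file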
<abc a="1"> <val>0.25</val></abc>
<abc a="2"> <val>0.25</val></abc>
<abc a="3"> <val>0.35</val></abc>
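The same tag-driven idea (start a new output line whenever an opening tag appears) can be sketched in Python. This version strips each line completely, so no space is left in front of <val>. The function name join_records and the open_tag parameter are illustrative, not from any library.

```python
from typing import Iterable, Iterator


def join_records(lines: Iterable[str], open_tag: str = "<abc") -> Iterator[str]:
    """Yield one string per record, starting a new record at each opening tag."""
    parts: list[str] = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith(open_tag) and parts:
            # An opening tag closes off the record collected so far.
            yield "".join(parts)
            parts = []
        parts.append(stripped)
    if parts:
        yield "".join(parts)  # flush the final record
```

Because it works on any iterable of lines, you can pass it an open file object and write out each yielded record as you go, so the file never has to fit in memory at once.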

Bash:

while read -r s; do printf '%s' "$s"; read -r s; printf '%s' "$s"; read -r s; printf '%s\n' "$s"; done < file.xml

You can record a macro. Begin with the cursor at the start of the first line and press 'qa' (this records a macro into the a register). Then press Shift-V to begin line-wise visual mode, and search for the ending tag with '/\/abc'. Press Shift-J to join the lines. Move the cursor to the next tag, probably with 'j^', and press 'q' to stop recording. You can then rerun the recording with '@a', or specify a count such as 10000@a. If the tags are different or do not follow each other directly, you just need to change how you find the opening and closing tags, for example by using searches.

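sed '/^<abc/{N;N;s/\n *//g;s/ *$//}' file

# remove each newline plus the indentation that follows it, then any trailing spaces
# (removing every space would also delete the one inside <abc a="1">)
# Result

<abc a="1"><val>0.25</val></abc>
<abc a="2"><val>0.25</val></abc>
<abc a="3"><val>0.35</val></abc>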

An inelegant Perl one-liner that should do the trick, though not particularly quickly.

perl -e '
    $x=0;
    while(<>){
        s/^\s*(\S*(?:\s+\S+)*)\s*$/$1/g;   # trim leading/trailing whitespace, including the newline
        print;
        $x++;
        if($x==3){                         # every third line ends a record
            print"\n";
            $x=0;
        }
    }' file > output

You can do this in Perl, assuming every record spans exactly three lines:

perl -e '$i=1; while(<>){chomp;$s.=$_;if($i%3==0){$s=~s{>\s+<}{><};print "$s\n";$s="";}$i++;}' file

sed '/<abc/,/<\/abc>/{:a;N;s/\n//g;s|<\/abc>|<\/abc>\n|g;H;ta}'  file

tr "\n" " "<myfile|sed 's|<\/abc>|<\/abc>\n|g;s/[ \t]*<abc/<abc/g;s/>[ \t]*</></g'
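The same whole-text approach (collapse the whitespace between tags, then break the line again after each closing tag) could look roughly like this in Python. It is a regex sketch that assumes the records are <abc> elements directly following one another. The name join_with_regex is made up for illustration.

```python
import re


def join_with_regex(text: str) -> str:
    """Collapse inter-tag whitespace, then put each <abc> record on its own line."""
    # Drop whitespace (including newlines) that sits between two tags.
    compact = re.sub(r">\s+<", "><", text.strip())
    # Restore one line break between consecutive records.
    return compact.replace("</abc><abc", "</abc>\n<abc")
```

Like the tr pipeline, this needs the whole file as one string, e.g. the result of reading the file in a single call.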

This should work in ex mode:

:%s/^\(<abc.*>\)\n\s*\(.*\)\n\(<\/abc>\).*\n/\1\2\3\r/

The \s* drops the extra spaces (or tab) in front of the value; leave it out if you want to keep them.

What you are searching for here is (pattern1)[enter](pattern2)[enter](pattern3)[enter], and you are replacing it with (pattern1)(pattern2)(pattern3)[enter].

In the search pattern a line break is written \n; in the replacement it is written \r (a literal ^M, typed with Ctrl+V Ctrl+M, also works there).
