Use matching value of a RegExp to name the output file

Asked 2022-12-26 18:05

I have this file "file.txt" which I want to split into many smaller ones. This is a piece of it:

0 id:2293 7:0.78235 12:0.69205 17:0.79421 21:0.77818 ..
4 id:2293 7:0.78235 8:0.97904 12:0.69205 17:0.31709 ..
1 id:2294 7:0.78235 8:0.90994 17:0.49058 21:0.59326 ..

Each line of the file has an id field which looks like "id:1" for a line belonging to id 1. For each id in the file, I would like to create a file named id<id>.txt and put all lines that belong to this id in that file. My brute-force bash script solution reads as follows.

count=1
while [ $count -lt 19945 ]; do
    # Scan the whole file once per id and append the matching lines
    cat file.txt | grep "id:$count " >> ./sets/id$count.txt
    count=`expr $count + 1`
done

Now this is very inefficient, as I have to read through the file about 20,000 times. Is there a way to do the same operation with only one pass through the file? What I'm probably asking for is a way to use the value matched by a regular expression to name the associated output file.

$ cat file
0 id:2293 7:0.78235 12:0.69205 17:0.79421 21:0.77818 ..
4 id:2293 7:0.78235 8:0.97904 12:0.69205 17:0.31709 ..
1 id:2294 7:0.78235 8:0.90994 17:0.49058 21:0.59326 ..

$ awk -F"[: ]" '{print $0 > "id_"$3".txt"}' file

$ more id_2293.txt
0 id:2293 7:0.78235 12:0.69205 17:0.79421 21:0.77818 ..
4 id:2293 7:0.78235 8:0.97904 12:0.69205 17:0.31709 ..

$ more id_2294.txt
1 id:2294 7:0.78235 8:0.90994 17:0.49058 21:0.59326 ..
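
With roughly 20,000 distinct ids, some awk implementations may run into a limit on the number of output files open at once. A minimal sketch of a variant that appends to each file and closes it right after writing, assuming the id is always the third field under the "[: ]" separator; the input name sample.txt and its shortened lines are made up for illustration:

```shell
# Create a small sample input (hypothetical file name, for illustration only).
printf '%s\n' \
  '0 id:2293 7:0.78235 12:0.69205' \
  '4 id:2293 7:0.78235 8:0.97904' \
  '1 id:2294 7:0.78235 8:0.90994' > sample.txt

# Remove outputs from earlier runs, because >> appends.
rm -f id_2293.txt id_2294.txt

# One pass: build the output name from field 3, append the line, then close
# the file so that only one output file is open at any time.
awk -F'[: ]' '{
    out = "id_" $3 ".txt"
    print $0 >> out
    close(out)
}' sample.txt
```

Closing after every line trades some speed for safety. If lines with the same id are mostly adjacent, closing only when the id changes would likely be faster.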

You can build a solution similar to the one in this question: "Creating multiple csv files from data within a csv file".

Try this AWK script (the three-argument form of match() is a GNU awk extension, so it needs gawk):

#!/usr/bin/awk -f
{
    if (match($0, /id:([0-9]+)/, a))
        print $0 >> "file" a[1] ".txt";
}
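
Where gawk is not available, the same idea can be sketched with the two-argument match(), which sets RSTART and RLENGTH in POSIX awk. This is only an illustration under the assumption that each line contains at most one "id:<digits>" token; sample.txt and its shortened lines are hypothetical:

```shell
# Create a small sample input (hypothetical file name, for illustration only).
printf '%s\n' \
  '0 id:2293 7:0.78235 12:0.69205' \
  '4 id:2293 7:0.78235 8:0.97904' \
  '1 id:2294 7:0.78235 8:0.90994' > sample.txt

# Remove outputs from earlier runs, because >> appends.
rm -f file2293.txt file2294.txt

awk '{
    # match() records where the pattern was found in RSTART and RLENGTH.
    if (match($0, /id:[0-9]+/)) {
        # Skip the 3-character "id:" prefix to keep only the digits.
        id = substr($0, RSTART + 3, RLENGTH - 3)
        out = "file" id ".txt"
        print $0 >> out
        close(out)
    }
}' sample.txt
```

Lines without an id field are skipped, as in the script above.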
