Using grep to replace every instance of a pattern after the first in BBEdit

2023-03-27 12:35

So I've got a really long txt file that follows this pattern:

},
"303" :
{
   "id" : "4k4hk2l",
   "color" : "red",
   "moustache" : "no"
},
"303" :
{
   "id" : "4k52k2l",
   "color" : "red",
   "moustache" : "yes"
},
"303" :
{
   "id" : "fask2l",
   "color" : "green",
   "moustache" : "yes"
},
"304" :
{
   "id" : "4k4hf4f4",
   "color" : "red",
   "moustache" : "yes"
},
"304" :
{
   "id" : "tthj2l",
   "color" : "red",
   "moustache" : "yes"
},
"304" :
{
   "id" : "hjsk2l",
   "color" : "green",
   "moustache" : "no"
},
"305" :
{
   "id" : "h6shgfbs",
   "color" : "red",
   "moustache" : "no"
},
"305" :
{
   "id" : "fdh33hk7",
   "color" : "cyan",
   "moustache" : "yes"
},

and I'm trying to turn it into a proper JSON object with the following structure:

"303" :
   { "list" : [
     {
      "id" : "4k4hk2l",
      "color" : "red",
      "moustache" : "no"
     },
     {
      "id" : "4k52k2l",
      "color" : "red",
      "moustache" : "yes"
     },
     {
      "id" : "fask2l",
      "color" : "green",
      "moustache" : "yes"
     }
    ]
   }
"304" :
   { "list" : [
 etc...

In other words, I want to find every line matching ^"\d\d\d" : and keep the first occurrence of each distinct key while removing all the subsequent ones. For example, keep the first instance of "303" : but completely remove the rest of them, then keep the first instance of "304" : but completely remove the rest of them, and so on.

I've been attempting to do this within the BBEdit application, which has a grep option for search/replace. My pattern matching fu is too weak to accomplish this. Any ideas? Or is there a better way to accomplish this task?

You can't capture every repetition of a repeated capturing group: the capture always contains only the last match of the group. So there is no way to do this with a single search/replace, short of crudely repeating the group in the pattern. Even that is a solution only if you know the maximum number of elements in the resulting groups.
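To make the "last match only" behaviour concrete, here is a small sketch using Python's re module rather than BBEdit's grep (an assumption made for illustration; the engines differ in details but behave the same way on this point). The sample string and variable names are made up.

```python
import re

# A capturing group inside a repetition: the pattern matches all three
# items, but group 1 is overwritten on every repetition.
m = re.match(r"(?:(\d[a-z]);)+", "1a;1b;1c;")

whole_match = m.group(0)   # everything the pattern consumed
last_capture = m.group(1)  # only the final repetition survives

print(whole_match)   # 1a;1b;1c;
print(last_capture)  # 1c
```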

Say we have a string that is a simplified version of your data:

1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;

We see that the maximum number of elements per key is 5, so the pattern has to cover 5 elements: the first one plus four optional repetitions of the capturing group.

/([0-9])([a-z]*);?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?/

And replace that with

\1:\2\4\6\8\10;

Then we get the desired result:

1:abcde;2:defg;3:xyz;
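If you want to check the pattern outside BBEdit, the same substitution can be sketched in Python. This is only an illustration under that assumption: the replacement is written as a function (the hypothetical merge below) instead of the \1:\2\4\6\8\10; string, so that optional groups that did not match are explicitly treated as empty.

```python
import re

data = "1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;"

# Same pattern as above: the first element plus four optional repetitions.
pattern = re.compile(
    r"([0-9])([a-z]*);?"
    r"(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?"
)

def merge(m):
    # Groups 2, 4, 6, 8 and 10 hold the letters; an optional group that
    # did not participate in the match is None, so fall back to "".
    letters = "".join(m.group(i) or "" for i in (2, 4, 6, 8, 10))
    return f"{m.group(1)}:{letters};"

result = pattern.sub(merge, data)
print(result)  # 1:abcde;2:defg;3:xyz;
```

A key with more than five elements would be split across several matches, which is the limitation of this approach.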

You can apply this technique to your data if you're in a great hurry (and after 2 days I suppose you aren't), but using a scripting language would be a better and cleaner solution.
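As a rough idea of what the scripting route could look like for the data in the question, here is a minimal Python sketch. It assumes every entry is a quoted numeric key followed by one flat object with no nested braces, and that each object is valid JSON on its own. The names sample, entry and regroup are made up for this example, and sample is a shortened copy of the question's data.

```python
import json
import re

sample = '''
},
"303" :
{
   "id" : "4k4hk2l",
   "color" : "red",
   "moustache" : "no"
},
"303" :
{
   "id" : "4k52k2l",
   "color" : "red",
   "moustache" : "yes"
},
"304" :
{
   "id" : "4k4hf4f4",
   "color" : "red",
   "moustache" : "yes"
},
'''

# Assumption: a quoted numeric key, a colon, then one flat {...} object.
entry = re.compile(r'"(\d+)"\s*:\s*(\{[^{}]*\})')

def regroup(text):
    grouped = {}
    for key, body in entry.findall(text):
        # The first time a key is seen it gets a {"list": []} wrapper;
        # later objects with the same key are appended to that list.
        grouped.setdefault(key, {"list": []})["list"].append(json.loads(body))
    return grouped

result = regroup(sample)
print(json.dumps(result, indent=2))
```

Because the objects are collected per key rather than per run, this also copes with keys that are not contiguous in the file.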

For my simplified example, you would iterate through the matches of /([0-9])[a-z];?(\1[a-z];?)*/. Those will be:

1a;1b;1c;1d;1e;
2d;2e;2f;2g;
3x;3y;3z;

Within each match you can then capture all the values and bind them to their respective key, of which there is only one per iteration.
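The iteration described above could be sketched in Python roughly as follows. This is one possible reading, not the only way to do it: the inner repetition is written as a non-capturing group, and the names run_pattern and grouped are made up for the example.

```python
import re

data = "1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;"

# One match per run of items that share the same leading digit.
run_pattern = re.compile(r"([0-9])[a-z];?(?:\1[a-z];?)*")

grouped = {}
for run in run_pattern.finditer(data):
    key = run.group(1)
    # Inside a single run every letter belongs to the same key.
    grouped.setdefault(key, []).extend(re.findall(r"[a-z]", run.group(0)))

result = "".join(f"{key}:{''.join(values)};" for key, values in grouped.items())
print(result)  # 1:abcde;2:defg;3:xyz;
```

Unlike the fixed-repetition pattern, this does not need to know the maximum number of elements per key in advance.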
