Using grep to replace every instance of a pattern after the first in BBEdit

So I've got a really long txt file that follows this pattern:

},
"303" :
{
   "id" : "4k4hk2l",
   "color" : "red",
   "moustache" : "no"
},
"303" :
{
   "id" : "4k52k2l",
   "color" : "red",
   "moustache" : "yes"
},
"303" :
{
   "id" : "fask2l",
   "color" : "green",
   "moustache" : "yes"
},
"304" :
{
   "id" : "4k4hf4f4",
   "color" : "red",
   "moustache" : "yes"
},
"304" :
{
   "id" : "tthj2l",
   "color" : "red",
   "moustache" : "yes"
},
"304" :
{
   "id" : "hjsk2l",
   "color" : "green",
   "moustache" : "no"
},
"305" :
{
   "id" : "h6shgfbs",
   "color" : "red",
   "moustache" : "no"
},
"305" :
{
   "id" : "fdh33hk7",
   "color" : "cyan",
   "moustache" : "yes"
},

and I'm trying to format it as a proper JSON object with the following structure:

"303" :
   { "list" : [
     {
      "id" : "4k4hk2l",
      "color" : "red",
      "moustache" : "no"
     },
     {
      "id" : "4k52k2l",
      "color" : "red",
      "moustache" : "yes"
     },
     {
      "id" : "fask2l",
      "color" : "green",
      "moustache" : "yes"
     }
    ]
   }
"304" :
   { "list" : [
 etc...

That is, I look for every match of ^"\d\d\d" : and keep the first occurrence of each unique key but remove all the subsequent ones (for example, keep the first instance of "303" : but completely remove the rest of them, then keep the first instance of "304" : but completely remove the rest of them, and so on).

I've been attempting to do this within the BBEdit application, which has a grep option for search/replace. My pattern-matching fu is too weak to accomplish this. Any ideas? Or is there a better way to accomplish this task?


You can't capture a repeated capturing group: the capture always contains only the last match of the group. So there's no way to do this with a single search/replace, short of crudely repeating the group in the pattern. And even that works only if you know the maximum number of elements in any resulting group.

Say we have a string that is a simplified version of your data:

1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;

We see that the maximum number of elements per key is 5, so the pattern needs room for 5 elements: the first one plus four optional repeats of the capturing group.

/([0-9])([a-z]*);?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?/

And replace that with

\1:\2\4\6\8\10;

Then we get the desired result:

1:abcde;2:defg;3:xyz;
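
If you want to check the pattern outside BBEdit, here is a small sketch in Python. It assumes Python's re flavor treats this pattern the same way BBEdit's grep does; \g<10> is simply Python's unambiguous spelling of group 10, and groups that did not take part in the match are replaced with an empty string.

```python
import re

# Simplified data from the example above.
data = "1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;"

# One leading element, then up to four optional repeats that must start
# with the same digit (backreference \1).
pattern = r"([0-9])([a-z]*);?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?"

# Even-numbered groups hold the letters; \g<10> avoids any ambiguity with \1 + "0".
replacement = r"\1:\2\4\6\8\g<10>;"

result = re.sub(pattern, replacement, data)
print(result)  # 1:abcde;2:defg;3:xyz;
```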

You can apply this technique to your data if you're in a great hurry (and after 2 days I suppose you aren't), but using a scripting language would be a better and cleaner solution.

For my simplified example, you would iterate through the matches of /([0-9])[a-z];?(\1[a-z];?)*/. Those will be:
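
As a rough idea of what the scripting route could look like on data shaped like the question's, here is a Python sketch. The SAMPLE text, the ENTRY pattern and the group_entries helper are all made up for illustration, and it assumes every object is flat (no nested braces, and no braces inside string values).

```python
import json
import re

# Hypothetical sample in the same shape as the file in the question.
SAMPLE = '''
"303" :
{
   "id" : "4k4hk2l",
   "color" : "red",
   "moustache" : "no"
},
"303" :
{
   "id" : "4k52k2l",
   "color" : "red",
   "moustache" : "yes"
},
"304" :
{
   "id" : "4k4hf4f4",
   "color" : "red",
   "moustache" : "yes"
},
'''

# A quoted numeric key, a colon, then one flat {...} object.
ENTRY = re.compile(r'"(\d+)"\s*:\s*(\{[^{}]*\})')


def group_entries(text):
    """Collect every object under the first occurrence of its key."""
    grouped = {}
    for key, body in ENTRY.findall(text):
        # setdefault creates {"list": []} only the first time a key is seen.
        grouped.setdefault(key, {"list": []})["list"].append(json.loads(body))
    return grouped


result = group_entries(SAMPLE)
print(json.dumps(result, indent=2))
```

Because the script writes its output with json.dumps, commas and brackets come out valid without any hand editing. For the real file you would read its contents and pass them to group_entries instead of SAMPLE.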

1a;1b;1c;1d;1e;
2d;2e;2f;2g;
3x;3y;3z;

Within each match you can capture all the values and bind them to their respective key, of which there is only one per iteration.
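
The iteration over the simplified string might look like this in Python (one possible sketch; the variable names are mine, and any scripting language with a regex library would do):

```python
import re

data = "1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;"

# One match per run of consecutive elements that share the same digit.
run = re.compile(r"([0-9])[a-z];?(\1[a-z];?)*")

grouped = {}
for match in run.finditer(data):
    key = match.group(1)  # the single digit shared by the whole run
    # Pull every letter out of the matched run, e.g. "1a;1b;" -> ["a", "b"].
    grouped[key] = re.findall(r"[a-z]", match.group(0))

print(grouped)
```

Each pass through the loop sees exactly one key, so the repeated-group limitation never comes into play.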
