Using grep to replace every instance of a pattern after the first in bbedit
So I've got a really long txt file that follows this pattern:
},
"303" :
{
"id" : "4k4hk2l",
"color" : "red",
"moustache" : "no"
},
"303" :
{
"id" : "4k52k2l",
"color" : "red",
"moustache" : "yes"
},
"303" :
{
"id" : "fask2l",
"color" : "green",
"moustache" : "yes"
},
"304" :
{
"id" : "4k4hf4f4",
"color" : "red",
"moustache" : "yes"
},
"304" :
{
"id" : "tthj2l",
"color" : "red",
"moustache" : "yes"
},
"304" :
{
"id" : "hjsk2l",
"color" : "green",
"moustache" : "no"
},
"305" :
{
"id" : "h6shgfbs",
"color" : "red",
"moustache" : "no"
},
"305" :
{
"id" : "fdh33hk7",
"color" : "cyan",
"moustache" : "yes"
},
and I'm trying to reformat it into a proper JSON object with the following structure:
"303" :
{ "list" : [
{
"id" : "4k4hk2l",
"color" : "red",
"moustache" : "no"
},
{
"id" : "4k52k2l",
"color" : "red",
"moustache" :开发者_如何学编程 "yes"
},
{
"id" : "fask2l",
"color" : "green",
"moustache" : "yes"
}
]
}
"304" :
{ "list" : [
etc...
In other words, I want to find every match of ^"\d\d\d" : and keep only the first occurrence of each distinct key, removing all the subsequent ones (for example, keep the first instance of "303" : but completely remove the rest of them, then keep the first instance of "304" : but completely remove the rest of them, and so on).
I've been attempting to do this within the bbedit application, which has a grep option for search/replace. My pattern matching fu is too weak to accomplish this. Any ideas? Or a better way to accomplish this task?
You can't capture every iteration of a repeated capturing group. The capture always holds only the last iteration the group matched. So there's no way to do this with a single search/replace, short of bluntly repeating the group in the pattern. And even that is a solution only if you know the maximum number of elements in the resulting groups.
Say we have a string that is a simplified version of your data:
1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;
The largest group has 5 elements, so the pattern has to cover 5 of them: one leading element plus four optional repeats of the capturing group.
/([0-9])([a-z]*);?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?/
And replace that with
\1:\2\4\6\8\10;
Then we get the desired result:
1:abcde;2:defg;3:xyz;
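To check the pattern outside the editor, here is a minimal Python sketch of the same search/replace. It assumes Python 3.5 or later, where optional groups that did not take part in a match expand to an empty string in the replacement. The variable names are only illustrative.

```python
import re

data = "1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;"

# One leading element, then four optional repeats that must start
# with the same digit as the first one (\1).
pattern = re.compile(
    r"([0-9])([a-z]*);?"
    r"(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?"
)

# \g<N> is Python's unambiguous spelling of the \N backreference,
# which avoids any doubt about how \10 is read.
result = pattern.sub(r"\g<1>:\g<2>\g<4>\g<6>\g<8>\g<10>;", data)
print(result)  # 1:abcde;2:defg;3:xyz;
```

A group with more than 5 elements would be split across two matches, which is the limitation described above.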
You can apply this technique to your data if you're in a great hurry (and after 2 days I suppose you aren't), but using a scripting language will be a better and cleaner solution.
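As a rough illustration of the scripting route on data shaped like the question's, here is a Python sketch. It assumes every record is a flat object (no nested braces, no braces inside string values) preceded by a quoted numeric key. The names record, group_records and sample are hypothetical, and the sample is a shortened copy of the question's data.

```python
import json
import re

sample = '''
},
"303" :
{
"id" : "4k4hk2l",
"color" : "red",
"moustache" : "no"
},
"303" :
{
"id" : "4k52k2l",
"color" : "red",
"moustache" : "yes"
},
"304" :
{
"id" : "4k4hf4f4",
"color" : "red",
"moustache" : "yes"
},
'''

# A quoted numeric key, a colon, then one flat {...} object.
record = re.compile(r'"(\d+)"\s*:\s*(\{[^{}]*\})')

def group_records(text):
    """Collect every object under its key, in order of first appearance."""
    grouped = {}
    for key, body in record.findall(text):
        # Each body is already valid JSON on its own, so parse it directly.
        grouped.setdefault(key, {"list": []})["list"].append(json.loads(body))
    return grouped

grouped = group_records(sample)
print(json.dumps(grouped, indent=2))
```

Parsing each record with json.loads and writing the result with json.dumps means the output is valid JSON by construction, which is harder to guarantee with a text-only search/replace.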
For my simplified example, you would iterate through the matches of /([0-9])[a-z];?(\1[a-z];?)*/. Those will be:
1a;1b;1c;1d;1e;
2d;2e;2f;2g;
3x;3y;3z;
Within each match you can then capture all the values and bind them to the corresponding key, of which there is only one per iteration.
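The iteration over the simplified string can be sketched in Python like this. It is one possible reading of the approach, not the only one: the outer pattern finds each run of elements sharing a digit, and a second, simpler pattern pulls the letters out of that run. The names run, grouped and out are illustrative.

```python
import re

data = "1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;"

# Outer pattern: one run of consecutive elements that share a digit.
run = re.compile(r"([0-9])[a-z];?(?:\1[a-z];?)*")

grouped = {}
for m in run.finditer(data):
    key = m.group(1)  # the single key for this run
    # Inner pass: every letter that follows a digit inside the run.
    values = re.findall(r"[0-9]([a-z])", m.group(0))
    grouped.setdefault(key, []).extend(values)

out = "".join(f"{key}:{''.join(values)};" for key, values in grouped.items())
print(out)  # 1:abcde;2:defg;3:xyz;
```

Because the values are collected in a second pass, there is no fixed upper limit on the number of elements per key.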