Need to find and replace using regular expressions (grep) in TextWrangler, for a CSV file

I have this csv file, plain text here: http://pastie.org/1425970

What it looks like in Excel: http://cl.ly/3qXk

An example of what I would like it to look like (just using the first row as example): http://cl.ly/3qYT

Plain text of first row: http://pastie.org/1425979

I need to create a CSV file to import all of the information into a database table.

I could create the CSV manually, but I wanted to see whether this can be done with a regular-expression (grep) find and replace in TextWrangler.


Regular expressions aren't really the best way to accomplish this. As others have noted, you're better off writing some code to parse the file into the format you want.

With that said, this ugly regex should get you halfway there:

Find:

(\d+),"?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?"?

Replace:

\1,\2\r\1,\3\r\1,\4\r\1,\5\r\1,\6\r\1,\7\r\1,\8
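
To see what this find/replace pair does outside the editor, here is a rough Python sketch of the same substitution. The sample input is an assumption (the real file is only linked above); it is guessed from the pattern to look like an ID followed by a quoted, comma-separated list. Python uses "\n" where TextWrangler's replace uses "\r", and unmatched groups expand to an empty string (Python 3.5 or later).

```python
import re

# Same pattern as the Find expression: an ID, then up to seven optional
# values inside an optionally quoted field.
FIND = re.compile(r'(\d+),"?' + r'(?:(\d+),? ?)?' * 7 + r'"?')

# Same shape as the Replace expression: the ID paired with groups 2..8,
# one pair per line ("\n" here instead of TextWrangler's "\r").
REPLACE = "\n".join(r"\g<1>,\g<%d>" % n for n in range(2, 9))


def expand(text):
    """Apply the find/replace to every row of the text."""
    return FIND.sub(REPLACE, text)


# Assumed input shape, not taken from the linked file.
sample = '1,"1, 8, 11, 13"\n2,"10, 11, 12"\n'
print(expand(sample))
```

Rows with fewer than seven values still produce seven output lines, which is where the extra "1," rows come from.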

Which will leave you with some extra rows, like below:

1,1
1,8
1,11
1,13
1,
1,
1,
2,10
2,11
2,12
2,
2,
...

You can clean up the extra rows by hand, or with the following regex:

Find:

\d+,\r

Replace:

(empty string)
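
The cleanup step can be sketched in Python the same way. This is only an illustration: "\n" stands in for TextWrangler's "\r", and the `^` anchor with `re.MULTILINE` is my addition so that the pattern can only match a whole row.

```python
import re

# A row whose value field is empty, e.g. "1," followed by a line break.
EMPTY_ROW = re.compile(r"^\d+,\n", re.MULTILINE)


def drop_empty_rows(text):
    """Delete every row that consists of an ID and a trailing comma only."""
    return EMPTY_ROW.sub("", text)


# The sample output shown earlier in this answer, as one string.
expanded = "1,1\n1,8\n1,11\n1,13\n1,\n1,\n1,\n2,10\n2,11\n2,12\n2,\n2,\n"
print(drop_empty_rows(expanded))
```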


Using Perl, you could do something like this:

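use strict;
use warnings;

open(my $read, "<", "input.csv") or die "Gah, couldn't read input.csv: $!\n";
open(my $write, ">", "output.csv") or die "Couldn't write output.csv: $!\n";

while (my $line = <$read>) {
    chomp $line;
    # Capture the ID and the (optionally quoted) list of values.
    if ($line =~ /^(\d+),"?([^"]*)"?/) {
        my ($id, $list) = ($1, $2);
        # Write one "id,value" row per value in the list.
        foreach my $value (split /,\s*/, $list) {
            print $write "$id,$value\n";
        }
    }
}

close($read);
close($write);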
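
If Python is closer to hand than Perl, the same row expansion can be sketched with the standard `csv` module, which deals with the quoted field so no hand-written regex is needed. The input shape below is an assumption inferred from the regex answer, and `expand_rows` is just a name picked for this sketch.

```python
import csv
import io


def expand_rows(src, dst):
    """Write one "id,value" row for each value that follows the ID."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if len(row) < 2:
            continue  # skip blank lines and rows without any values
        row_id = row[0]
        # The values normally sit in one quoted field; splitting every
        # remaining field also covers unquoted rows.
        for field in row[1:]:
            for value in field.split(","):
                value = value.strip()
                if value:
                    writer.writerow([row_id, value])


# Assumed input shape, not taken from the linked file.
src = io.StringIO('1,"1, 8, 11, 13"\n2,"10, 11, 12"\n3,7\n')
dst = io.StringIO()
expand_rows(src, dst)
print(dst.getvalue())
```

For real files, swap the `io.StringIO` objects for `open("input.csv", newline="")` and `open("output.csv", "w", newline="")`.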


I don't know TextWrangler, but in general I can describe what it takes to do this in pseudocode.

loop, read each line
   strip off the newline
   split into an array using /[, "]+/ as the delimiter regex
   loop over the result: an array slice from element 1 to the last element
       print element 0, a comma, then the iterator value
   end loop
end loop

In Perl, something like this:

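use strict;
use warnings;

# Reads from the files named on the command line, or from STDIN.
while (my $line = <>) {
    chomp $line;
    # Split on runs of commas, spaces, and double quotes.
    my @data_array = split /[, "]+/, $line;
    # Pair the first element with each of the remaining elements.
    for my $otherfield (@data_array[1 .. $#data_array]) {
        print "$data_array[0],$otherfield\n";
    }
}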

It should be easy if you have a split capability.
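
As an illustration of the split idea in another language, here is a minimal Python sketch. `expand_line` is a hypothetical helper name, and the sample line is an assumed input shape rather than a row copied from the linked file.

```python
import re
from typing import List


def expand_line(line: str) -> List[str]:
    """Pair the first field of a line with each of the remaining fields."""
    # Split on runs of commas, spaces, and double quotes; re.split keeps an
    # empty string after a trailing quote, so filter empty pieces out.
    fields = [f for f in re.split(r'[, "]+', line.strip()) if f]
    if not fields:
        return []
    key, values = fields[0], fields[1:]
    return ["%s,%s" % (key, value) for value in values]


for out_row in expand_line('1,"1, 8, 11, 13"'):
    print(out_row)
```

This only holds up while the values themselves never contain commas, spaces, or quotes, which seems to be the case for purely numeric data.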
