
Regular expression, replace all commas between double quotes

I have this string:

1001,"Fitzsimmons, Des Marteau, Beale and Nunn",109,"George","COD","Standard",,109,8/14/1998 8:50:02

What regular expression would I use to replace the commas in the "Fitzsimmons, Des Marteau, Beale and Nunn" with a pipe | so it is:

"Fitzsimmons| Des Marteau| Beale and Nunn"

I should have clarified: I am splitting this string on the commas, so I want "Fitzsimmons, Des Marteau, Beale and Nunn" to stay a single string. I plan to replace the | with a comma again after I have split it.


I tried to use StringTokenizer, but it didn't work well, so here is some code that seems to do what you want:

public class JTest
{
    public static void main(String[] args)
    {
        String str = "1001,\"Fitzsimmons, Des Marteau, Beale and Nunn\",109,\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
        // StringBuilder avoids creating a new String on every append
        StringBuilder copy = new StringBuilder();

        // true while the current position is between an opening and a closing quote
        boolean inQuotes = false;

        for (int i = 0; i < str.length(); ++i)
        {
            char c = str.charAt(i);
            if (c == '"')
                inQuotes = !inQuotes;
            // replace only the commas that fall inside a quoted field
            if (c == ',' && inQuotes)
                copy.append('|');
            else
                copy.append(c);
        }

        System.out.println(str);
        System.out.println(copy);
    }
}


While it would be possible to do this with regular expressions, it would be much clearer to first split the line into fields and then do the replacement. There is a good (free) Java library for parsing CSV files called opencsv.
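To make the "split into fields first" idea concrete, here is a minimal hand-rolled sketch. It is not opencsv and does not use its API; the class name `CsvLineSplitter` and the method `split` are made up for illustration. It assumes one record per line, double quotes as the only quoting character, and a doubled quote (`""`) as the escape for a literal quote inside a quoted field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; a real CSV library handles more cases
// (multi-line records, other separators, malformed input).
class CsvLineSplitter {

    // Splits one CSV line into fields. Commas inside double quotes stay in the field,
    // and the surrounding quotes are dropped.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"'); // doubled quote inside a quoted field
                    i++;               // skip the second quote of the pair
                } else {
                    inQuotes = !inQuotes; // opening or closing quote, not kept
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(field.toString()); // separator outside quotes ends the field
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString()); // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String line = "1001,\"Fitzsimmons, Des Marteau, Beale and Nunn\",109,\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
        for (String f : split(line)) {
            System.out.println("[" + f + "]");
        }
    }
}
```

Splitting this way makes the pipe round trip unnecessary: the quoted name arrives as one field with its commas intact, and the empty field between "Standard" and 109 is preserved as an empty string.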


Hey Brandon, you can easily do this with a regular expression by using lookbehind and lookahead. See the code below:

String csvString = "1001,\"Fitzsimmons, Des Marteau, Beale and Nunn\",109,\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
// a comma with at least one non-quote character on each side, between two quotes
String rePattern = "(?<=\")([^\"]+?),([^\"]+?)(?=\")";
// first replace
String oldString = csvString;
String resultString = csvString.replaceAll(rePattern, "$1|$2");
// each pass replaces one comma per quoted field, so repeat until nothing changes
while (!resultString.equals(oldString)) {
    oldString = resultString;
    resultString = resultString.replaceAll(rePattern, "$1|$2");
}

The result string will be 1001,"Fitzsimmons| Des Marteau| Beale and Nunn",109,"George","COD","Standard",,109,8/14/1998 8:50:02

NingZhang.info


Here's a bit of Python that seems to do the trick:

>>> import re
>>> p = re.compile('"[^"]*"')  # one quoted field, quotes included
>>> x = """1001,"Fitzsimmons, Des Marteau, Beale and Nunn",109,"George","COD","Standard",,109,8/14/1998 8:50:02"""
>>> # replace commas only inside each quoted field; the rest of the line is left alone
>>> p.sub(lambda m: m.group().replace(',', '|'), x)
'1001,"Fitzsimmons| Des Marteau| Beale and Nunn",109,"George","COD","Standard",,109,8/14/1998 8:50:02'

Seems like this has turned into a code golf question :-)

Oops...missed the Java tag.
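Since the question is tagged Java, here is a rough Java sketch of the same idea: match each quoted field and replace the commas only inside the match. The class name `QuotedCommaReplacer` and the method `pipeQuotedCommas` are made up for illustration, and the sketch assumes balanced quotes with no escaped quotes inside a field.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; assumes balanced, unescaped double quotes.
class QuotedCommaReplacer {

    // One quoted field, including its surrounding quotes.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    static String pipeQuotedCommas(String line) {
        Matcher m = QUOTED.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Replace commas only within the matched quoted field.
            String replaced = m.group().replace(',', '|');
            // quoteReplacement keeps any '$' or '\' in the field from being
            // interpreted as a group reference or escape.
            m.appendReplacement(out, Matcher.quoteReplacement(replaced));
        }
        m.appendTail(out); // copy whatever follows the last quoted field
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "1001,\"Fitzsimmons, Des Marteau, Beale and Nunn\",109,\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
        System.out.println(pipeQuotedCommas(line));
    }
}
```

Everything outside the quoted fields is copied through unchanged, so the empty field (the `,,` before 109) survives and a later split on commas still yields the right number of columns.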


I believe this is going to be very difficult to do with a regular expression. The trouble is that the regular expression would have to count quotes to determine if it's inside two quotes or not.

Actually, the .NET regex engine could do it with its balanced matching feature. But I don't think Java has that feature and I can't think of a reliable way to do it without it.

You may have to write some procedural code to accomplish this.
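For what it's worth, one commonly used workaround for the counting problem is a lookahead that checks quote parity: a comma is inside quotes exactly when an odd number of double quotes follows it up to the end of the line. The sketch below is only valid under the assumptions that the quotes are balanced, that fields contain no escaped quotes, and that each record is a single line; the class and method names are made up for illustration. It also rescans the rest of the line for every comma, so it is slow on very long lines.

```java
// Hypothetical helper for illustration; assumes balanced, unescaped double quotes.
class QuoteParityRegex {

    // A comma followed by an ODD number of quotes before end of input: inside quotes.
    static final String COMMA_INSIDE_QUOTES =
            ",(?=[^\"]*\"(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    // A comma followed by an EVEN number of quotes before end of input: a real separator.
    static final String COMMA_OUTSIDE_QUOTES =
            ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static String pipeQuotedCommas(String line) {
        return line.replaceAll(COMMA_INSIDE_QUOTES, "|");
    }

    static String[] splitOutsideQuotes(String line) {
        // limit -1 keeps trailing empty fields; quotes remain part of each field
        return line.split(COMMA_OUTSIDE_QUOTES, -1);
    }

    public static void main(String[] args) {
        String line = "1001,\"Fitzsimmons, Des Marteau, Beale and Nunn\",109,\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
        System.out.println(pipeQuotedCommas(line));
        for (String field : splitOutsideQuotes(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

The second pattern splits directly on the commas outside quotes, which skips the replace-then-restore step with the pipe. As soon as the data can contain escaped quotes, procedural code or a CSV library is the safer choice.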


Well, this is a CSV file, so I'd use Ruby's built-in CSV library. Then you don't have to figure out how to deal with escaped quotation marks, for example.

require 'csv'
string =<<CSV
1001,"Fitzsimmons, Des Marteau, Beale and Nunn",109,"George","COD","Standard",,109,8/14/1998 8:50:02
CSV
csv=CSV.parse string
csv.each{|row| row.each {|cell| cell.gsub!(",","|") if cell.is_a?(String)}}
# CSV::Writer was removed in Ruby 1.9; CSV.generate builds and returns the output string
outstring = CSV.generate{|out| csv.each {|row| out<<row}}
