How can I parse a string that has a text qualifier?

How can I parse String str = "abc, \"def,ghi\"";

so that I get the output

String[] strs = {"abc", "\"def,ghi\""};

i.e. an array of length 2?

Should I use a regular expression, or is there a method in the Java API or in some other open-source project that lets me do this?

Edited

To give some context: I am reading a text file that contains a list of records, one per line. Each record is a list of fields separated by a delimiter (comma or semicolon). I now have a requirement to support a text qualifier, similar to what Excel or OpenOffice supports. Suppose I have the record

abc, "def,ghi"

Here , is my delimiter and " is my text qualifier, so when I parse this string I should get two fields, abc and def,ghi, not {abc,def,ghi}.

I hope this clarifies my requirement.

Thanks

Shekhar


The basic algorithm is not too complicated:

import java.util.ArrayList;
import java.util.List;

public static List<String> customSplit(String input) {
    List<String> elements = new ArrayList<String>();
    StringBuilder elementBuilder = new StringBuilder();

    boolean isQuoted = false;
    for (char c : input.toCharArray()) {
        if (c == '"') {
            // Toggle the quoted state. The quote itself is kept in the
            // output, as the OP requested in a comment.
            isQuoted = !isQuoted;
        }
        if (c == ',' && !isQuoted) {
            // An unquoted comma ends the current element.
            elements.add(elementBuilder.toString().trim());
            elementBuilder = new StringBuilder();
            continue;
        }
        elementBuilder.append(c);
    }
    // Add the last element, which has no trailing comma.
    elements.add(elementBuilder.toString().trim());
    return elements;
}
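
The question also asks for a configurable delimiter (comma or semicolon) and for fields returned without the text qualifier (abc and def,ghi). The same character loop can be adapted for that. The following is only a sketch: the class name QualifiedSplitter and its split method are hypothetical, and it assumes qualifiers are balanced, that a doubled qualifier ("") is not used as an escape, and that trimming whitespace at the edges of every field is acceptable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of any library.
final class QualifiedSplitter {

    /**
     * Splits one record into fields. Delimiters inside a qualified
     * section are kept as data, and the qualifier characters
     * themselves are removed from the returned fields.
     */
    static List<String> split(String record, char delimiter, char qualifier) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQualifier = false;

        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (c == qualifier) {
                // Toggle the state and drop the qualifier itself.
                inQualifier = !inQualifier;
            } else if (c == delimiter && !inQualifier) {
                // An unqualified delimiter ends the current field.
                fields.add(field.toString().trim());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        // The last field has no trailing delimiter.
        fields.add(field.toString().trim());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("abc, \"def,ghi\"", ',', '"')); // [abc, def,ghi]
        System.out.println(split("abc; \"def;ghi\"", ';', '"'));
    }
}
```

Passing the delimiter and qualifier as parameters keeps one loop usable for both comma-separated and semicolon-separated files.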


This question seems relevant: "Split a string ignoring quoted sections".

Along those lines, opencsv (http://opencsv.sourceforge.net/) seems appropriate.


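Try this. The pattern effectively splits on a comma followed by a space (the second alternative can never match, because it requires a space before the start of the input), so it works for the sample only because the comma inside the quotes has no space after it:

String str = "abc, \"def,ghi\"";
String regex = "([,]) | (^[\"\\w*,\\w*\"])";
for (String s : str.split(regex)) {
    System.out.println(s);
}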
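
If a single regular expression is preferred, a commonly used alternative is a lookahead that accepts a comma only when it is followed by an even number of quotes, meaning the comma sits outside any quoted section. This is a sketch under assumptions: the class name RegexQualifiedSplit is hypothetical, quotes are assumed to be balanced, and a quote never appears as data inside a field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
final class RegexQualifiedSplit {

    // A comma followed by zero or more complete pairs of quotes and
    // then no further quote up to the end of the input.
    private static final Pattern DELIMITER =
            Pattern.compile(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    static List<String> split(String record) {
        List<String> fields = new ArrayList<>();
        // A limit of -1 keeps trailing empty fields.
        for (String raw : DELIMITER.split(record, -1)) {
            fields.add(raw.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        for (String s : split("abc, \"def,ghi\"")) {
            System.out.println(s);
        }
    }
}
```

The quotes are kept in the fields, matching the output requested in the question. The lookahead rescans the rest of the line for every comma, so a character loop is likely the better choice for very long records.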


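Try:

import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

List<String> res = new LinkedList<String>();

// A limit of -1 keeps trailing empty strings, so balanced quotes
// always produce an odd number of chunks.
String[] chunks = str.split("\"", -1);
if (chunks.length % 2 == 0) {
    // Mismatched quotes!
}
for (int i = 0; i < chunks.length; i++) {
    if (i % 2 == 1) {
        // Odd chunks lie between quotes: keep them whole.
        res.add(chunks[i]);
    } else {
        // Even chunks lie outside quotes: split them on the delimiter.
        for (String part : chunks[i].split(",")) {
            if (!part.trim().isEmpty()) {
                res.add(part.trim());
            }
        }
    }
}

This splits only the portions that are not between quotes, and the quotes themselves are removed from the result.

Each unquoted piece is trimmed, and blank pieces next to a quote are skipped, so empty fields are dropped as well.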
