
Separate Data by Comma

I am learning regex and am a complete newbie :P

I want to extract the numbers from the data below, keeping only the values that are delimited by commas:

test
t,b
45,49
31,34,38,34,56,23,,,,3,23,23653,3875,3.7,8.5,2.5,7.8,2., 6 6 6 6 ,
,
.
.,/;,jm.m.,,n ,sdsd, 3,2m54,2 4,2m,ar ,SSD A,,B,4D,CE,S4,D,2343ES,SD

Suppose I am getting the above data from a form text field. Now I want to read only the values that are numbers separated by commas.

The result should be (as a string):

45,49,31,34,38,34,56,23,3,23,23653,3875

All other data should be skipped. I tried something like this: ^[0-9]+\,$

But it also selects 7 from 3.7, 5 from 8.5, and so on.

Can anyone help me out in solving this?


Assuming you are already splitting at commas and want to check whether each element you get is a number, use this expression: ^\d+(?:\.\d+)?$, which means: "must begin with one or more digits, optionally followed by a dot and at least one more digit".

This would match 31 as well as 7.8, but not 2., 6 6 6 6 or 2m54.

Here's a part by part explanation of that expression:

  • ^ means: matches must start at the first character
  • $ means: matches must end at the last character, so both together mean the entire string must match
  • \d+ means: one or more digits
  • (?: ... ) is a non-capturing group, which lets the ? quantifier apply to the whole group
  • \. means: the literal dot
  • (?:\.\d+)? thus means: zero or one occurrences of a dot followed by at least one digit

Edit: if you only want integer numbers, just remove the group: ^\d+$ -> entire input must be one or more digits.
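The split-then-validate idea above can be sketched in Java roughly as follows. The class name CommaNumberFilter and the method extractIntegers are hypothetical, and treating line breaks as separators (in addition to commas) is an assumption made so the multi-line sample input works. Tokens are deliberately not trimmed, so a value with a leading space such as " 3" is rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class CommaNumberFilter {
    // Integers only; use "^\\d+(?:\\.\\d+)?$" instead to also accept decimals like 3.7.
    private static final Pattern INTEGER = Pattern.compile("^\\d+$");

    // Hypothetical helper: returns the integer tokens joined by commas.
    static String extractIntegers(String input) {
        List<String> numbers = new ArrayList<>();
        // Assumption: a line break separates values just like a comma does.
        for (String token : input.split("[,\\n]")) {
            if (INTEGER.matcher(token).matches()) {
                numbers.add(token);
            }
        }
        return String.join(",", numbers);
    }

    public static void main(String[] args) {
        String input = "t,b\n45,49\n31,,,3.7,2., 6 6 ,2m54, 3,S4,3875";
        System.out.println(extractIntegers(input)); // prints: 45,49,31,3875
    }
}
```

If values surrounded by whitespace should count as numbers, calling token.trim() before the check would be one way to allow that.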

Edit 2: If you can prepend and append a comma to the input string (see Edit 4), you should be able to use this regex for getting all numbers: (?<=,)\s*(\d+(?:\.\d+)?)\s*(?=,) (for integers only, remove the (?:\.\d+)? part).

That expression finds every number that sits between two commas, allows whitespace between the commas and the number, and captures the number itself in a group. This should prevent matches of 6 6 6 6 or 2m54. Then just iterate over the matches and read the group from each one.

Edit 3: Here's an example with your input string.

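import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it only makes the snippet compilable.
public class Example {
  public static void main( String[] args ) {
    String input = "test\n" +
            "t,b\n" +
            "45,49\n" +
            "31,34,38,34,56,23,,,,3,23,23653,3875,3.7,8.5,2.5,7.8,2., 6 6 6 6 ,\n" +
            ",\n" +
            ".\n" +
            ".,/;,jm.m.,,n ,sdsd, 3,2m54,2 4,2m,ar ,SSD A,,B,4D,CE,S4,D,2343ES,SD\n";

    // A number must be preceded and followed by a comma or a line break
    Pattern p = Pattern.compile( "(?<=,|\\n)\\s*(\\d+(?:\\.\\d+)?)\\s*(?=,|\\n)" );

    Matcher m = p.matcher( input );

    List<String> numbers = new ArrayList<String>();

    while(m.find())
    {
      numbers.add( m.group( 1 ) ); // group 1 is the number without surrounding whitespace
    }

    System.out.println(Arrays.toString( numbers.toArray() ));

    //prints: [45, 49, 31, 34, 38, 34, 56, 23, 3, 23, 23653, 3875, 3.7, 8.5, 2.5, 7.8, 3]
    //removing the fraction group: [45, 49, 31, 34, 38, 34, 56, 23, 3, 23, 23653, 3875, 3]
  }
}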

Edit 4: Actually, you don't need to add commas, just use this expression:

`(?<=,|\n|^)\s*(\d+)\s*(?=,|\n|$)`

The lookbehind and lookahead groups at the start and end mean the match must follow the start of the input, a comma or a line break, and must be followed by the end of the input, a comma or a line break.
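The Edit 4 expression can be tried out with a small Java sketch like the one below. The class name LookaroundNumberFinder, the method findNumbers and the short test string are made up for illustration; only the regex itself comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundNumberFinder {
    // Edit 4 expression: a number must sit between (start | comma | line break)
    // and (comma | line break | end), with optional whitespace around it.
    private static final Pattern NUMBER =
            Pattern.compile("(?<=,|\\n|^)\\s*(\\d+)\\s*(?=,|\\n|$)");

    // Hypothetical helper: collects every integer the expression finds.
    static List<String> findNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {
            numbers.add(m.group(1)); // group 1 excludes the surrounding whitespace
        }
        return numbers;
    }

    public static void main(String[] args) {
        // No leading or trailing comma is needed: ^ and $ cover the first and last value.
        System.out.println(findNumbers("12,abc, 7 ,3.5,2m,99")); // prints: [12, 7, 99]
    }
}
```

Unlike a plain split-and-match check, this variant accepts values padded with whitespace, because \s* sits inside the match on both sides of the captured group.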


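The shortest solution I could come up with is to replace anything that isn't a set of numbers separated by commas with the empty string. So you could do s.replaceAll("[^0-9]*,", ","). If you have random newlines in there, you will probably also want to add s.replaceAll("\n", ","). After those transformations, you can split on commas as suggested above. Be careful, though: [^0-9]* also matches non-digits at the end of a token, so 2m, becomes 2, and 4D, becomes 4, while tokens such as 3.7 or 2m54 pass through unchanged. Each piece still needs a check like ^\d+$ after splitting, and the trailing-letter cases can no longer be told apart from real numbers at that point.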


This expression will give you all the numbers you need (only numbers, no commas).

"^\d+|(?<=,)\d+$|(?<=,)\d+(?=,)"

See the grep example:

kent$  echo "31,34,38,34,56,23,,,,3,23,23653,3875,3.7,8.5,2.5,7.8,2., 6 6 6 6 ,
"|grep -oP "^\d+|(?<=,)\d+$|(?<=,)\d+(?=,)"

31
34
38
34
56
23
3
23
23653
3875
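
For readers working in Java rather than on the command line, the same expression can be applied to a single line roughly like this. It is only a sketch: the class name AlternationNumberFinder and the method findNumbers are hypothetical, and the input is assumed to be one line without line breaks (for multi-line input, Pattern.MULTILINE would be needed so that ^ and $ match at line boundaries).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationNumberFinder {
    // Three alternatives: digits at the start of the line, digits between a comma
    // and the end of the line, or digits enclosed by commas on both sides.
    private static final Pattern NUMBER =
            Pattern.compile("^\\d+|(?<=,)\\d+$|(?<=,)\\d+(?=,)");

    // Hypothetical helper: returns every match, like grep -o does.
    static List<String> findNumbers(String line) {
        List<String> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(line);
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(findNumbers("31,34,,,3,3.7,2., 6 6 ,23653"));
        // prints: [31, 34, 3, 23653]
    }
}
```

Note that the first alternative, ^\d+, has no lookahead, so a line that starts with something like 2m54 would still yield 2.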
