Java regex: getting a substring from a string which can vary
I have a String like "Bangalore,India=Karnataka". From this String I would like to extract only the substring "Bangalore". In this case the regex can be (.+),.*=.* but the problem is that the String can sometimes be just "Bangalore", and in that case the above regex won't work. What regex will get the substring "Bangalore" whatever the String is?
Try this one:
^(.+?)(?:,.*?)?=.*$
Explanation:
^ # beginning of the string
( # beginning of capture group 1
.+? # one or more of any char, non-greedy
) # end of group 1
(?: # beginning of NON-capture group
, # a comma
.*? # 0 or more of any char, non-greedy
)? # end of non-capture group, optional
= # equal sign
.* # 0 or more of any char
$ # end of string
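As a rough illustration of how the pattern above behaves, here is a minimal sketch. The class and method names (OptionalCommaDemo, beforeCommaOrEquals) are made up for this example. Note that the pattern requires an equal sign, so a bare "Bangalore" does not match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalCommaDemo {
    // The pattern from the answer above; the "=" part is mandatory.
    private static final Pattern P = Pattern.compile("^(.+?)(?:,.*?)?=.*$");

    // Hypothetical helper: returns group 1, or null when the whole input does not match.
    static String beforeCommaOrEquals(String input) {
        Matcher m = P.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(beforeCommaOrEquals("Bangalore,India=Karnataka")); // Bangalore
        System.out.println(beforeCommaOrEquals("Bangalore=Karnataka"));       // Bangalore
        System.out.println(beforeCommaOrEquals("Bangalore"));                 // null (no "=")
    }
}
```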
Update:
I thought the OP had to match Bangalore,India=Karnataka or Bangalore=Karnataka, but as far as I understand, the input is either Bangalore,India=Karnataka or just Bangalore, so the regex is much simpler:
^([^,]+)
This matches one or more non-comma characters at the beginning of the string and captures them in group 1.
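A minimal Java sketch of this simpler pattern might look as follows. The class and method names (FirstField, firstField) are illustrative only, and find() is used because the pattern covers just the start of the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstField {
    // One or more non-comma characters, anchored at the start of the input.
    private static final Pattern FIRST_FIELD = Pattern.compile("^([^,]+)");

    // Hypothetical helper: returns the text before the first comma,
    // or null if the input is empty or starts with a comma.
    static String firstField(String input) {
        Matcher m = FIRST_FIELD.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstField("Bangalore,India=Karnataka")); // Bangalore
        System.out.println(firstField("Bangalore"));                 // Bangalore
    }
}
```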
matcher.matches() tries to match against the entire input string. Look at the Javadoc for java.util.regex.Matcher. To match only part of the input, you need to use:
matcher.find()
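The difference can be seen in a small sketch. It assumes the pattern ^([^,]+) from the earlier answer, since this answer does not say which pattern was in use, and the class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // Assumed pattern, borrowed from the earlier answer.
    private static final Pattern P = Pattern.compile("^([^,]+)");

    // matches() succeeds only if the pattern consumes the entire input.
    static boolean wholeInputMatches(String input) {
        return P.matcher(input).matches();
    }

    // find() succeeds if the pattern matches any part of the input.
    static String prefixViaFind(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "Bangalore,India=Karnataka";
        System.out.println(wholeInputMatches(s)); // false: the pattern cannot consume the comma
        System.out.println(prefixViaFind(s));     // Bangalore
    }
}
```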
Are you somehow forced to solve this using one regexp and nothing else? (Stupid interview question? Extremely inflexible external API?) In general, don't try to make regexes do what plain old programming constructs do better. Just use the obvious regex, and if it doesn't match, return the entire string instead.
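One way to sketch that suggestion, assuming the "obvious regex" is the (.+),.*=.* pattern from the question (the class and method names are hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CityExtractor {
    // The pattern from the question; it only matches inputs with the ",...=..." suffix.
    private static final Pattern WITH_SUFFIX = Pattern.compile("(.+),.*=.*");

    // Returns group 1 when the pattern matches, otherwise the input unchanged.
    static String city(String input) {
        Matcher m = WITH_SUFFIX.matcher(input);
        return m.matches() ? m.group(1) : input;
    }

    public static void main(String[] args) {
        System.out.println(city("Bangalore,India=Karnataka")); // Bangalore
        System.out.println(city("Bangalore"));                 // Bangalore
    }
}
```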
Try this regex. It grabs any run of characters at the start that is followed by a comma, but not the comma itself.
^.*(?=,)
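A short sketch of the lookahead in Java, with invented class and method names. Two caveats are worth checking against your data: the lookahead needs a comma to be present, so a bare "Bangalore" yields no match, and because .* is greedy the match extends to the last comma if there are several.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Everything from the start up to (but excluding) a comma.
    private static final Pattern BEFORE_COMMA = Pattern.compile("^.*(?=,)");

    // Hypothetical helper: returns the matched prefix, or null when there is no comma.
    static String beforeComma(String input) {
        Matcher m = BEFORE_COMMA.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(beforeComma("Bangalore,India=Karnataka")); // Bangalore
        System.out.println(beforeComma("Bangalore"));                 // null (no comma to look ahead to)
    }
}
```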
If you only want to check whether "Bangalore" is contained in the string, then you don't need a regexp for this.
Python:
In [1]: s = 'Bangalorejkdjiefjiojhdu'
In [2]: 'Bangalore' in s
Out[2]: True
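Since the question is about Java, the equivalent check there would presumably use String.contains. A minimal sketch, with an invented class and method name:

```java
class ContainsCheck {
    // Plain substring test; no regex involved.
    static boolean containsCity(String input) {
        return input.contains("Bangalore");
    }

    public static void main(String[] args) {
        System.out.println(containsCity("Bangalorejkdjiefjiojhdu")); // true
    }
}
```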