Split a string into its text parts and number parts
I may have strings that will look something like this:
ABC
DEF-123
456
789GH-IJK-0
And I'm trying to figure out a regex that will group each one into strings and numbers, like this:
(ABC)
(DEF-)(123)
(456)
(789)(GH-IJK-)(0)
My first thought was to use (\D*|\d*) as the pattern, but the numbers aren't returned.
How about using inner non-capturing sub groups...
((?:\D+)|(?:\d+))
Example output from perl...
cat input | perl -ane 'chomp; print "looking at $_\n"; while(/((?:\D+)|(?:\d+))/g) {print "Found $1\n";}'
looking at BC
Found BC
looking at DEF-123
Found DEF-
Found 123
looking at 456
Found 456
looking at 789GH-IJK-0
Found 789
Found GH-IJK-
Found 0
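Use + instead of * on the alternatives. With *, \D* can match the empty string, so at a digit the first alternative succeeds with an empty match and \d* is never tried: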
(\D+|\d+)
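The question doesn't name a language, so here is a minimal sketch of the same pattern in Python, only as an illustration. The helper name split_runs is made up for this example; the sample strings are the ones from the question.

```python
import re

# Each branch uses +, so a match always consumes at least one character:
# either a run of non-digits or a run of digits.
TOKEN = re.compile(r"\D+|\d+")

def split_runs(text):
    """Return the alternating non-digit and digit runs of text, in order."""
    return TOKEN.findall(text)

if __name__ == "__main__":
    for sample in ["ABC", "DEF-123", "456", "789GH-IJK-0"]:
        print(sample, "->", split_runs(sample))
```

Note that \D also matches a newline, so strip the line ending first if you read the strings from a file.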
This seems to work, but it is quite ugly (backslash plague). Instead of one regexp, it uses two: one to handle the letters and hyphens, and one for the digits.
$ sed 's/\([a-zA-Z-]\+\)/(\1)/g ; s/\([0-9]\+\)/(\1)/g' input
(BC)
(DEF-)(123)
(456)
(789)(GH-IJK-)(0)
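For comparison, the same wrapping can probably be done in a single substitution without the backslashes. This is a rough Python sketch, not a translation of the sed command: it uses \D+|\d+ rather than the [a-zA-Z-] and [0-9] classes, so it also wraps characters those classes would skip. The function name wrap_runs is hypothetical.

```python
import re

def wrap_runs(text):
    """Wrap every run of non-digits or digits in parentheses."""
    # \g<0> in the replacement refers to the whole match.
    return re.sub(r"\D+|\d+", r"(\g<0>)", text)

if __name__ == "__main__":
    for sample in ["ABC", "DEF-123", "456", "789GH-IJK-0"]:
        print(wrap_runs(sample))
```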