How to figure out whether one or more characters are in a string with a regex in Tcl
I need a simple solution to figure out whether some characters are in a string in Tcl. My idea is to do this with a regex.
My string looks like "word_word-word_word_word-word" or "word.word.word.word-word".
My problem is that sometimes I get strings that contain ., _ and -; then I need to call another procedure to handle them.
Now the question again: how can I figure out that the string contains "_-_-" or "...-", with any words between the _, . and -?
If you were just looking to see if the string contains a _, -, _, - in that order with arbitrary random junk between, we could do that two ways (you can substitute other separators, but a . needs special treatment in a regexp; either [.] or \. will do):
regexp {_.+-.+_.+-} $stringToMatchAgainst
string match {*_*-*_*-*} $stringToMatchAgainst
OK, technically the last one (which is glob matching) matches something slightly different (each * can match zero characters, whereas .+ needs at least one), but the effect is similar.
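As a rough sketch of how the check could drive the "call another procedure" part of the question, the regexp form can be wrapped in a small helper. The proc name hasSeparatorPattern and the sample strings are made up for illustration; the second pattern is just the first one with the separators swapped for escaped dots, as described above.

```tcl
# Hypothetical helper (name is an assumption, not from the question):
# returns 1 if the string contains one of the two separator sequences, else 0.
proc hasSeparatorPattern {str} {
    # "_ - _ -" in that order, with at least one character between each pair
    if {[regexp {_.+-.+_.+-} $str]} {
        return 1
    }
    # ". . . -" in that order; each dot is escaped so it matches literally
    if {[regexp {\..+\..+\..+-} $str]} {
        return 1
    }
    return 0
}

# Sample inputs: the two shapes from the question plus one with no separators
foreach s {word_word-word_word_word-word word.word.word.word-word plainword} {
    puts "$s -> [hasSeparatorPattern $s]"
}
```

The caller can then branch on the result and hand the string to whatever procedure deals with that case.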
However, I'm not sure that the above is what you're really looking for. At a guess, you're really after the words? Possibly also the separators.
To get a list of the words, we use a somewhat different technique (we can't use \w, as that also matches the underscore because it is common in identifiers):
set wordList [regexp -all -inline {[a-zA-Z0-9]+} $stringToMatchAgainst]
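For illustration, here is the same call run on a made-up input (the sample string is an assumption, chosen so the words are distinguishable):

```tcl
# Sample input, invented for this example
set stringToMatchAgainst "alpha_beta-gamma.delta"

# -all -inline returns every match as a list element
set wordList [regexp -all -inline {[a-zA-Z0-9]+} $stringToMatchAgainst]

puts $wordList            ;# alpha beta gamma delta
puts [llength $wordList]  ;# 4
```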
If you're after the separators too, the easiest method is to use textutil::split::splitx from Tcllib:
package require textutil::split
set tokenList [textutil::split::splitx $stringToMatchAgainst {([-_.])} ]
In the last case, with an input string of word_word-word_word_word-word, it gives this output:
word _ word - word _ word _ word - word
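If Tcllib is not available, a similar token list can probably be had with plain regexp by matching either a word or a single separator. This is a different technique from splitx, not a drop-in replacement: characters that are neither alphanumeric nor one of the three separators are silently skipped. The sample string is made up.

```tcl
# Sample input, invented for this example
set stringToMatchAgainst "word_word-word.word"

# Match either a run of alphanumerics or one separator character.
# The hyphen comes first inside the brackets so it is taken literally.
set tokenList [regexp -all -inline {[a-zA-Z0-9]+|[-_.]} $stringToMatchAgainst]

puts $tokenList   ;# word _ word - word . word
```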