How to find out whether one or more characters are in a string with a regex in Tcl

I need a simple solution to figure out whether some characters are in a string in Tcl. My idea is to do this with a regex.

My string looks like: "word_word-word_word_word-word" or "word.word.word.word-word". My problem is that sometimes I get strings that contain ., _ and -, and then I need to call another procedure to handle them.

Now the question again: how can I find out whether the string contains "_-_-" or "...-", with any words between the _, . and - characters?


If you were just looking to see if the string contains a _, -, _, - in that order with arbitrary random junk between, we could do that two ways (you can substitute other separators, but a . needs special treatment in a regexp; either [.] or \. will do):

regexp {_.+-.+_.+-} $stringToMatchAgainst
string match {*_*-*_*-*} $stringToMatchAgainst

OK, technically the last one (which is glob matching) matches something slightly different (a * may match nothing at all, whereas .+ needs at least one character between the separators), but the effect is similar.
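As a rough sketch of how the two checks behave, here they are wrapped in small procedures and run against the strings from the question. The procedure names are made up for illustration; only regexp and string match come from the answer itself. The dotted variant uses [.] so that each dot is matched literally.

```tcl
# Hypothetical helper: _, -, _, - in that order, with at least one
# character between each pair of separators.
proc hasUnderscoreDashPattern {s} {
    return [regexp {_.+-.+_.+-} $s]
}

# Hypothetical helper: ., ., ., - in that order; [.] makes the dot literal.
proc hasDotDotDotDashPattern {s} {
    return [regexp {[.].+[.].+[.].+-} $s]
}

puts [hasUnderscoreDashPattern "word_word-word_word_word-word"]  ;# 1
puts [hasDotDotDotDashPattern "word.word.word.word-word"]        ;# 1
puts [hasUnderscoreDashPattern "word.word.word.word-word"]       ;# 0

# Where the regexp and the glob differ: * may match an empty string.
puts [regexp {_.+-.+_.+-} "_-_-"]         ;# 0
puts [string match {*_*-*_*-*} "_-_-"]    ;# 1
```

Either helper returns 1 or 0, so it can go straight into an if to decide whether to call the other procedure.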

However I'm not sure that the above is what you're really looking for. At a guess you're really after the words? Possibly also the separators.

To get a list of the words, we use a somewhat different technique (we can't use \w because it also matches the underscore, which is common in identifiers):

set wordList [regexp -all -inline {[a-zA-Z0-9]+} $stringToMatchAgainst]
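To see why the explicit character class matters, here is a small comparison on a made-up input string (the sample value is an assumption, not from the question). With \w+ the underscore glues two words together.

```tcl
# Sample input chosen for illustration only.
set stringToMatchAgainst "word.word_word-word"

# Explicit class: letters and digits only, so _ acts as a separator.
set wordList [regexp -all -inline {[a-zA-Z0-9]+} $stringToMatchAgainst]
puts $wordList             ;# word word word word
puts [llength $wordList]   ;# 4

# \w also matches the underscore, so "word_word" stays in one piece.
set looseList [regexp -all -inline {\w+} $stringToMatchAgainst]
puts $looseList            ;# word word_word word
puts [llength $looseList]  ;# 3
```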

If you're after the separators too, the easiest method is to use textutil::split::splitx from Tcllib:

package require textutil::split
set tokenList [textutil::split::splitx $stringToMatchAgainst {([-_.])} ]

In the last case, with an input string of word_word-word_word_word-word it gives this output:

word _ word - word _ word _ word - word
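If Tcllib is not available, something similar can probably be done with core Tcl alone. The sketch below is an alternative technique, not what splitx does internally: it matches tokens instead of splitting on separators, so characters outside the two classes are silently skipped and adjacent separators do not produce empty words. The variable names s, tokenList and seps are illustrative.

```tcl
set s "word_word-word_word_word-word"

# Each token is either a run of letters/digits or a single separator.
set tokenList [regexp -all -inline {[a-zA-Z0-9]+|[-_.]} $s]
puts $tokenList   ;# word _ word - word _ word _ word - word

# Only the separators, joined into one string, in case just their
# order is needed to decide which procedure to call.
set seps [join [regexp -all -inline {[-_.]} $s] ""]
puts $seps        ;# _-__-
```

The joined separator string can then be tested with string match or regexp, for example to see whether it starts with _-_ or with three dots.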