Regex for matching capitals
def normalized?
matches = match(/[^A-Z]*/)
return matches.size == 0
end
This is my function operating on a string, checking whether the string contains only uppercase letters. It works fine for ruling out non-matches, but when I call it on a string like "ABC" it says no match, because apparently matches.size is 1 and not zero. There seems to be an empty element in it, or something like that.
Can anybody explain why?
Your regex is wrong. If you want it to match ONLY uppercase strings, use /^[A-Z]+$/.
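One caveat, sketched here on the assumption that the input might contain newlines: in Ruby, ^ and $ anchor at line boundaries, while \A and \z anchor at the start and end of the whole string. The two helper names below are hypothetical and exist only for this comparison.

```ruby
# Hypothetical helpers comparing line anchors with whole-string anchors.
def upper_by_line?(s)
  # ^ and $ match at the start and end of any line in Ruby
  !(s =~ /^[A-Z]+$/).nil?
end

def upper_whole_string?(s)
  # \A and \z match only at the very start and very end of the string
  !(s =~ /\A[A-Z]+\z/).nil?
end

puts upper_by_line?("ABC\nabc")       # true: the first line alone satisfies the pattern
puts upper_whole_string?("ABC\nabc")  # false
puts upper_whole_string?("ABC")       # true
```

If the strings are known to be single-line, both versions behave the same.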
Your regular expression is incorrect. /[^A-Z]*/ means "match zero or more characters that are not between A and Z, anywhere in the string". The string ABC has zero characters that are not between A and Z, so it matches the regular expression.
Change your regular expression to /^[^A-Z]+$/. This means "match one or more characters that are not between A and Z, and make sure every character between the beginning and end of the string is not between A and Z". Then the string ABC will not match, so match returns nil; when there is a match, you can check matches[0].size or whatever, as per sepp2k's answer.
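A small sketch of what that pattern returns, to make the nil case explicit. The variable names are illustrative only. Note that a mixed string such as "AbC" also fails to match, so this pattern tests for "no uppercase letters at all" rather than "only uppercase letters".

```ruby
pattern = /^[^A-Z]+$/

no_match = "ABC".match(pattern)   # nil, not an empty MatchData
mixed    = "AbC".match(pattern)   # also nil: the string contains uppercase letters
hit      = "abc".match(pattern)   # MatchData for "abc"

puts no_match.nil?                # true
puts mixed.nil?                   # true
puts hit[0].size                  # 3

# Guard against nil before indexing; nil[0] would raise NoMethodError.
matched_length = no_match ? no_match[0].size : 0
puts matched_length               # 0
```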
MatchData#size returns the number of capturing groups in the regex plus one, so that md[i] will access a valid group iff i < md.size. So the value returned by size only depends on the regex, not the matched string, and will never be 0.
You want matches.to_s.size or matches[0].size.
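To illustrate the point about MatchData#size, here is a minimal sketch using the regex from the question plus one made-up regex with two capturing groups:

```ruby
# The question's regex has no capturing groups, so size is 0 + 1.
md = "ABC".match(/[^A-Z]*/)
puts md.size          # 1: only group 0, the whole match
puts md[0].inspect    # "": the empty match at the start of the string
puts md[0].size       # 0: the length of the matched text

# Two capturing groups give a size of 2 + 1, whatever the input is.
grouped = "ABC".match(/([A-Z])([A-Z])/)
puts grouped.size     # 3
```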
ruby-1.9.2-p180> def normalized? s
ruby-1.9.2-p180?> s.match(/^[[:upper:]]+$/) ? true : false
ruby-1.9.2-p180?> end
=> nil
ruby-1.9.2-p180> normalized? "asdf"
=> false
ruby-1.9.2-p180> normalized? "ASDF"
=> true
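The [[:upper:]] class used in this session is not the same as A-Z once non-ASCII letters are involved. A short sketch, assuming a UTF-8 string (the sample word is made up for illustration):

```ruby
# "ÉCOLE", written with an escape so the source itself stays ASCII.
word = "\u00C9COLE"

ascii_only  = !(word =~ /\A[A-Z]+\z/).nil?        # É is outside A-Z
posix_upper = !(word =~ /\A[[:upper:]]+\z/).nil?  # [[:upper:]] is Unicode-aware on UTF-8 strings

puts ascii_only    # false
puts posix_upper   # true
```

Which one is appropriate depends on whether "uppercase" is meant to include accented and non-Latin capitals.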
The * in your regular expression means that it matches any number of non-uppercase characters, including zero. So it matches every string, if only with an empty match. The fix is to remove the *; then it will fail to match a string containing only uppercase characters. (Although you would need a different test if zero-length strings are not permitted.)
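One way to see the zero-length matches is String#scan, which collects every match in the string. This is only an illustration of the effect of the *, not a suggested fix:

```ruby
# With *, the pattern matches the empty string at every position in "ABC",
# including the position after the last character.
with_star = "ABC".scan(/[^A-Z]*/)
puts with_star.inspect      # ["", "", "", ""]

# Without *, at least one non-uppercase character is required, so nothing matches.
without_star = "ABC".scan(/[^A-Z]/)
puts without_star.inspect   # []
```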
If you want to know whether the input string consists entirely of English uppercase letters, i.e. A-Z, then you must remove the Kleene star, as it will match before and after every single character in any input string (zero-length match). The statement !s[/[^A-Z]/] tells you if there's no match of non-A-to-Z characters:
irb(main):001:0> def normalized? s
irb(main):002:1> return !s[/[^A-Z]/]
irb(main):003:1> end
=> nil
irb(main):004:0> normalized? "ABC"
=> true
irb(main):005:0> normalized? "AbC"
=> false
irb(main):006:0> normalized? ""
=> true
irb(main):007:0> normalized? "abc"
=> false
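As the session shows, the empty string counts as normalized with this approach. If that is not wanted, one possible variant adds an explicit emptiness check; the method name strictly_upper? is hypothetical:

```ruby
# Hypothetical variant: same test as above, but the empty string is rejected.
def strictly_upper?(s)
  !s.empty? && !s[/[^A-Z]/]
end

puts strictly_upper?("ABC")   # true
puts strictly_upper?("AbC")   # false
puts strictly_upper?("")      # false
```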
A regular expression that matches a string consisting only of capitals, with at least one character:
def onlyupper(s)
(s =~ /^[A-Z]+$/) != nil
end
Truth table:
/[^A-Z]*/:
Testing 'asdf' matched 'asdf' length 4
Testing 'HHH' matched '' length 0
Testing '' matched '' length 0
Testing '-=AAA' matched '-=' length 2
--------
/[^A-Z]+/:
Testing 'asdf' matched 'asdf' length 4
Testing 'HHH' matched nil
Testing '' matched nil
Testing '-=AAA' matched '-=' length 2
--------
/^[^A-Z]*$/:
Testing 'asdf' matched 'asdf' length 4
Testing 'HHH' matched nil
Testing '' matched '' length 0
Testing '-=AAA' matched nil
--------
/^[^A-Z]+$/:
Testing 'asdf' matched 'asdf' length 4
Testing 'HHH' matched nil
Testing '' matched nil
Testing '-=AAA' matched nil
--------
/^[A-Z]*$/:
Testing 'asdf' matched nil
Testing 'HHH' matched 'HHH' length 3
Testing '' matched '' length 0
Testing '-=AAA' matched nil
--------
/^[A-Z]+$/:
Testing 'asdf' matched nil
Testing 'HHH' matched 'HHH' length 3
Testing '' matched nil
Testing '-=AAA' matched nil
--------
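The script that produced the table was not posted. A minimal sketch that should print output in the same format is below; the helper name describe_match is an assumption made for this example.

```ruby
patterns = [/[^A-Z]*/, /[^A-Z]+/, /^[^A-Z]*$/, /^[^A-Z]+$/, /^[A-Z]*$/, /^[A-Z]+$/]
inputs   = ["asdf", "HHH", "", "-=AAA"]

# Hypothetical helper: formats one row of the table for a pattern/input pair.
def describe_match(pattern, input)
  md = input.match(pattern)
  if md
    "Testing '#{input}' matched '#{md[0]}' length #{md[0].size}"
  else
    "Testing '#{input}' matched nil"
  end
end

patterns.each do |pattern|
  puts "#{pattern.inspect}:"
  inputs.each { |input| puts describe_match(pattern, input) }
  puts "--------"
end
```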
This question needs a clearer answer. tchrist only left a comment; I wish he had posted an answer. The "Regex for matching capitals" is:
/\p{Uppercase}/
As tchrist mentions, this "is distinct from the general category \p{Uppercase_Letter} aka \p{Lu}. That’s because there exist non-Letters that count as Uppercase".