Regex for matching capitals

2023-02-20 01:20

def normalized?
  matches = match(/[^A-Z]*/)
  return matches.size == 0
end

This is my function operating on a string, checking whether the string contains only uppercase letters. It works fine for ruling out non-matches, but when I call it on a string like "ABC" it says no match, because apparently matches.size is 1 and not zero. There seems to be an empty element in it or something.

Can anybody explain why?

Your regex is wrong - if you want it to match ONLY uppercase strings, use /^[A-Z]+$/.

Your regular expression is incorrect. /[^A-Z]*/ means "match zero or more characters that are not between A and Z, anywhere in the string". The string ABC has zero characters that are not between A and Z, so it matches the regular expression.
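
To see what the original pattern does with "ABC", here is a minimal sketch (the variable names are illustrative and not from the question). The match succeeds with an empty string at index 0, so the result is never nil for this pattern:

```ruby
# /[^A-Z]*/ may match zero characters, so it succeeds on any string.
m = "ABC".match(/[^A-Z]*/)

empty_match  = m[0]        # the matched text, here an empty string
match_offset = m.begin(0)  # index where the empty match was found
match_is_nil = m.nil?      # false, because the match succeeded
```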

Change your regular expression to /^[^A-Z]+$/. This means "match one or more characters that are not between A and Z, and make sure every character between the beginning and end of the string is not between A and Z". Then the string ABC will not match, so match returns nil; when there is a match, you can check matches[0].size or whatever, as per sepp2k's answer.

MatchData#size returns the number of capturing groups in the regex plus one, so that md[i] will access a valid group iff i < md.size. So the value returned by size only depends on the regex, not the matched string, and will never be 0.

You want matches.to_s.size or matches[0].size.
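
A small sketch of the point about MatchData#size, assuming nothing beyond core Ruby (the variable names are made up for illustration). The size tracks the number of capture groups in the pattern, while the length of the matched text has to be read from the match itself:

```ruby
# MatchData#size counts the whole match plus each capture group,
# regardless of what text was matched.
no_groups  = "ABC".match(/[^A-Z]*/)         # pattern with zero capture groups
two_groups = "ABC".match(/([A-Z])([A-Z])/)  # pattern with two capture groups

size_without_groups = no_groups.size     # whole match only
size_with_groups    = two_groups.size    # whole match plus two groups
matched_length      = no_groups[0].size  # length of the matched text itself
```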

ruby-1.9.2-p180>   def normalized? s
ruby-1.9.2-p180?>    s.match(/^[[:upper:]]+$/) ? true : false
ruby-1.9.2-p180?>  end
 => nil 
ruby-1.9.2-p180>  normalized? "asdf"
 => false 
ruby-1.9.2-p180>  normalized? "ASDF"
 => true

The * in your regular expression means that it matches any number of non-uppercase characters, including zero, so it matches every string (possibly with an empty match). The fix is to remove the *; then it will fail to match a string containing only uppercase characters. (Although you would need a different test if zero-length strings are not permitted.)

If you want to know that the input string entirely consists of English uppercase letters, i.e. A-Z, then you must remove the Kleene Star as it will match before and after every single character in any input string (zero length match). The statement !s[/[^A-Z]/] tells you if there's no match of non-A-to-Z characters:

irb(main):001:0> def normalized? s
irb(main):002:1>     return !s[/[^A-Z]/]
irb(main):003:1> end
=> nil
irb(main):004:0> normalized? "ABC"
=> true
irb(main):005:0> normalized? "AbC"
=> false
irb(main):006:0> normalized? ""
=> true
irb(main):007:0> normalized? "abc"
=> false

A regular expression that matches a string made up of capitals only, with at least one of them:

def onlyupper(s)
  (s =~ /^[A-Z]+$/) != nil
end
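
One caveat worth sketching: in Ruby, ^ and $ anchor to line boundaries, so a multi-line string can pass the check above if any single line is all capitals. The helper below, only_upper_strict?, is a hypothetical variant (not from the answers) that uses \A and \z to anchor to the whole string instead:

```ruby
# Hypothetical variant of onlyupper that anchors to the whole string.
def only_upper_strict?(s)
  !(s =~ /\A[A-Z]+\z/).nil?
end

multi_line = "ABC\nabc"

# ^ and $ match around the first line, so this check passes.
line_anchored = !(multi_line =~ /^[A-Z]+$/).nil?

# \A and \z require the entire string to be capitals, so this one fails.
string_anchored = only_upper_strict?(multi_line)
```

If the input is known to be a single line, the two forms behave the same.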

Truth table:

/[^A-Z]*/:
 Testing  'asdf'     matched  'asdf'     length  4
 Testing  'HHH'      matched  ''         length  0
 Testing  ''         matched  ''         length  0
 Testing  '-=AAA'    matched  '-='       length  2
--------
/[^A-Z]+/:
 Testing  'asdf'     matched  'asdf'     length  4
 Testing  'HHH'      matched  nil
 Testing  ''         matched  nil
 Testing  '-=AAA'    matched  '-='       length  2
--------
/^[^A-Z]*$/:
 Testing  'asdf'     matched  'asdf'     length  4
 Testing  'HHH'      matched  nil
 Testing  ''         matched  ''         length  0
 Testing  '-=AAA'    matched  nil
--------
/^[^A-Z]+$/:
 Testing  'asdf'     matched  'asdf'     length  4
 Testing  'HHH'      matched  nil
 Testing  ''         matched  nil
 Testing  '-=AAA'    matched  nil
--------
/^[A-Z]*$/:
 Testing  'asdf'     matched  nil
 Testing  'HHH'      matched  'HHH'      length  3
 Testing  ''         matched  ''         length  0
 Testing  '-=AAA'    matched  nil
--------
/^[A-Z]+$/:
 Testing  'asdf'     matched  nil
 Testing  'HHH'      matched  'HHH'      length  3
 Testing  ''         matched  nil
 Testing  '-=AAA'    matched  nil
--------
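
A table like the one above can be regenerated with a short script. This is a sketch rather than the script the answerer used; matched_text and format_row are hypothetical helper names, and the spacing of the output is simplified:

```ruby
# Returns the matched text for one pattern and input, or nil on no match.
def matched_text(pattern, input)
  m = input.match(pattern)
  m && m[0]
end

# Formats one row in roughly the same style as the table.
def format_row(input, text)
  return "Testing '#{input}' matched nil" if text.nil?
  "Testing '#{input}' matched '#{text}' length #{text.size}"
end

# Patterns and inputs taken from the table.
patterns = [/[^A-Z]*/, /[^A-Z]+/, /^[^A-Z]*$/, /^[^A-Z]+$/, /^[A-Z]*$/, /^[A-Z]+$/]
inputs   = ['asdf', 'HHH', '', '-=AAA']

patterns.each do |pattern|
  puts "#{pattern.inspect}:"
  inputs.each { |input| puts " " + format_row(input, matched_text(pattern, input)) }
  puts "--------"
end
```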

This question needs a clearer answer. tchrist pointed this out in a comment, and I wish he had posted it as an answer. The "Regex for matching capitals" is:

/\p{Uppercase}/

As tchrist mentions, this "is distinct from the general category \p{Uppercase_Letter} aka \p{Lu}. That’s because there exist non-Letters that count as Uppercase".
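
A sketch of that distinction, assuming a Ruby version whose regex engine accepts the \p{Uppercase} property name and that provides String#match? (Ruby 2.4 or later). The example character, U+2167 ROMAN NUMERAL EIGHT, is my choice for illustration and not taken from the comment:

```ruby
# U+2167 ROMAN NUMERAL EIGHT is a number (general category Nl), not a letter,
# yet it carries the Unicode Uppercase property.
roman_eight = "\u2167"

has_uppercase_property = roman_eight.match?(/\p{Uppercase}/)  # property-based test
is_uppercase_letter    = roman_eight.match?(/\p{Lu}/)         # general-category test

# An ordinary capital satisfies both tests.
plain_a_both = "A".match?(/\p{Uppercase}/) && "A".match?(/\p{Lu}/)
```

Note that an unanchored pattern like this only says that the string contains such a character somewhere; to require that every character qualifies, the same anchoring used in the other answers would still apply.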
