Ruby Regex, Only One Capture (Very Simple!)
I guess this will be a silly mistake but for me, the following returns an array containing only "M". See this:
/(.)+?/.match("Many many characters!").captures
=> ["M"]
Why doesn't it return an array of every character? I must have missed something blatantly obvious, because I can't see what's wrong with this.
Edit: Just realised, I don't need the +? but it still doesn't work without it.
Edit: Apologies! I will clarify: my goal is to let users enter a regular expression, some styling and an input text file. Wherever there is a match, the text will be surrounded with an HTML element and the styling will be applied. I am not just splitting the string into characters; I only used the given regex because it was the simplest, although that was stupid on my part. How do I get capture groups from scan(), or is that not possible? I see that $1 contains "!" (the last match?) and not any of the others.
Edit: Gosh, it really isn't my day. As injekt has informed me, the captures are stored in separate arrays. How do I get the offset of these captures from the original string? I would like to be able to get the offset of a capture and then surround it with another string. Or is that what gsub is for? (I thought that only replaced the match, not a capture group.)
Hopefully final edit: Right, let me just start this again :P
So, I have a string. The user will use a configuration file to enter a regular expression, then a style associated with each capture group. I need to be able to scan the entire string and get the start and finish or offset and size of each group match.
So if a user had configured ([\w-\.]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4})
(email address) then I should be able to get:
[ ["elliotpotts", 0, 11],
["sample.", 12, 7],
["com", 19, 3] ]
from the string: "elliotpotts@sample.com"
If that is not clear, there is simply something wrong with me :P. Thanks a lot so far guys, and thank you for being so patient!
Because your capture is only matching one single character. (.)+
is not the same as (.+)
>> /(.)+?/.match("Many many characters!").captures
=> ["M"]
>> /(.+)?/.match("Many many characters!").captures
=> ["Many many characters!"]
>> /(.+?)/.match("Many many characters!").captures
=> ["M"]
If you want to match every character individually, use String#scan
or String#split
if you don't care about capture groups.
Using scan:
"Many many characters!".scan(/./)
#=> ["M", "a", "n", "y", " ", "m", "a", "n", "y", " ", "c", "h", "a", "r", "a", "c", "t", "e", "r", "s", "!"]
Note that the other answers use (.)
That's fine if you care about the capture group, but it's a little pointless if you don't, because it returns EVERY CHARACTER in its own separate Array, like this:
[["M"], ["a"], ["n"], ["y"], [" "], ["m"], ["a"], ["n"], ["y"], [" "], ["c"], ["h"], ["a"], ["r"], ["a"], ["c"], ["t"], ["e"], ["r"], ["s"], ["!"]]
Otherwise, just use split
with an empty separator: "Many many characters!".split('')
EDIT In reply to your edit:
reg = /([\w-\.]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4})/
str = "elliotpotts@sample.com"
str.scan(reg).flatten.map { |capture| [capture, str.index(capture), capture.size] }
#=> [["elliotpotts", 0, 11], ["sample.", 12, 7], ["com", 19, 3]]
Oh, and you don't need scan here. With the example you provided there is only one match, so there is nothing to traverse:
str.match(reg).captures.map { |capture| [capture, str.index(capture), capture.size] }
This will also work.
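One caveat with the snippets above: str.index(capture) returns the first place the captured text occurs, which is not necessarily where the group matched. A minimal sketch that reads the offsets from the MatchData instead is shown here. The helper name capture_spans is made up for illustration, and the hyphen in the character class is escaped only to avoid a Ruby warning.

```ruby
# Hypothetical helper (not part of Ruby): returns [text, start, length]
# for every capture group of every match of `regex` in `str`.
def capture_spans(str, regex)
  spans = []
  str.scan(regex) do
    m = Regexp.last_match          # MatchData for the current match
    (1...m.size).each do |i|       # group 0 is the whole match, so skip it
      next if m[i].nil?            # group did not take part in this match
      spans << [m[i], m.begin(i), m[i].size]
    end
  end
  spans
end

reg = /([\w\-.]+)@((?:\w+\.)+)([a-zA-Z]{2,4})/

p capture_spans("elliotpotts@sample.com", reg)
# => [["elliotpotts", 0, 11], ["sample.", 12, 7], ["com", 19, 3]]

# The captured text "com" occurs three times here; str.index("com")
# would report 0 for the last group, MatchData#begin reports 8.
p capture_spans("com@com.com", reg)
# => [["com", 0, 3], ["com.", 4, 4], ["com", 8, 3]]
```

Because the loop runs inside the scan block, it also covers inputs that contain more than one match.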
Yes, something important was missed ;-)
(...)
only introduces ONE capture group: the number of times the group matches is irrelevant as the index is determined only by the regular expression itself and not the input.
The key is a "global regular expression", which will apply the regular expression multiple times in order. In Ruby this is done by switching from Regexp#match
to String#scan
(many other languages have a "/g" regular expression modifier):
"Many many characters!".scan(/(.)+?/)
# but more simply (or see answers using String#split)
"Many many characters!".scan(/(.)/)
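To make the difference concrete, here is a small sketch on a shorter string (the variable names are only for illustration). It shows that a repeated group still yields a single capture, while scan reapplies the pattern and returns one set of captures per match.

```ruby
str = "abc"

# One group in the pattern means one capture, however often it repeats.
# The group keeps only the text of its last iteration.
greedy = /(.)+/.match(str)
whole_match      = greedy[0]        # "abc"
greedy_captures  = greedy.captures  # ["c"]

# The lazy version stops after a single iteration of the group.
lazy_captures = /(.)+?/.match(str).captures  # ["a"]

# scan applies the pattern again after each match,
# giving one captures array per match.
scanned = str.scan(/(.)/)  # [["a"], ["b"], ["c"]]

p whole_match, greedy_captures, lazy_captures, scanned
```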
Happy coding
It's only returning one character because that's all you've asked it to match. You probably want to use scan
instead:
str = "Many many characters!"
matches = str.scan(/(.)/)
The following code is from "Get index of string scan results in ruby", modified to my liking.
[].tap {|results|
"abab".scan(/a/) {|capture|
results.push(([capture, Regexp::last_match.offset(0)]).flatten)
}
}
=> [["a", 0], ["a", 2]]
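The same offsets can drive the wrapping described in the question. What follows is only a rough sketch under several assumptions: the helper wrap_captures and the class names are invented for illustration, the capture groups are assumed not to be nested, and no HTML escaping is done.

```ruby
# Hypothetical helper: wraps capture group i of every match in
# <span class="classes[i - 1]">...</span>. Assumes groups do not nest.
def wrap_captures(str, regex, classes)
  spans = []
  str.scan(regex) do
    m = Regexp.last_match
    (1...m.size).each do |i|
      next if m[i].nil?                        # unused optional group
      spans << [m.begin(i), m.end(i), classes[i - 1]]
    end
  end

  result = str.dup
  # Work from right to left so the earlier offsets stay valid
  # while the string grows.
  spans.sort_by { |start, _stop, _css| -start }.each do |start, stop, css|
    result[start...stop] = %(<span class="#{css}">#{str[start...stop]}</span>)
  end
  result
end

reg = /([\w\-.]+)@((?:\w+\.)+)([a-zA-Z]{2,4})/
puts wrap_captures("a@b.com", reg, %w[user host tld])
# prints: <span class="user">a</span>@<span class="host">b.</span><span class="tld">com</span>
```

gsub with a block is an alternative when the whole match is replaced in one go; working from offsets keeps the text between the groups untouched.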