Ruby regex: scan all
I have a string:
TFS[MAD,GRO,BCN],ALC[GRO,PMI,ZAZ,MAD,BCN],BCN[ALC,...]...
I want to convert it into a list:
list = (
  [0] => "TFS"
    [0] => "MAD"
    [1] => "GRO"
    [2] => "BCN"
  [1] => "ALC"
    [0] => "GRO"
    [1] => "PMI"
    [2] => "ZAZ"
    [3] => "MAD"
    [4] => "BCN"
  [2] => "BCN"
    [1] => "ALC"
    [2] => ...
  [3] => ...
)
How do I do this in Ruby?
I tried:
(([A-Z]{3})\[([A-Z]{3},+))
But it captures only the first element inside the brackets, and it doesn't make the comma optional (the last element is followed by "]", not a comma).
You need to tell the regex that the , is not required after each element, but instead in front of each element except the first. This leads to the following regex:
str="TFS[MAD,GRO,BCN],ALC[GRO,PMI,ZAZ,MAD,BCN],BCN[ALC]"
str.scan(/[A-Z]{3}\[[A-Z]{3}(?:,[A-Z]{3})*\]/)
#=> ["TFS[MAD,GRO,BCN]", "ALC[GRO,PMI,ZAZ,MAD,BCN]", "BCN[ALC]"]
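For comparison, here is a small sketch of what the pattern from the question does with the same input (the variable names are only illustrative). Because ,+ demands at least one comma after the first code and nothing repeats the inner part, each match stops after the first element, and a single-element group such as BCN[ALC] does not match at all:

```ruby
str = "TFS[MAD,GRO,BCN],ALC[GRO,PMI,ZAZ,MAD,BCN],BCN[ALC]"

# Pattern from the question: `,+` requires one or more commas after the
# first three-letter code, and the inner part is never repeated.
attempt = /(([A-Z]{3})\[([A-Z]{3},+))/

result = str.scan(attempt)
result.each { |whole, name, first| puts "#{name}: #{first} (matched #{whole})" }
# TFS: MAD, (matched TFS[MAD,)
# ALC: GRO, (matched ALC[GRO,)
```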
You can also use scan's behavior with capturing groups to split each match into the part before the brackets and the part inside the brackets:
str.scan(/([A-Z]{3})\[([A-Z]{3}(?:,[A-Z]{3})*)\]/)
#=> [["TFS", "MAD,GRO,BCN"], ["ALC", "GRO,PMI,ZAZ,MAD,BCN"], ["BCN", "ALC"]]
You can then use map to split each part inside the brackets into multiple tokens:
str.scan(/([A-Z]{3})\[([A-Z]{3}(?:,[A-Z]{3})*)\]/).map do |x, y|
  [x, y.split(",")]
end
#=> [["TFS", ["MAD", "GRO", "BCN"]],
# ["ALC", ["GRO", "PMI", "ZAZ", "MAD", "BCN"]],
# ["BCN", ["ALC"]]]
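One thing to keep in mind is that scan silently skips any part of the string that does not match. If malformed input should be rejected instead, a possible sketch is to check the whole string against an anchored pattern first. The helper name parse_groups is hypothetical, and the strictness rule (only three uppercase letters, groups separated by single commas) is an assumption based on the sample data:

```ruby
# parse_groups is a hypothetical helper name, not part of Ruby.
def parse_groups(str)
  group = /[A-Z]{3}\[[A-Z]{3}(?:,[A-Z]{3})*\]/
  # The whole string must be groups separated by single commas, nothing else.
  whole = /\A#{group}(?:,#{group})*\z/
  raise ArgumentError, "malformed input: #{str.inspect}" unless str =~ whole

  str.scan(/([A-Z]{3})\[([A-Z]{3}(?:,[A-Z]{3})*)\]/).map do |name, inner|
    [name, inner.split(",")]
  end
end

parsed = parse_groups("TFS[MAD,GRO,BCN],BCN[ALC]")

# Plain scan drops the lowercase group without any error:
skipped = "TFS[MAD],xx[GRO],ALC[PMI]".scan(/[A-Z]{3}\[[A-Z]{3}(?:,[A-Z]{3})*\]/)
```

With this check, parse_groups("TFS[MAD],xx[GRO],ALC[PMI]") raises ArgumentError, whereas the plain scan above returns only the two well-formed groups.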
Here's another way using a hash to store your contents, and less regex.
string = "TFS[MAD,GRO,BCN],ALC[GRO,PMI,ZAZ,MAD,BCN],BCN[ALC]"
z = {}
string.split(/\][ \t]*,/).each do |x|
  name, rest = x.split("[")
  # The last group is not followed by a comma, so it still ends in "]".
  z[name] = rest.chomp("]").split(",")
end
z.each_pair { |x, y| print "#{x}:#{y}\n" }
Output:
$ ruby test.rb
TFS:["MAD", "GRO", "BCN"]
ALC:["GRO", "PMI", "ZAZ", "MAD", "BCN"]
BCN:["ALC"]
First, split the string into groups:
groups = s.scan(/[^,][^\[]*\[[^\[]*\]/)
# => ["TFS[MAD,GRO,BCN]", "ALC[GRO,PMI,ZAZ,MAD,BCN]"]
Now you have the groups, the rest is pretty straightforward:
groups.map {|x| [x[0..2], x[4..-2].split(',')] }
# => [["TFS", ["MAD", "GRO", "BCN"]], ["ALC", ["GRO", "PMI", "ZAZ", "MAD", "BCN"]]]
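Note that the fixed offsets x[0..2] and x[4..-2] assume the code before the bracket is always exactly three characters long. A minimal sketch that avoids that assumption by cutting each group at its first "[" instead (the sample string and the variable names are assumptions for illustration):

```ruby
s = "TFS[MAD,GRO,BCN],ALC[GRO,PMI,ZAZ,MAD,BCN]"
groups = s.scan(/[^,][^\[]*\[[^\[]*\]/)

pairs = groups.map do |g|
  name, rest = g.split("[", 2)        # e.g. "TFS" and "MAD,GRO,BCN]"
  [name, rest.chomp("]").split(",")]  # drop the closing bracket, then split
end

pairs.each { |name, items| puts "#{name}: #{items.join(' ')}" }
# TFS: MAD GRO BCN
# ALC: GRO PMI ZAZ MAD BCN
```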
If I understood correctly, you want an array like this:
yourexamplestring.scan(/([A-Z]{3})\[([^\]]+)/).map { |a, b| [a, b.split(',')] }
# => [["TFS", ["MAD", "GRO", "BCN"]], ["ALC", ["GRO", "PMI", "ZAZ", "MAD", "BCN"]], ["BCN", ["ALC", "..."]]]