Multiline matching with Ruby

I have a string variable with multiple lines: e.g.

"SClone VARPB63A\nSeq_vec SVEC 1 65 pCR2.1-topo\nSequencing_vector \"pCR2.1-topo\"\nSeq_vec SVEC 102 1710 pCR2.1-topo\nClipping QUAL 46 397\n"

I want to get both of the lines that start with "Seq_vec SVEC" and extract the integer values from the part that matches...

string = "Clone VARPB63A\nSeq_vec SVEC 1 65 pCR2.1-topo\nSequencing_vector \"pCR2.1-topo\"\nSeq_vec SVEC 102 1710 pCR2.1-topo\nClipping QUAL 46 397\n"

seqvector = Regexp.new("Seq_vec\\s+SVEC\\s+(\\d+\\s+\\d+)", Regexp::MULTILINE)
vector = string.match(seqvector)
if vector
  vector_start, vector_stop = vector[1].split(/ /)
  puts vector_start.to_i
  puts vector_stop.to_i
end

However, this only grabs the first match's values and not the second, as I would like. Any ideas what I could be doing wrong? Thank you.


To capture the groups from every match, use String#scan:

vector = string.scan(seqvector)
=> [["1 65"], ["102 1710"]]


String#match finds just the first match. To find all matches, use String#scan, e.g.

string.scan(seqvector)
=> [["1 65"], ["102 1710"]]

or to do something with each match:

string.scan(seqvector) do |match|
  # match[0] will be the substring captured by your first regexp grouping
  puts match.inspect
end
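
As a rough sketch of how the question's goal (integer start and stop values for every "Seq_vec SVEC" line) could be reached with scan: the pattern below is a variation on the original, assuming two separate capture groups are acceptable, so that each match yields a [start, stop] pair. The name ranges is introduced here only for illustration. Note that in Ruby ^ already matches at the start of every line, and Regexp::MULTILINE only changes whether . matches a newline, so the flag is not needed here.

```ruby
string = "Clone VARPB63A\nSeq_vec SVEC 1 65 pCR2.1-topo\nSequencing_vector \"pCR2.1-topo\"\nSeq_vec SVEC 102 1710 pCR2.1-topo\nClipping QUAL 46 397\n"

# Two capture groups: scan returns one [start, stop] pair of strings per match.
# ^ anchors at the beginning of each line, so only lines starting with Seq_vec match.
seqvector = /^Seq_vec\s+SVEC\s+(\d+)\s+(\d+)/

# Convert every captured pair to integers (ranges is an illustrative name).
ranges = string.scan(seqvector).map { |start, stop| [start.to_i, stop.to_i] }

ranges.each do |vector_start, vector_stop|
  puts "#{vector_start} #{vector_stop}"
end
```

With the sample string this should give one pair per "Seq_vec SVEC" line, so the if/match block from the question can be replaced by the loop.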


Just to make this a bit easier to handle, I would split the whole string into an array of lines first, select the lines I need, and then scan those:

string = "SClone VARPB63A\nSeq_vec SVEC 1 65 pCR2.1-topo\nSequencing_vector \"pCR2.1-topo\"\nSeq_vec SVEC 102 1710 pCR2.1-topo\nClipping QUAL 46 397\n"

selected_strings = string.split("\n").select{|x| /Seq_vec SVEC/.match(x)}


selected_strings.collect{|x| x.scan(/\s\d+/)}.flatten # => [" 1", " 65", " 102", " 1710"]
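
The result above is a flat array of strings that still carry a leading space. If integer pairs per line are wanted instead, a minimal sketch along the same split-first idea could look like this, assuming the start and stop values are always the third and fourth whitespace-separated fields of a "Seq_vec SVEC" line (pairs is an illustrative name, not part of the original answer):

```ruby
string = "SClone VARPB63A\nSeq_vec SVEC 1 65 pCR2.1-topo\nSequencing_vector \"pCR2.1-topo\"\nSeq_vec SVEC 102 1710 pCR2.1-topo\nClipping QUAL 46 397\n"

pairs = string.each_line
              .select { |line| line.start_with?("Seq_vec SVEC") } # keep only the wanted lines
              .map { |line| line.split[2, 2].map(&:to_i) }         # fields 3 and 4 as integers

pairs.each { |vector_start, vector_stop| puts "#{vector_start} #{vector_stop}" }
```

Keeping the values grouped per line avoids having to re-pair a flattened array afterwards.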