how to use one line regular expression to get matched content

2023-01-04 12:10 Q&A author:

I'm new to Ruby, and I want to know whether I can do this job in just one line.

Take this site's search as an example. When a user types [ruby] regex, I can use the following code to get the tag and the keyword:

'[ruby] regex' =~ /\[(.*?)\](.*)/
tag, keyword = $1, $2

Can we write it just in one line?

UPDATE

Thank you so much! May I make it harder and more interesting? The input may contain more than one tag, like:

[ruby] [regex] [rails] one line

Is it possible to use one line of code to get the tags array and the keyword? I tried, but failed.

You need the Regexp#match method. If you write /\[(.*?)\](.*)/.match('[ruby] regex'), this will return a MatchData object. If we call that object matches, then, among other things:

matches[0] returns the whole matched string.
matches[n] returns the nth capturing group ($n).
matches.to_a returns an array consisting of matches[0] through matches[N].
matches.captures returns an array consisting of just the capturing groups (matches[1] through matches[N]).
matches.pre_match returns everything before the matched string.
matches.post_match returns everything after the matched string.
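
As a quick illustration of the accessors listed above, here is a minimal sketch. The input string 'see [ruby] regex' and the local variable names are made up for this example so that pre_match has something to return.

```ruby
# Explore a MatchData object (the sample input is an assumption for illustration).
matches = /\[(.*?)\](.*)/.match('see [ruby] regex')

whole    = matches[0]          # the whole matched string
tag      = matches[1]          # first capturing group, same value as $1
all      = matches.to_a        # whole match followed by every group
captures = matches.captures    # only the capturing groups
before   = matches.pre_match   # text before the match
after    = matches.post_match  # text after the match

puts whole             # [ruby] regex
puts captures.inspect  # ["ruby", " regex"]
puts before.inspect    # "see "
puts after.inspect     # ""
```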

There are more methods, which correspond to other special variables, etc.; you can check MatchData's docs for more. Thus, in this specific case, all you need to write is

tag, keyword = /\[(.*?)\](.*)/.match('[ruby] regex').captures
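
One caveat with the one-liner: Regexp#match returns nil when the input does not match, and calling captures on nil raises NoMethodError. A possible guard, assuming Ruby 2.3 or later for the safe-navigation operator, might look like this (the variable names are only for illustration):

```ruby
pattern = /\[(.*?)\](.*)/

# Matching input: both variables are filled from the capture groups.
tag, keyword = pattern.match('[ruby] regex')&.captures

# Non-matching input: &. short-circuits to nil, and destructuring nil
# leaves both variables nil instead of raising.
no_tag, no_keyword = pattern.match('no tag here')&.captures

puts tag.inspect      # "ruby"
puts keyword.inspect  # " regex" (the leading space is part of the capture)
puts no_tag.inspect   # nil
```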

Edit 1: Alright, for your harder task, you're going to instead want the String#scan method, which @Theo used; however, we're going to use a different regex. The following code should work:

# You could inline the regex, but comments would probably be nice.
tag_and_text = / \[([^\]]*)\] # Match a bracket-delimited tag,
                 \s*          # ignore spaces,
                 ([^\[]*) /x  # and match non-tag search text.
input        = '[ruby] [regex] [rails] one line [foo] [bar] baz'
tags, texts  = input.scan(tag_and_text).transpose

The input.scan(tag_and_text) will return a list of tag–search-text pairs:

[ ["ruby", ""], ["regex", ""], ["rails", "one line "]
, ["foo", ""], ["bar", "baz"] ]

The transpose call flips that, so that you have a pair consisting of a tag list and a search-text list:

[["ruby", "regex", "rails", "foo", "bar"], ["", "", "one line ", "", "baz"]]

You can then do whatever you want with the results. I might suggest, for instance

search_str = texts.join(' ').strip.gsub(/\s+/, ' ')

This will concatenate the search snippets with single spaces, get rid of leading and trailing whitespace, and replace runs of multiple spaces with a single space.
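
Putting the pieces above together, a rough sketch of a helper could look like the following. The method name parse_search and the empty-result fallback are assumptions added for this example; they are not part of the answer above. Note that text appearing before the first tag is not captured by this regex.

```ruby
# Hypothetical helper: returns [tags, search_string] for a search input.
def parse_search(input)
  tag_and_text = /\[([^\]]*)\]\s*([^\[]*)/
  pairs = input.scan(tag_and_text)
  # With no tags, scan returns [] and transpose would give nothing to
  # destructure, so fall back to empty results (an assumption of this sketch).
  return [[], ''] if pairs.empty?

  tags, texts = pairs.transpose
  [tags, texts.join(' ').strip.gsub(/\s+/, ' ')]
end

tags, search_str = parse_search('[ruby] [regex] [rails] one line [foo] [bar] baz')
puts tags.inspect        # ["ruby", "regex", "rails", "foo", "bar"]
puts search_str.inspect  # "one line baz"
```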

'[ruby] regex'.scan(/\[(.*?)\](.*)/)

will return

[["ruby", " regex"]]

You can read more about String#scan here: http://ruby-doc.org/core/classes/String.html#M000812. In short, it returns an array of all successive matches; in this case the outer array is the array of matches, and each inner array holds the capture groups of one match.

To do the assignment, you can rewrite it like this (assuming you will only ever have one match in the string):

tag, keyword = '[ruby] regex'.scan(/\[(.*?)\](.*)/).flatten

Depending on exactly what you want to accomplish, you may want to change the regex to

/^\s*\[(.*?)\]\s*(.+)\s*$/

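This matches a whole line of input and trims leading whitespace from the second capture group (trailing whitespace is still captured, because the greedy .+ takes it before the final \s* can). Anchoring the pattern to the start and end makes it a bit more efficient and avoids false or duplicate matches in some cases, though that very much depends on the input. For single-line input it also guarantees that you can safely use the returned array in assignment, because there will never be more than one match. Note that in Ruby ^ and $ anchor to lines, so for multi-line input use \A and \z to anchor to the whole string.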
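
To see how the anchored pattern behaves, here is a small sketch. The sample inputs are made up. It also contrasts ^ and $, which Ruby treats as line anchors, with \A and \z, which anchor to the whole string.

```ruby
anchored = /^\s*\[(.*?)\]\s*(.+)\s*$/

# Leading spaces are consumed by \s*, but the greedy (.+) keeps trailing ones.
tag, keyword = '  [ruby]   regex  '.scan(anchored).flatten
puts tag.inspect      # "ruby"
puts keyword.inspect  # "regex  "

# With two lines, ^ and $ match once per line, so scan finds two matches.
two_lines = "[ruby] regex\n[rails] routes"
line_matches = two_lines.scan(anchored)

# \A and \z allow at most one match per string; here the newline prevents any.
string_matches = two_lines.scan(/\A\s*\[(.*?)\]\s*(.+)\s*\z/)

puts line_matches.inspect    # [["ruby", "regex"], ["rails", "routes"]]
puts string_matches.inspect  # []
```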

As for the follow-up question, this is what I would do:

def tags_and_keyword(input)
  input.scan(/^\s*\[(.+)\]\s+(.+)\s*$/) do |match|
    tags = match[0].split(/\]\s*\[/)
    line = match[1]
    return tags, line
  end
end

tags, keyword = tags_and_keyword('[ruby] [regex] [rails] one line')
tags # => ["ruby", "regex", "rails"]
keyword # => "one line"

It can be rewritten in one line, but I wouldn't:

tags, keyword = catch(:match) { input.scan(/^\s*\[(.+)\]\s+(.+)\s*$/) { |match| throw :match, [match[0].split(/\]\s*\[/), match[1]] } }

My solution assumes all tags come before the keyword, and that there's only one tags/keyword expression in each input. The first capture globs all tags, but then I split that string, so it's a two-step process (which, as @Tim wrote in his comment, is required unless you have an engine capable of recursive matching).
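
For comparison, a different two-pass technique (not the approach described above) also fits on one line: one scan collects every bracketed tag, and one gsub strips the tags to leave the keyword. This is only a sketch. Unlike the solution above, it does not require the tags to come first, and it would also pick up tags that appear in the middle of the keyword text.

```ruby
input = '[ruby] [regex] [rails] one line'

# Right-hand side is two expressions: the tag list and the tag-free text.
tags, keyword = input.scan(/\[(.*?)\]/).flatten, input.gsub(/\[.*?\]/, '').strip

puts tags.inspect     # ["ruby", "regex", "rails"]
puts keyword.inspect  # "one line"
```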

Put this into your ApplicationHelper, or wherever else you need it:

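def element_id_for(f, element)
  # [^"]* and [^']* stop at the first closing quote; a greedy .* would run on
  # to the last quote in the tag and swallow the attributes that follow.
  matcher   = /id=("[^"]*"|'[^']*')/
  el_string = f.hidden_field(element.to_sym)
  # Group 1 is the quoted value; [1..-2] drops the surrounding quotes.
  id_string = matcher.match(el_string)[1][1..-2]
  return    id_string
end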

Finally, you could use this method like this:

form_for :test_form do |f|
  my_id = element_id_for(f, :start_date)
  # => "test_form_start_date"
end
