Ruby regex for finding comments?
I've been at this all day and I can't figure it out. I have some Ruby code in a string (below) and would like to match only the lines that have code on them, along with the first comment for that code, if one exists.
# Some ignored comment.
1 + 1 # Simple math (this comment would be collected) # ignored
# ignored
user = User.new
user.name = "Ryan" # Setting an attribute # Another ignored comment
And this would capture:
-
"1 + 1"
"Simple math"
-
"user = User.new"
nil
-
"user.name = "Ryan""
"Setting an attribute"
I'm using /^\x20*(.+)\x20*(#\x20*.+\x20*){1}$/
to match against each line, but it doesn't seem to work for all code.
Kobi's answer partially works, but does not match lines of code that lack a comment at the end.
It will also fail when it encounters string interpolation, e.g.:
str = "My name is #{first_name} #{last_name}" # first comment
...will be erroneously matched as: str = "My name is #{first_name}
You need a more comprehensive regex. Here's one idea:
/^[\t ]*([^#"'\r\n]("(\\"|[^"])*"|'(\\'|[^'])*'|[^#\n\r])*)(#([^#\r\n]*))?/
^[\t ]*
- Leading whitespace.
([^#"'\r\n]("(\\"|[^"])*"|'(\\'|[^'])*'|[^#\n\r])*)
- Matches a line of code.
Breakdown:
[^#"'\r\n]
- the first character in a line of code, and...
"(\\"|[^"])*"
- a double-quoted string, or...
'(\\'|[^'])*'
- a single-quoted string, or...
[^#\n\r]
- any other character outside a quoted string that is not a # or line ending.
(#([^#\r\n]*))?
- Matches first comment at the end of a line of code, if any.
Due to the more complex logic, this will capture 6 subpatterns for each match. Subpattern 1 is the code, subpattern 6 is the comment, and you can ignore the others.
Given the following block of code:
# Some ignored comment.
1 + 1 # Simple math (this comment would be collected) # ignored
# ignored
user = User.new
user.name = "Ryan #{last_name}" # Setting an attribute # Another ignored comment
The above regex would produce the following (I excluded subpatterns 2, 3, 4, 5 for brevity):
1. 1 + 1
6. Simple math (this comment would be collected)
1. user = User.new
6.
1. user.name = "Ryan #{last_name}"
6. Setting an attribute
Demo: http://rubular.com/r/yKxEazjNPC
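If you want to use this from Ruby rather than Rubular, here is a minimal sketch. The method name extract_code_and_comments and the sample variable are made up for illustration; the regex itself is the one above, unchanged. It assumes you process the source one line at a time, and it strips the captures because groups 1 and 6 keep the whitespace around the code and the comment.

```ruby
# Hypothetical helper: returns one [code, comment] pair per line of code.
# The comment is nil when the line has no trailing comment.
def extract_code_and_comments(source)
  # Regex from the answer above. Group 1 = code, group 6 = first comment.
  pattern = /^[\t ]*([^#"'\r\n]("(\\"|[^"])*"|'(\\'|[^'])*'|[^#\n\r])*)(#([^#\r\n]*))?/

  source.each_line.each_with_object([]) do |line, pairs|
    match = pattern.match(line)
    next unless match

    code = match[1].strip
    # An indented comment-only line can match with only whitespace as "code".
    next if code.empty?

    pairs << [code, match[6]&.strip]
  end
end

# Single-quoted heredoc so that #{last_name} is kept as literal text.
sample = <<~'RUBY'
  # Some ignored comment.
  1 + 1 # Simple math (this comment would be collected) # ignored
  # ignored
  user = User.new
  user.name = "Ryan #{last_name}" # Setting an attribute # Another ignored comment
RUBY

extract_code_and_comments(sample).each do |code, comment|
  puts "code:    #{code}"
  puts "comment: #{comment.inspect}"
end
```

This should yield three pairs for the sample, with nil as the comment for the user = User.new line. It is still a regex, not a parser, so constructs like heredocs or %q(...) literals containing a # are not handled.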
While the underlying problem is quite difficult, you can find what you need here using the pattern:
^[\t ]*[^\s#][^#\n\r]*#([^#\n\r]*)
Which reads:
[\t ]*
- leading spaces.
[^\s#]
- one actual character. This should match the code.
[^#\n\r]*
- Characters until the # sign. Anything besides hash or newlines.
#([^#\n\r]*)
- The "first" comment, captured in group 1.
Working example: http://rubular.com/r/wNJTMDV9Bw
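For reference, a rough sketch of how this pattern behaves with String#scan in Ruby. The method name first_comments and the sample variable are made up for illustration; the pattern is the one above, unchanged. The capture is stripped because group 1 keeps the spaces around the comment text.

```ruby
# Hypothetical helper: returns the first comment of every line that has
# both code and a comment.
def first_comments(source)
  pattern = /^[\t ]*[^\s#][^#\n\r]*#([^#\n\r]*)/
  # With one capture group, scan returns one single-element array per match.
  source.scan(pattern).flatten.map(&:strip)
end

sample = <<~'RUBY'
  # Some ignored comment.
  1 + 1 # Simple math (this comment would be collected) # ignored
  # ignored
  user = User.new
  user.name = "Ryan" # Setting an attribute # Another ignored comment
RUBY

p first_comments(sample)

# Two cases worth checking before relying on this pattern:
p first_comments("user = User.new\n") # a line of code with no comment
p first_comments('str = "My name is #{first_name} #{last_name}" # first comment')
```

The first call should return the two comments from the sample. A line without a comment does not match at all, and a # inside a string (such as interpolation) is treated as the start of the comment, so the pattern fits best when the input is known not to contain either case.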