JavaScript regex match characters inside quotes and not in character set
I have a string I would like to split using #, ., [], or {}
characters, as in CSS. The desired functionality is:
- Input:
"div#foo[bar='value'].baz{text}"
- Output:
["div", "#foo", "[bar='value'", ".baz", "{text"]
This is easy enough, with this RegEx:
input.match(/([#.\[{]|^.*?)[^#.\[{\]}]*/g)
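However, a splitter character inside quotes should not split the string. For example, in
"div[bar='value.baz']"
the . inside the quotes should be ignored.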
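To make the problem concrete, here is a small sketch that runs the regex above on both inputs. The helper name splitNaive is only for illustration and is not part of the original code.

```javascript
// Hypothetical wrapper around the regex from the question.
function splitNaive(input) {
  return input.match(/([#.\[{]|^.*?)[^#.\[{\]}]*/g) || [];
}

// Works for the simple case:
console.log(splitNaive("div#foo[bar='value'].baz{text}"));
// [ 'div', '#foo', "[bar='value'", '.baz', '{text' ]

// Breaks when a splitter character sits inside quotes:
console.log(splitNaive("div[bar='value.baz']"));
// [ 'div', "[bar='value", ".baz'" ]
```

The quoted value is cut in two at the dot, which is the behavior the question wants to avoid.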
How can I make the second part of my RegEx (the [^#.\[{\]}]*
portion) capture not only the negated character set, but also any character within quotes? In other words, how can I fold the RegEx (\"|').+?\1
into my current one?
Edit:
I've figured out a regex that works decently, but it can't handle escaped quotes inside quotes (for example: "stuff here \" quote "
). If someone knows how to do that, it would be extremely helpful:
str.match(/([#.\[{]|^.*?)((['"]).*?\3|[^.#\[\]{\}])*/g);
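One possible way to cover the escaped-quote case is to replace the lazy .*? between quotes with a loop that consumes either a non-quote, non-backslash character or a backslash followed by any character. The sketch below is an assumption about what is wanted, not a tested drop-in; the names PART and splitSelector are made up for the example.

```javascript
// Each part starts at a splitter (or at the start of the string) and then
// consumes quoted strings whole, honoring backslash escapes, or any
// character that is neither a splitter, a closing bracket, nor a quote.
const PART = /(?:[#.\[{]|^)(?:"(?:[^"\\]|\\[\s\S])*"|'(?:[^'\\]|\\[\s\S])*'|[^#.\[\]{}"'])*/g;

// Hypothetical helper; drops the empty match that can occur at index 0.
function splitSelector(input) {
  return (input.match(PART) || []).filter(Boolean);
}

console.log(splitSelector("div#foo[bar='value.baz'].baz{text}"));
// [ 'div', '#foo', "[bar='value.baz'", '.baz', '{text' ]

// String.raw keeps the backslashes literal: a[title="say \"hi\".x"].b
console.log(splitSelector(String.raw`a[title="say \"hi\".x"].b`));
// three parts: a, [title="say \"hi\".x", .b
```

The splitter alternative is listed before ^ so that an input starting with # or . still keeps that character in its first part.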
var str = "div#foo[bar='value.baz'].baz{text}";
str.match(/(^|[\.#[\]{}])(([^'\.#[\]{}]+)('[^']*')?)+/g)
// [ 'div', '#foo', '[bar=\'value.baz\'', '.baz', '{text' ]
var tokens = myCssString.match(/\/\*[\s\S]*?\*\/|"(?:[^"\\]|\\[\s\S])*"|'(?:[^'\\]|\\[\s\S])*'|[\{\}:;\(\)\[\]./#]|\s+|[^\s\{\}:;\(\)\[\]./'"#]+/g);
Given your string, it produces
div
#
foo
[
bar=
'value.foo'
]
.
baz
{
text
}
The RegExp above is loosely based on the CSS 2.1 lexical grammar.
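If the grouped form from the question is needed rather than a flat token list, the tokens can be glued back together in a second pass. This is only a sketch: the names TOKEN, OPENERS and groupTokens are invented here, and the double-quoted branch of the regex is written to mirror the single-quoted one, which is an assumption about the intent.

```javascript
// Token regex: comments, double- and single-quoted strings (with escapes),
// single punctuation characters, whitespace runs, and everything else.
const TOKEN = /\/\*[\s\S]*?\*\/|"(?:[^"\\]|\\[\s\S])*"|'(?:[^'\\]|\\[\s\S])*'|[\{\}:;\(\)\[\]./#]|\s+|[^\s\{\}:;\(\)\[\]./'"#]+/g;

const OPENERS = new Set(["#", ".", "[", "{"]);

// Hypothetical helper: start a new part at each opener, append every other
// token to the current part, and drop the closing ] and }.
function groupTokens(input) {
  const parts = [];
  for (const tok of input.match(TOKEN) || []) {
    if (tok === "]" || tok === "}") continue;
    if (OPENERS.has(tok) || parts.length === 0) {
      parts.push(tok);
    } else {
      parts[parts.length - 1] += tok;
    }
  }
  return parts;
}

console.log(groupTokens("div#foo[bar='value.foo'].baz{text}"));
// [ 'div', '#foo', "[bar='value.foo'", '.baz', '{text' ]
```

Because quoted strings come out of the tokenizer as single tokens, a dot inside quotes never reaches the opener check.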
First, and I can't stress this enough: you shouldn't use regexps to parse CSS. Use a real parser, for instance http://glazman.org/JSCSSP/ or similar. Many have been built, so there is no need to reinvent the wheel.
That said, to solve your current problem, do this:
var str = "div#foo[bar='value.foo'].baz{text}";
str.match(/([#.\[{]|^.*?)(?:[^#\[{\]}]*|\.*)/g);
//["div", "#foo", "[bar='value.foo'", ".baz", "{text"]