JavaScript regex match characters inside quotes and not in character set

I have a string I would like to split using #, ., [], or {} characters, as in CSS. The desired functionality is:

- Input: "div#foo[bar='value'].baz{text}"

- Output: ["div", "#foo", "[bar='value'", ".baz", "{text"]

This is easy enough, with this RegEx: input.match(/([#.\[{]|^.*?)[^#.\[{\]}]*/g)

However, this doesn't ignore syntax characters inside quotes, as I would like it to (e.g. in "div[bar='value.baz']" the . should be ignored).

How can I make the second part of my RegEx (the [^#.\[{\]}]* portion) capture not only the negated character set but also any character within quotes? In other words, how can I incorporate the RegEx (\"|').+?\1 into my current one?

Edit: I've figured out a regex that works decently, but it can't handle escaped quotes inside quotes (for example: "stuff here \\" quote "). If someone knows how to do that, it would be extremely helpful:

str.match(/([#.\[{]|^.*?)((['"]).*?\3|[^.#\[\]{\}])*/g);


var str = "div#foo[bar='value.baz'].baz{text}";
str.match(/(^|[\.#[\]{}])(([^'\.#[\]{}]+)('[^']*')?)+/g)
// [ 'div', '#foo', '[bar=\'value.baz\'', '.baz', '{text' ]
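The pattern above only recognizes single-quoted values without escapes. A minimal sketch of how the same idea might be extended to both quote styles and backslash-escaped quotes follows; the function name splitSelector is hypothetical, and the pattern is an illustration rather than a full CSS selector grammar.

```javascript
// Hypothetical helper, not part of the original answer.
function splitSelector(str) {
  // A segment starts at the beginning of the string or at one of # . [ {
  // and then consumes either a complete quoted string (backslash escapes
  // allowed) or any character that is not a delimiter or a quote.
  var re = /(?:^|[#.\[{])(?:"(?:[^"\\]|\\[\s\S])*"|'(?:[^'\\]|\\[\s\S])*'|[^#.\[\]{}'"])+/g;
  return str.match(re) || [];
}

console.log(splitSelector("div#foo[bar='value.baz'].baz{text}"));
// [ 'div', '#foo', "[bar='value.baz'", '.baz', '{text' ]

console.log(splitSelector('div[title="say \\"hi.there\\""].x'));
// [ 'div', '[title="say \\"hi.there\\""', '.x' ]
```

Because quote characters are excluded from the plain-character class, a quote can only be consumed as part of a complete quoted string, so an unterminated quote simply ends the segment early.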


var tokens = myCssString.match(/\/\*[\s\S]*?\*\/|"(?:[^"\\]|\\[\s\S])*"|'(?:[^'\\]|\\[\s\S])*'|[\{\}:;\(\)\[\]./#]|\s+|[^\s\{\}:;\(\)\[\]./'"#]+/g);

Given your string, it produces

div
#
foo
[
bar=
'value.foo'
]
.
baz
{
text
}

The RegExp above is loosely based on the CSS 2.1 lexical grammar.
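The tokenizer yields finer-grained pieces than the output the question asks for. One possible way to glue the tokens back into segments that each begin with #, ., [ or { is sketched below. The names TOKEN_RE and groupTokens are hypothetical, and this copy of the regex applies * to the whole double-quoted group so that it mirrors the single-quoted branch.

```javascript
// Tokenizer regex in the style of the answer above.
var TOKEN_RE = /\/\*[\s\S]*?\*\/|"(?:[^"\\]|\\[\s\S])*"|'(?:[^'\\]|\\[\s\S])*'|[\{\}:;\(\)\[\]./#]|\s+|[^\s\{\}:;\(\)\[\]./'"#]+/g;

// Hypothetical helper: merge tokens into segments starting at # . [ or {.
function groupTokens(css) {
  var tokens = css.match(TOKEN_RE) || [];
  var segments = [];
  tokens.forEach(function (t) {
    if (t === "]" || t === "}") return; // closers are dropped
    if (segments.length === 0 || /^[#.\[{]$/.test(t)) {
      segments.push(t); // a delimiter token opens a new segment
    } else {
      segments[segments.length - 1] += t; // everything else is appended
    }
  });
  return segments;
}

console.log(groupTokens("div#foo[bar='value.baz'].baz{text}"));
// [ 'div', '#foo', "[bar='value.baz'", '.baz', '{text' ]
```

A quoted string is always a single token, so a . or # inside quotes never matches the single-character delimiter test and cannot open a new segment.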


Firstly, and I can't stress this enough: you shouldn't use regexps to parse CSS. Use a real parser, for instance http://glazman.org/JSCSSP/ or similar. Many people have built them, so there is no need for you to reinvent the wheel.

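That said, the following handles your example input. Be aware that it does so by no longer treating . as a terminator, so a . that directly follows plain text (as in "div.baz") does not start a new segment: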

var str = "div#foo[bar='value.foo'].baz{text}";

str.match(/([#.\[{]|^.*?)(?:[^#\[{\]}]*|\.*)/g);

//["div", "#foo", "[bar='value.foo'", ".baz", "{text"]
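For comparison, the same split can be done without a regex by walking the string and tracking whether the scanner is currently inside quotes. The sketch below is a plain character scanner with a hypothetical name (scanSelector); it is not a CSS parser and is not related to the JSCSSP API.

```javascript
// Hypothetical helper: character scanner that ignores delimiters in quotes.
function scanSelector(str) {
  var out = [];
  var cur = "";
  var quote = null; // the open quote character, or null outside quotes
  for (var i = 0; i < str.length; i++) {
    var ch = str.charAt(i);
    if (quote) {
      cur += ch;
      if (ch === "\\" && i + 1 < str.length) {
        cur += str.charAt(++i); // keep the escaped character verbatim
      } else if (ch === quote) {
        quote = null; // closing quote
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      cur += ch;
    } else if (ch === "]" || ch === "}") {
      if (cur) out.push(cur); // closers end the segment and are dropped
      cur = "";
    } else if (ch === "#" || ch === "." || ch === "[" || ch === "{") {
      if (cur) out.push(cur); // a delimiter starts a new segment
      cur = ch;
    } else {
      cur += ch;
    }
  }
  if (cur) out.push(cur);
  return out;
}

console.log(scanSelector("div#foo[bar='value.foo'].baz{text}"));
// [ 'div', '#foo', "[bar='value.foo'", '.baz', '{text' ]
console.log(scanSelector("div.baz"));
// [ 'div', '.baz' ]
```

Keeping the quote state in a variable makes the escaped-quote case explicit: a backslash inside quotes copies the next character unconditionally, so it can never close the string.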