
Regex exec only returning first match [duplicate]

This question already has answers here: RegEx to extract all matches from string using RegExp.exec (19 answers) Closed 3 years ago.

I am trying to implement the following regex search found on the GolfScript syntax page.

var ptrn = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;
input = ptrn.exec(input);

After this, input only ever contains the first match of the regexp. For example, "hello" "world" should return ["hello", "world"], but it only returns ["hello"].


RegExp.exec can only return a single match result per call.

In order to retrieve multiple matches you need to run exec on the expression object multiple times. For example, using a simple while loop:

var ptrn = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;

var match;
while ((match = ptrn.exec(input)) != null) {
    console.log(match);
}

This will log all matches to the console.

Note that for this to work, the regular expression must have the g (global) flag. With this flag set, exec updates the expression's lastIndex property after each successful match, so the next call starts searching after the previous result.

The regular expression also needs to be declared outside of the loop (as shown in the example above). Otherwise, the expression object would be recreated on every iteration, lastIndex would reset to 0 each time, and the loop would never terminate.
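As a minimal sketch of the loop described above, the helper below wraps it in a function. The name collectMatches is hypothetical and not part of the original answer. It also guards against one case the answer does not mention: a zero-length match leaves lastIndex unchanged, so the loop has to step forward by hand.

```javascript
// Hypothetical helper: collect every match of a global regex using exec.
function collectMatches(pattern, input) {
  if (!pattern.global) {
    // Without the g flag, lastIndex never advances and the loop would not end.
    throw new TypeError("collectMatches expects a regex with the g flag");
  }
  pattern.lastIndex = 0; // start from the beginning, regardless of earlier use
  const matches = [];
  let match;
  while ((match = pattern.exec(input)) !== null) {
    matches.push(match[0]);
    // A zero-length match leaves lastIndex where it was; advance manually.
    if (match[0].length === 0) {
      pattern.lastIndex++;
    }
  }
  return matches;
}

const tokenPattern = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;
const tokens = collectMatches(tokenPattern, '"hello" "world"');
console.log(tokens); // → [ '"hello"', ' ', '"world"' ]
```

The space between the two strings shows up as its own token because the final alternative in the pattern (the lone dot) matches any single character.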


It is also possible to call the match method on the string to retrieve the whole collection of matches at once:

var ptrn = /[a-zA-Z_][a-zA-Z0-9_]*|'(?:\\.|[^'])*'?|"(?:\\.|[^"])*"?|-?[0-9]+|#[^\n\r]*|./mg;
var results = "hello world".match(ptrn);

The results are (according to the regular expression):

["hello", " ", "world"]

match spec is here
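One limitation worth illustrating: with the g flag, match returns only the matched strings, without capture groups or positions. A possible alternative, not mentioned in the answer above, is String.prototype.matchAll (ES2020), which yields full match objects. The pattern and variable names below are made up for the illustration.

```javascript
// Hypothetical pattern: group 1 is the first letter, group 2 the rest of the word.
const wordPattern = /(\w)(\w*)/g;
const text = "hello world";

// match with the g flag: full matches only, groups and indices are dropped.
const plain = text.match(wordPattern);

// matchAll (requires the g flag): an iterator of match objects, like repeated exec calls.
const detailed = [...text.matchAll(wordPattern)].map(m => ({
  whole: m[0],
  first: m[1],
  index: m.index,
}));

console.log(plain);    // → [ 'hello', 'world' ]
console.log(detailed); // → whole/first/index for "hello" (index 0) and "world" (index 6)
```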


I am not sure what "hello" "world" means in your question (is it user input or a regex?), but a RegExp object has state: its lastIndex property, the position from which the next search starts. exec does not return all the results at once. It returns only the next match, and you need to call .exec again to get the remaining results, each search starting from the lastIndex position:

const re1 = /^\s*(\w+)/mg; // find all first words in every line
const text1 = "capture discard\n me but_not_me" // two lines of text
for (let match; (match = re1.exec(text1)) !== null;) 
      console.log(match, "next search at", re1.lastIndex);

prints

["capture", "capture"] "next search at" 7
[" me", "me"] "next search at" 19
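Because that state lives on the RegExp object and not on the string being searched, reusing one global expression on a second string can give a surprising result. The sketch below illustrates this with a made-up pattern and inputs; it is an illustration of the lastIndex behaviour, not code from the answer.

```javascript
const wordRe = /\w+/g; // hypothetical pattern for the demonstration

const first = wordRe.exec("alpha beta"); // matches "alpha"
const indexAfterFirst = wordRe.lastIndex; // 5: the next search starts here

// Same regex, different string: the search still starts at index 5,
// which is the end of "gamma", so nothing is found.
const stale = wordRe.exec("gamma");
const indexAfterStale = wordRe.lastIndex; // a failed exec resets lastIndex to 0

// Now the search starts from the beginning again and succeeds.
const fresh = wordRe.exec("gamma");

console.log(first[0], indexAfterFirst); // → alpha 5
console.log(stale, indexAfterStale);    // → null 0
console.log(fresh[0]);                  // → gamma
```

Setting lastIndex to 0 by hand before searching a new string avoids the missed match.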

The functional ES6 way to build an iterator for your results is this:

RegExp.prototype.execAllGen = function* (input) {
    for (let match; (match = this.exec(input)) !== null;)
        yield match;
};

RegExp.prototype.execAll = function (input) {
    return [...this.execAllGen(input)];
};

Note also that, unlike in poke's answer, the match variable here is scoped to the for loop, which is tidier.

Now you can capture your matches easily, in one line:

const log = console.log; // shorthand used in the lines below
const matches = re1.execAll(text1)

log("captured strings:", matches.map(m=>m[1]))
log(matches.map(m=> [m[1],m.index]))
for (const match of matches) log(match[1], "found at",match.index)

which prints

"captured strings:" ["capture", "me"]

[["capture", 0], ["me", 16]]
"capture" "found at" 0
"me" "found at" 16