Javascript RegEx non-capturing prefix
I am trying to do some string replacement with RegEx in Javascript. The scenario is a single line string containing long comma-delimited list of numbers, in which duplicates are possible.
An example string is: 272,2725,2726,272,2727,297,272
(The end may or may not end in a comma)
In this example, I am trying to match each occurrence of the whole number 272. (3 matches expected)
The example regex I'm trying to use is: (?:^|,)272(?=$|,)
The problem I am having is that the second and third matches are including the leading comma, which I do not want. I am confused because I thought (?:^|,)
would match, but not capture. Can someone shed light on this for me? An interesting bit is that the trailing comma is excluded from the re开发者_开发技巧sult, which is what I want.
For what it is worth, if I were using C# there is syntax for prefix matching that does what I want: (?<=^|,)
However, it appears to be unsupported in JavaScript.
Lastly, I know I could workaround it using string splitting, array manipulation and rejoining, but I want to learn.
Use word boundaries instead:
\b272\b
ensures that only 272
matches, but not 2725
.
(?:...)
matches and doesn't capture - but whatever it matches will be part of the overall match.
A lookaround assertion like (?=...)
is different: It only checks if it is possible (or impossible) to match the enclosed regex at the current point, but it doesn't add to the overall match.
Here is a way to create a JavaScript look behind that has worked in all cases I needed.
This is an example. One can do many more complex and flexible things.
The main point here is that in some cases, it is possible to create a RegExp non-capturing prefix (look behind) construct in JavaScript .
This example is designed to extract all fields that are surrounded by braces '{...}'. The braces are not returned with the field.
This is just an example to show the idea at work not necessarily a prelude to an application.
function testGetSingleRepeatedCharacterInBraces()
{
var leadingHtmlSpaces = ' ' ;
// The '(?:\b|\B(?={))' acts as a prefix non-capturing group.
// That is, this works (?:\b|\B(?=WhateverYouLike))
var regex = /(?:\b|\B(?={))(([0-9a-zA-Z_])\2{4})(?=})/g ;
var string = '' ;
string = 'Message has no fields' ;
document.write( 'String => "' + string
+ '"<br>' + leadingHtmlSpaces + 'fields => '
+ getMatchingFields( string, regex )
+ '<br>' ) ;
string = '{LLLLL}Message {11111}{22222} {ffffff}abc def{EEEEE} {_____} {4444} {666666} {55555}' ;
document.write( 'String => "' + string
+ '"<br>' + leadingHtmlSpaces + 'fields => '
+ getMatchingFields( string, regex )
+ '<br>' ) ;
} ;
function getMatchingFields( stringToSearch, regex )
{
var matches = stringToSearch.match( regex ) ;
return matches ? matches : [] ;
} ;
Output:
String => "Message has no fields"
fields =>
String => "{LLLLL}Message {11111}{22222} {ffffff}abc def{EEEEE} {_____} {4444} {666666} {55555}"
fields => LLLLL,11111,22222,EEEEE,_____,55555
精彩评论