
How to remove duplicate character at the end of a string in Regex

Can anyone help me with the following regex?

<script type="text/javascript">
        function quoteWords() {
            var search = document.getElementById("search_box");
            search.value = search.value.replace(/^\s*|\s*$/g, ""); //trim string of ending and beginning whitespace
                if(search.value.indexOf(" ") != -1){ //if more than one word
                search.value = search.value.replace(/^"*|"*$/g, "\"");
            }
        }
  </script>

<input type="text" name="keywords" value="" id="search_box" size="17">
<input onClick="quoteWords()" type="submit" value="Go">

Issue: it breaks when I manually add double quotes and press submit; one extra double quote is added at the end. If the double quotes already exist, the regex should not add anything.

So it turns "long enough" into "long enough"" (note the extra double quote at the end).

Can anyone check the regex and show me how to solve this issue?

I only want the double quotes to be inserted once.


The error is definitely happening in this line:

search.value = search.value.replace(/^"*|"*$/g, "\"");

And it is due to the fact that "* matches 0 or more quotes. However, you presumably wouldn't want to just replace it with "+ since that wouldn't do the job you wanted of double-quoting strings with spaces in them.

You probably just want to do something like this, in two statements:

search.value = search.value.replace(/^"*|"*$/g, '')
search.value = '"' + search.value + '"'
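For illustration, here is the same two-step idea wrapped in a standalone function so it can be tried outside the page. This is only a sketch: the name quotePhrase is made up for this example, and it assumes you want to keep the original behaviour of leaving single words alone.

```javascript
// Hypothetical helper; the name quotePhrase is not from the original code.
function quotePhrase(input) {
  // Trim leading and trailing whitespace, as the original quoteWords() does.
  var text = input.replace(/^\s*|\s*$/g, "");
  // A single word (no space) is returned unchanged.
  if (text.indexOf(" ") === -1) {
    return text;
  }
  // Step 1: strip any quotes at either end (replacing with '' makes the
  // zero-length matches harmless).
  var bare = text.replace(/^"*|"*$/g, "");
  // Step 2: add exactly one quote on each side.
  return '"' + bare + '"';
}

console.log(quotePhrase('long enough'));       // "long enough"
console.log(quotePhrase('"long enough"'));     // "long enough"
console.log(quotePhrase('""long enough""'));   // "long enough"
console.log(quotePhrase('  single '));         // single
```

In the page itself you would assign the result back, e.g. search.value = quotePhrase(search.value).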

Part of the key is that there is no 'end of string' character to consume - the regex engine 'just knows' when it is at the end of the string. So after matching a quote at the end of the string, the cursor just moves to the end of the string, and it finds the empty string one more time before falling off the string. Thus, the quote at the end of the string is replaced by a quote, and the 'nothing' at the end of the string is also replaced by a quote.
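If you want to see those matches for yourself, one way (a small sketch, with variable names picked just for this example) is to pass a function as the replacement and record what the regex hands it:

```javascript
var input = '"long enough"';
var matches = [];

// With no capture groups, the callback receives (match, offset, wholeString).
var result = input.replace(/^"*|"*$/g, function (match, offset) {
  matches.push({ text: match, index: offset });
  return '"';
});

console.log(matches);
// [ { text: '"', index: 0 }, { text: '"', index: 12 }, { text: '', index: 13 } ]
console.log(result); // "long enough""
```

The third entry is the zero-length match at the very end of the string, and that is where the extra quote comes from.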

I recommend taking a look at the ECMAScript spec at http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf sections 15.5.4.10 and 15.5.4.11 yourself. However, I've also provided an intuitive illustration of how this works at this gist.

EDIT:

Since people seem confused as to why this would happen, here's something that might help:

http://www.grymoire.com/Unix/Sed.html#uh-6

That's from the documentation for sed, but it explains why combining * and /g is a bad idea. The fact that JS doesn't just explode when you do that is a mark in its favor. Note that there are an infinite number of '0 characters' at every position in the string.


I'm guessing that your problem is that you're getting three matches on a string like

"long enough"

The first match is the start plus the first quote (since regexes are greedy by default). The second match is the end quote and end of string ($). However, since the end of string is not an actual character, the engine can then match a third time at the same spot: zero quotes followed by the end of the string.

One possible solution would be to add quotes to the string and then replace one or more quote instead of zero or more quotes:

search.value = (search.value + '"').replace(/^"*|"+$/g, "\"");
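As a quick sanity check of that one-liner, here it is as a function run against a few inputs. The function name is made up for this sketch; the regex is the one above.

```javascript
// Hypothetical wrapper around the one-liner above.
function quoteByAppending(text) {
  // Appending a quote guarantees "+$ has something to match, so the
  // zero-length match at the end of the string never comes into play.
  return (text + '"').replace(/^"*|"+$/g, '"');
}

console.log(quoteByAppending('long enough'));     // "long enough"
console.log(quoteByAppending('"long enough"'));   // "long enough"
console.log(quoteByAppending('long enough"'));    // "long enough"
```

The leading side still uses "*, which is fine there: ^ can only match once, so it either swallows existing quotes or inserts one at position 0.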


In a regex, * matches 0 or more instances of the preceding item, and + matches 1 or more instances. Since you're using *, the regex matches when there are 0 or more characters that match \s in your first regex, and 0 or more "s in your second. Changing your *s to +s should give you the behavior you expect.

Edit: If you want to make it so that the result is surrounded by double quotes if they don't exist at the beginning or ending of the line, use something like /^[^"]|[^"]$/, which reads as "the start of a line followed by any character other than a double quote, or any character other than a double quote followed by the end of a line".

Double Edit: That should probably be /^[^"\w]|[^"\w]$/ to make sure you don't replace the first and last characters of your match :/


You can use + instead of *:

search.value = search.value.replace(/^"+|"+$/g, '"');
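To see what the + version does and doesn't do, here is a small check (the function name is just for this sketch). It collapses repeated quotes down to one, but since "+ needs at least one quote to match, input that has no quotes comes back unchanged:

```javascript
// Hypothetical wrapper around the replace call above.
function collapseQuotes(text) {
  return text.replace(/^"+|"+$/g, '"');
}

console.log(collapseQuotes('""long enough""')); // "long enough"
console.log(collapseQuotes('"long enough"'));   // "long enough"
console.log(collapseQuotes('long enough'));     // long enough
```

So this fits if the quotes are always typed in by hand; if they also need to be added when missing, it has to be combined with an explicit step that adds them.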