Help understanding and recoding a JavaScript function that deals with special characters
I am trying to rewrite a JavaScript function, since a very kind user here told me it is a rather nasty piece of code that could be written much more nicely.
I have been trying to understand what the function does so that I can rewrite it properly, but since I don't fully understand how it works, this is a very difficult task.
Therefore I am looking for help and direction (NOT THE SOLUTION, AS I WANT TO LEARN MYSELF) to understand this function and rewrite it in a nicer way.
The function was written to deal with special characters. I know that it loops through the string passed to it, searches for special characters, and adds whatever is needed to make it a valid string.
I have been trying to use value.replace(/"/gi,"/""), but surely I am doing it wrong, as it crashes.
Could anybody tell me where to start recoding the function?
Any help would be appreciated.
My comments on the function are in capital letters.
<script type="text/javascript">
function convertString(value){
for(var z=0; z <= value.length -1; z++)
{
//if current character is a backslash||WHY IS IT CHECKING FOR \\,\\r\\n,and \\n?
if(value.substring(z, z + 1)=="\\" && (value.substring(z, z + 4)!="\\r\\n" && value.substring(z, z + 2)!="\\n"))
{//WHY IS IT ADDING \\\\ TO THE STRING?
value = value.substring(0, z) + "\\\\" + value.substring(z + 1, value.length);
z++;
}
if(value.substring(z, z + 1)=="\\" && value.substring(z, z + 4)=="\\r\\n")
{//WHY IS IT ADDING 4 TO Z IN THIS CASE?
z = z+4;
}
if(value.substring(z, z + 1)=="\\" && value.substring(z, z + 2)=="\\n")
{//WHY IS IT ADDING 2 TO Z IN THIS CASE?
z = z+2;
}
}
//replace " with \"
//loop through each character
for(var x = 0; x <= value.length -1; x++){
//if current character is a quote
if(value.substring(x, x + 1)=="\""){//THIS IS TO FIND \, BUT HAVENT THIS BEEN DONE BEFFORE?
//concatenate: value up to the quote + \" + value AFTER the quote||WHY IS IT ADDING \\ BEFORE \"?
value = value.substring(0, x) + "\\\"" + value.substring(x + 1, value.length);
//account for extra character
x++;
}
}
//return the modified string
return(value);
}
</script>
The comments in capital letters within the code are my questions about the function, as mentioned above.
I would appreciate any help, orientation, or advice, BUT NOT THE SOLUTION PLEASE, AS I DO WANT TO LEARN.
Ok, let's step through this.
//if current character is a backslash||WHY IS IT CHECKING FOR \\,\\r\\n,and \\n?
\ is a special character, known as the escape character. \\, \r and \n are all escape sequences. There are several other JavaScript escape sequences, but these are the ones you are dealing with. \\ is the escape sequence for a backslash: since \ is itself the escape character, putting a literal backslash into a string requires typing two. The function doubles every backslash it finds, as long as that backslash is not the start of a newline sequence (\n) or of \r\n, a Windows newline. When your string is later used, each \\ will end up as a single \ in the output.
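As a quick way to see this for yourself (a small sketch; the variable names are made up for illustration), compare how each literal is typed in the source with how many characters the resulting string actually holds:

```javascript
// Left: how the text is typed in source code.
// The length shows how many characters the string really contains.
var oneBackslash = "\\";     // a single backslash
var oneQuote = "\"";         // a single double quote
var realNewline = "\n";      // one line feed character
var backslashThenN = "\\n";  // two characters: a backslash, then the letter n

console.log(oneBackslash.length);   // 1
console.log(oneQuote.length);       // 1
console.log(realNewline.length);    // 1
console.log(backslashThenN.length); // 2
```

The last case is the one the function compares against: "\\n" in the source is a backslash followed by n, not a real line break.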
//WHY IS IT ADDING 4 TO Z IN THIS CASE?
The reason the script is adding 4 and 2 to z in the other two ifs is that it has recognised an escape sequence of that length, and therefore doesn't need to check the other characters in the sequence. As an example, consider the string 'AAABAAACAAA'.
If I wanted to use the same method, looping through character by character, and change all instances of A to D, then I might do this:
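for (var i = 0; i < myString.length; i++) {
  // compare exactly one character; substring(i) alone would return the rest of the string
  if (myString.substring(i, i + 1) == 'A') {
    myString = myString.substring(0, i) + 'D' + myString.substring(i + 1, myString.length);
  }
}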
Instead, if I knew all of my A's were in groups of 3, like they are in my case, I could do this:
for (var i = 0; i < myString.length; i++) {
  if (myString.substring(i, i + 3) == 'AAA') {
    myString = myString.substring(0, i) + 'DDD' + myString.substring(i + 3, myString.length);
    // advance by 2 here; the loop's own i++ supplies the third step
    i += 2;
  }
}
Here, I am finding an occurrence of AAA. The first time I find AAA, i is 0. Since I found AAA at i == 0 and I am replacing it with DDD, I know i + 1 and i + 2 are not going to contain a letter A (because I just replaced them), so I can skip ahead and start processing three characters further along on the next loop.
//THIS IS TO FIND \, BUT HAVENT THIS BEEN DONE BEFFORE?
No, here you are looking for \", the escape sequence for a double quote.
Try this to see the difference in output.
var testString = "This is a \"string\" with \"escape sequences\".\nIt \"escapes\" backslashes like this \\ and double quotes like this \" but leaves new lines alone";
alert(testString);
alert(convertString(testString));
The code seems to be doing some kind of escaping on a string. In the first loop it's replacing all instances of \ with \\ unless they precede a \r\n or \n sequence, in which case it skips past them. The second loop is replacing " with \", as the comment says. I'm not entirely sure why it's escaping lone backslashes but leaving new lines alone, though.
I think the thing that's got you confused is that the backslash character is an escape character: it changes how the following character is interpreted. For example, if I want to use the string this is a "string" in my code, I'd write it as follows:
var foo = "this is a \"string\"";
The escape characters in this case prevent the " characters from terminating the string (since they're part of it). Of course, if you want an actual backslash in your string, you need to escape it with another backslash, e.g.:
var foo = "this is a file path: C:\\bar\\some-file.txt";
Similarly, \r and \n denote the carriage return and line feed characters respectively. On a Windows platform, \r\n is used for new lines, while on Linux platforms, \n is used.
See Wikipedia for more on escape characters and newlines.
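The same escaping rule is probably why the attempt in the question, value.replace(/"/gi,"/""), fails to parse: inside a double-quoted literal an unescaped " ends the string, and / is not the escape character. As a small sketch of just that one call (not a rewrite of the whole function; sample is a made-up input):

```javascript
// A made-up input containing two double quotes.
var sample = 'say "hi"';

// "\\\"" is a two-character string: a backslash followed by a double quote.
// The g flag replaces every match; i is not needed because " has no case.
var escaped = sample.replace(/"/g, "\\\"");

console.log(escaped); // say \"hi\"
```

Note that replace returns a new string rather than changing sample, so the result has to be assigned somewhere.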
To address your questions about why 4 and 2 are being added to z: in each case the increment corresponds to the number of characters that are being skipped. A "\\n" string literal has the value \n, which is two characters, and hence 2 is being added to z. Similarly, "\\r\\n" is four characters, and so 4 is being added to z.
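To make those two lengths concrete, here is a minimal sketch. sequenceLengthAt is a hypothetical helper of mine, not part of the original function; it only reports how long the recognised sequence at a given index is:

```javascript
// Hypothetical helper: length of the backslash sequence starting at index z,
// using the same comparisons as the function in the question.
function sequenceLengthAt(value, z) {
  if (value.substring(z, z + 4) == "\\r\\n") return 4; // backslash, r, backslash, n
  if (value.substring(z, z + 2) == "\\n") return 2;    // backslash, n
  return 0; // neither sequence starts here
}

// Characters: a \ r \ n b \ n c \ d  (11 characters in total)
var sample = "a\\r\\nb\\nc\\d";

console.log(sample.length);               // 11
console.log(sequenceLengthAt(sample, 1)); // 4
console.log(sequenceLengthAt(sample, 6)); // 2
console.log(sequenceLengthAt(sample, 9)); // 0
```

As far as I can tell, the for loop's own z++ still runs after z = z+4 or z = z+2 in the original function, so the loop resumes one character further along than these lengths alone would suggest.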
Sorry if I've got the wrong end of the stick!