Extremely strange glitch in Chrome - parses contents of string!
Okay - this is the dumbest glitch I have seen in a while:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<script type='text/javascript'>
var data = "</script>";
</script>
</head>
<body>
This should break!
</body>
</html>
This causes syntax errors because the JavaScript parser is actually reading the contents of the string. How stupid!
How can I put </script> in my code? Is there any way?
Is there a valid reason for this behavior?
Within X(HT)ML (when actually treated as such), scripts are required to be escaped as CDATA for precisely this reason. http://www.w3.org/TR/xhtml1/diffs.html#h-4.8
In XHTML, the script and style elements are declared as having #PCDATA content. As a result, < and & will be treated as the start of markup, and entities such as &lt; and &amp; will be recognized as entity references by the XML processor to < and & respectively. Wrapping the content of the script or style element within a CDATA marked section avoids the expansion of these entities.
<script type="text/javascript">
<![CDATA[
... unescaped script content ...
]]>
</script>
If your XHTML document is just served as text/html and treated as tag soup, that doesn't apply and you'll just have to "escape" the string like '</scr' + 'ipt>'.
It's not a glitch; this is normal, expected behaviour, and quite rightly so if you think about it. The HTML specs do not define scripting languages, so all the engine should see is plain text up until </script>, which closes the tag. There are a couple of options, other than the ones already outlined:
// escape the / character, changing the format of the "closing" tag
var data = "<\/script>";
// break up the string
var data = "</"+"script>";
The first method works because HTML doesn't use \ for escaping; it's treated as a literal character, and of course <\/script> isn't a valid closing tag. The second one works for more obvious reasons, but I've been told by someone else here that it shouldn't be used (and I never quite understood why).
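As a quick sanity check, runnable in Node or a browser console, the following sketch suggests that both spellings evaluate to the same string at runtime. The variable names `escaped` and `split` are only illustrative and do not come from the answer above.

```javascript
// Neither literal contains the closing-tag character sequence in its
// source text, yet both evaluate to the same string at runtime.
var escaped = "<\/script>";      // \/ is simply / inside a JS string literal
var split = "</" + "script>";    // the concatenation happens at runtime

console.log(escaped === split);  // true
console.log(escaped.length);     // 9
```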
Write it this way:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<script type='text/javascript'>
<!--
var data = "</script>";
-->
</script>
</head>
<body>
This should break!
</body>
</html>
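The reasoning is that HTML is parsed before the JavaScript is executed, and the <!-- and --> are meant to make the parser ignore all tags that appear in this section. Be aware, though, that this is not reliable: an HTML5-conforming parser still ends the script at a </script> inside such a comment, and under the HTML4 rules comments are not recognized inside script content at all.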
If you can believe the HTML4 standard, the script content ends at the first ETAGO ("</") delimiter followed by a name start character ([a-zA-Z]).
So, the JavaScript parser is not reading the contents of the string as you describe; the JavaScript parser never gets anything after var data = ", which obviously isn't a valid script.
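The HTML4 rule quoted above can be mimicked in a few lines. This is only a rough sketch of that one rule, not of how any real browser tokenizes a page, and `scriptContentSeenByEngine` is a hypothetical helper name introduced here for illustration.

```javascript
// Hypothetical helper: returns the part of `source` that an HTML4-style
// parser would hand to the script engine, i.e. everything before the
// first "</" that is followed by a letter.
function scriptContentSeenByEngine(source) {
  var end = source.search(/<\/[a-zA-Z]/);
  return end === -1 ? source : source.slice(0, end);
}

// The script body from the question, built by concatenation so that this
// snippet could itself be pasted into an inline script block.
var broken = 'var data = "' + '</' + 'script>";';
// The same body using the backslash form.
var fixed = 'var data = "<\\/script>";';

console.log(scriptContentSeenByEngine(broken)); // var data = "
console.log(scriptContentSeenByEngine(fixed));  // var data = "<\/script>";
```

In the first case the engine receives an unterminated string literal, which is where the syntax error comes from.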
The simplest way to avoid accidentally ending your JavaScript early is to use Andy E's first suggestion:
var data = "<\/script>";
This way the HTML parser doesn't see </ so the script content doesn't end, and \/ is equivalent to / in a JavaScript string literal, so the results are correct. This is also the method shown for JavaScript in the standard.
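When the string is generated rather than typed by hand, the same backslash trick can be applied mechanically. The sketch below assumes the data is serialized as JSON before being written into an inline script; `toInlineScriptLiteral` is a hypothetical helper, and it only handles the "</" case discussed in this thread.

```javascript
// Hypothetical helper: serialize a value as JSON, then rewrite every "</"
// as "<\/" so the text cannot end an inline script block early.
// "\/" is a valid escape for "/" in both JSON and JavaScript strings.
function toInlineScriptLiteral(value) {
  return JSON.stringify(value).replace(/<\//g, "<\\/");
}

var original = "<b>hi</b>" + "<" + "/script>";
var literal = toInlineScriptLiteral({ html: original });

console.log(literal);                               // {"html":"<b>hi<\/b><\/script>"}
console.log(JSON.parse(literal).html === original); // true
```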