
Using only CR as linebreak inside pre tag doesn't work

At work, we found that Bugzilla was creating HTML output with lines that were far too long, because the browser didn't break them. This happened in Chrome but not in Firefox 3.5, so we didn't really care. But Firefox 4 behaves just like Chrome, so we had to find a workaround.

An example is:

<html>
  <body>
    <pre>
      Lorem ipsum dolor sit amet, consetetur sadipscing elitr,&#013;sed diam nonumy eirmod tempor invidunt ut labore et&#013;dolore magna aliquyam erat, sed diam voluptua. At vero eos&#013;et accusam et justo duo dolores et ea rebum. Stet clita kasd&#013;gubergren, no sea takimata sanctus est Lorem ipsum dolor sit&#013;amet.&#013;
    </pre>
  </body>
</html>

The server uses only CR as a line break, which is very uncommon. The usual alternatives (CR+LF, or LF alone) work correctly, so the right fix is to configure the Bugzilla server to use one of those. Still, I'm curious why this doesn't work, and whether ignoring these line breaks is the "correct" behavior for browsers.

Also, I found a strange local workaround for Chrome and FF 4 using a Greasemonkey script (modified version of this one):

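// Collect every element on the page.
var els = document.getElementsByTagName("*");
for(var i = 0, l = els.length; i < l; i++) {
  var el = els[i];
  // Re-assign the element's own markup, forcing the browser to re-parse it.
  el.innerHTML = el.innerHTML;
}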

It looks as if this should have no effect on the page, but with this script the line breaks suddenly display correctly.

So my questions are:

  1. Is the Chrome/FF 4 way the "correct" way to handle these kinds of linebreaks inside <pre>?
  2. Why is this Greasemonkey script working?


The GM script works because, apparently, CRs (\r) are converted to LFs (\n) dynamically when JS writes to the DOM.

See this test at jsFiddle. Notice how the CR (decimal 13), at the end of the 2nd line, gets converted to LF (decimal 10).
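
To make the conversion concrete, here is a minimal sketch in plain JavaScript (no DOM needed). It only imitates the CR-to-LF normalization described above; it is not the browser's own code, and the helper names normalizeLineBreaks and charCodes are made up for this illustration.

```javascript
// Hypothetical helper: imitates the CR -> LF normalization described above.
// This is an illustration only, not what the browser actually runs.
function normalizeLineBreaks(text) {
  // Match a CR with an optional following LF, so CR+LF pairs and bare CRs
  // each become a single LF.
  return text.replace(/\r\n?/g, "\n");
}

// Return the decimal character codes so invisible characters can be inspected.
function charCodes(text) {
  return Array.from(text, (ch) => ch.charCodeAt(0));
}

const original = "a\rb\r\nc";
const normalized = normalizeLineBreaks(original);

console.log(charCodes(original));   // [97, 13, 98, 13, 10, 99]
console.log(charCodes(normalized)); // [97, 10, 98, 10, 99]
```

Printing the character codes is the same trick the jsFiddle test relies on: a 13 in the input showing up as a 10 in the output.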


Yes, the HTML 4.01 specification defines a line break as follows (http://www.w3.org/TR/html401/struct/text.html#line-breaks):

A line break is defined to be a carriage return (&#x000D;), a line feed (&#x000A;), or a carriage return/line feed pair. All line breaks constitute white space.

However, a bare carriage return is extremely rare. I'm not surprised it doesn't work. But technically, I'd say that FF4 and Chrome are in the wrong.

I'm not sure why your Greasemonkey script works. My guess is that reading el.innerHTML converts CR to CR-LF or LF.
