
innerHTML alternative for retrieving contents of page?

I'm currently using innerHTML to retrieve the contents of an HTML element and I've discovered that in some browsers it doesn't return exactly what is in the source.

For example, using innerHTML in Firefox on the following line:

<div id="test"><strong>Bold text</strong></strong></div>

Will return:

<strong>Bold text</strong>

In IE, it returns the original string, with two closing strong tags. I'm assuming in most cases it's not a problem (and may be a benefit) that Firefox cleans up the incorrect code. However, for what I'm trying to accomplish, I need the exact code as it appears in the original HTML source.

Is this at all possible? Is there another JavaScript function I can use?


I don't think you can get the incorrect HTML back in modern browsers. And that is the right behaviour, because for dynamically generated HTML there is no source to return. Firefox's innerHTML, for example, returns part of the DOM tree serialised as a string, not the HTML source. This is not a problem here, because the second </strong> tag is ignored by the browser anyway.


innerHTML is not generated from the actual source of the document (i.e. the HTML file); it is derived from the DOM that the browser built from that source. So if IE somehow shows you the incorrect HTML code, it's probably some kind of bug. There is no method that retrieves the invalid HTML code in every browser.


You can't in general get the original invalid HTML for the reasons Ivan and Andris said.

IE is also “fixing” your code just like Firefox does, albeit in a way you don't notice on serialisation, by creating an Element node with the tagName /strong to correspond to the bogus end-tag. There is no guarantee at all that IE will happen to preserve other invalid markup structures through a parse/serialise cycle.

In fact, even for valid code the output of innerHTML won't be exactly the same as the input. Attribute order isn't maintained, tagName case isn't maintained (IE gives you <STRONG>), whitespace in various places is lost, entity references aren't maintained, and so on. If you “need the exact code”, you will have to keep a copy of the exact code, for example in a JavaScript variable in a <script> block written after the content in question.
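A minimal sketch of the "keep your own copy" idea, assuming the markup is available as a string before the browser parses it. The names originalSource, remember and getOriginal are made up for illustration and are not part of any browser API.

```javascript
// Hypothetical registry: element id -> exact markup as authored.
const originalSource = new Map();

// Store the markup verbatim before the browser gets to parse it.
function remember(id, markup) {
  originalSource.set(id, markup);
  return markup;
}

// Return the stored markup, or null if nothing was stored for this id.
function getOriginal(id) {
  return originalSource.has(id) ? originalSource.get(id) : null;
}

const markup = remember('test', '<strong>Bold text</strong></strong>');

// In a browser you would now hand the string to the parser, e.g.
//   document.getElementById('test').innerHTML = markup;
// and later read getOriginal('test') instead of element.innerHTML.
```

The stored string is never passed through the parser, so the stray </strong> survives regardless of which browser is running the page.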


If you don't need the HTML to render (e.g., you're going to use it as a JS template or something), you can put it in a textarea and retrieve the contents from its value. (Reading innerHTML on a textarea gives you the text with < and > escaped in current browsers.)

<textarea id="myTemplate"><div id="test"><strong>Bold text</strong></strong></div></textarea>

And then:

$('#myTemplate').val() === '<div id="test"><strong>Bold text</strong></strong></div>'

Other than that, the browser gets to decide how to interpret the HTML and it will only return you its interpretation, not the original.
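The textarea approach can also be sketched without jQuery. readTemplate is a hypothetical helper; it takes the document as a parameter only so that it can be exercised outside a browser. It reads the value property rather than innerHTML, since serialising a textarea's contents escapes < and >.

```javascript
// Hypothetical helper: return the text held in a <textarea> template.
// `doc` is anything with getElementById (normally the global `document`).
function readTemplate(doc, id) {
  const el = doc.getElementById(id);
  if (!el) {
    throw new Error('No element with id "' + id + '"');
  }
  // .value is the textarea's text content; markup inside it was never
  // turned into elements, so the stray </strong> is still there.
  return el.value;
}

// In a browser: const source = readTemplate(document, 'myTemplate');
```

Note that character references inside a textarea (such as &amp;amp;) are still decoded by the parser, so this preserves tags as written but not necessarily every byte of the source.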


innerText? Or does that have the same effect?


You must use innerXML property. It does exactly what you want to achieve.
