Get the raw HTML of selected content using JavaScript

How would I get the raw HTML of the selected content on a page using Javascript? For the sake of simplicity, I'm sticking with browsers supporting window.getSelection.

Here is an example; the text between the two | characters is my selection.

<p>
    The <em>quick brown f|ox</em> jumps over the lazy <strong>d|og</strong>.
</p>

I can capture and alert the normalized HTML with the following Javascript.

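// One var statement, so no name leaks into the global scope.
var selectionRange = window.getSelection().getRangeAt(0),
    selectionContents = selectionRange.cloneContents(),
    fragmentContainer = document.createElement('div');

// Append the cloned fragment so that innerHTML can serialize it.
fragmentContainer.appendChild(selectionContents);
alert(fragmentContainer.innerHTML);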

In the above example, the alerted content has the partially selected elements closed off at both ends, so it returns the string <em>ox</em> jumps over the lazy <strong>d</strong>.

How might I return the string ox</em> jumps over the lazy <strong>d?


You would have to effectively write your own HTML serialiser.

Start at the selectionRange.startContainer/startOffset and walk the tree forwards from there until you get to endContainer/endOffset, outputting HTML markup from the nodes as you go, including open tags and attributes when you walk into an Element and close tags when you go up a parentNode.
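
A minimal sketch of that walk is below, written only to illustrate the idea. The names serializeRange, positionAfter, openTag, escapeText and escapeAttr are made up for this example, not part of any API. It assumes a valid range whose start comes before its end, reads only standard node properties (nodeType, nodeName, attributes, childNodes, parentNode, data), skips comment nodes, and emits no close tag for void elements such as br.

```javascript
// Node type values, equal to Node.ELEMENT_NODE etc. in the DOM.
const ELEMENT_NODE = 1, TEXT_NODE = 3, COMMENT_NODE = 8;
const VOID_ELEMENTS = new Set(['area', 'base', 'br', 'col', 'embed', 'hr',
  'img', 'input', 'link', 'meta', 'source', 'track', 'wbr']);

function escapeText(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function escapeAttr(s) {
  return s.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

// Build the open tag of an element, including its attributes.
function openTag(el) {
  let tag = '<' + el.nodeName.toLowerCase();
  for (const attr of Array.from(el.attributes || [])) {
    tag += ' ' + attr.name + '="' + escapeAttr(attr.value) + '"';
  }
  return tag + '>';
}

// Position in the parent just after `node`, or null at the tree root.
function positionAfter(node) {
  const parent = node.parentNode;
  if (!parent) return null;
  const index = Array.prototype.indexOf.call(parent.childNodes, node);
  return { node: parent, offset: index + 1 };
}

// Hypothetical helper: walk forwards from the range start to the range end,
// emitting markup for exactly what is passed on the way.
function serializeRange(range) {
  let node = range.startContainer;
  let offset = range.startOffset;
  let out = '';
  while (node) {
    const isEnd = node === range.endContainer;
    let next;
    if (node.nodeType === TEXT_NODE || node.nodeType === COMMENT_NODE) {
      // Leaf node: emit the selected part of a text node, skip comments.
      if (node.nodeType === TEXT_NODE) {
        const stop = isEnd ? range.endOffset : node.data.length;
        out += escapeText(node.data.slice(offset, stop));
      }
      if (isEnd) return out;
      next = positionAfter(node);
    } else if (isEnd && offset === range.endOffset) {
      return out; // reached the end boundary inside an element
    } else if (offset < node.childNodes.length) {
      // Walk into the next child, emitting its open tag if it is an element.
      const child = node.childNodes[offset];
      if (child.nodeType === ELEMENT_NODE) out += openTag(child);
      next = { node: child, offset: 0 };
    } else {
      // No children left: emit the close tag and go up to the parent.
      const name = node.nodeName.toLowerCase();
      if (node.nodeType === ELEMENT_NODE && !VOID_ELEMENTS.has(name)) {
        out += '</' + name + '>';
      }
      next = positionAfter(node);
    }
    if (!next) return out; // walked off the top of the tree
    node = next.node;
    offset = next.offset;
  }
  return out;
}
```

In a browser this could be called as serializeRange(window.getSelection().getRangeAt(0)). For the selection in the question it should produce ox</em> jumps over the lazy <strong>d, since the walk leaves the em element without ever having entered it and enters the strong element without leaving it.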

Not much fun, especially if you are going to have to support the very different IE<9 Range model at some point...

(Note also that you won't be able to get the completely raw original HTML, because that information is gone. Only the current DOM tree is stored by the browser, and that means details like tag case, attribute order, whitespace, and omitted implicit tags will differ between the source and what you get out.)


Looking at the APIs, I don't think you can extract the HTML without it being converted to a DocumentFragment, which by default closes any open tags to make the result valid HTML.

See "Converting Range or DocumentFragment to string" for a similar question.
