Chrome extension read innerHTML of the current page?

2023-03-28 09:49

Hi, this may be a silly question, but I can't find the answer anywhere. I'm writing a Chrome extension, and all I need is to read in the HTML of the current page so I can extract some data from it.

Here's what I have so far:

<script>
    window.addEventListener("load", windowLoaded, false);
    function windowLoaded() {
        alert(document.innerHTML)
      });
    }
</script>

Can anybody tell me what I'm doing wrong? Thanks.

function windowLoaded() {
    alert('<html>' + document.documentElement.innerHTML + '</html>');
}
addEventListener("load", windowLoaded, false);

~~Notice how windowLoaded is created before it is used, not after, which won't work.~~

Also notice that I am reading the innerHTML of document.documentElement, which is the <html> element, and then wrapping the result in <html> tags to rebuild the page source.

> I'm writing a chrome extension, all I need is to read in the html of the current page so I can extract some data from it.

I think the important question here is not which code correctly alerts the innerHTML, but how to get the data you need from what has already been rendered.

As pimvdb pointed out, your code isn't working because of a syntax error and because you need document.documentElement.innerHTML, both of which you can diagnose in the Chrome DevTools console (Ctrl+Shift+I). But that's secondary to why you'd want the inner HTML in the first place. Whether you're looking for a certain node, specific text, how many <div> elements exist, the value of an ID, etc., I'd heavily recommend the use of a library like jQuery (vanilla JS works, but it can be verbose and unwieldy). Instead of reading in all the HTML and parsing it with string functions or regex, you probably want to take advantage of all the DOM parsing functionality already available to you.

In other words, something like this:

$("#some_id").val();                      // jQuery
document.getElementById("some_id").value; // vanilla JS

is probably way safer, easier and more readable than something eminently breakable like this (probably a bit off here, but just to make a point):

innerHTML.match(/<[^>]+id="some_id"[^>]+value="(.*?)"[^>]*?>/i)[1];
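A slightly more defensive version of the DOM lookups, as a minimal sketch. The helper names readFieldValue and countElements are made up for illustration, and both take the document as a parameter so they can be tried outside a page as well:

```javascript
// Hypothetical helpers; the names are illustrative, not from any library.

// Return the value of the form field with the given id, or null if absent.
function readFieldValue(doc, id) {
  const el = doc.getElementById(id);
  return el ? el.value : null;
}

// Count the elements matching a CSS selector, e.g. "div".
function countElements(doc, selector) {
  return doc.querySelectorAll(selector).length;
}

// In a page or content script, pass the global document.
if (typeof document !== "undefined") {
  console.log(readFieldValue(document, "some_id"));
  console.log(countElements(document, "div"));
}
```

Returning null for a missing element avoids the TypeError that the one-liners throw when "some_id" is not on the page.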

Use document.documentElement.outerHTML. (Firefox only gained outerHTML support in version 11; that is irrelevant for a Chrome extension.) However, this is still not perfect, as it doesn't return nodes outside the root element (the doctype and possibly some comments or processing instructions). A document.innerHTML property appeared, AFAIK, in early HTML5 drafts, but browsers do not implement it.
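If the doctype matters, one way to get closer to the full source is to rebuild the declaration from document.doctype and prepend it to outerHTML. This is only a sketch: getFullHtml and doctypeToString are hypothetical names, and comments or processing instructions outside the root element are still not included.

```javascript
// Hypothetical helpers; the names are illustrative.

// Rebuild "<!DOCTYPE ...>" from a DocumentType node (name, publicId, systemId).
function doctypeToString(dt) {
  if (!dt) return ""; // the page has no doctype declaration
  let s = "<!DOCTYPE " + dt.name;
  if (dt.publicId) s += ' PUBLIC "' + dt.publicId + '"';
  else if (dt.systemId) s += " SYSTEM";
  if (dt.systemId) s += ' "' + dt.systemId + '"';
  return s + ">";
}

// Doctype (if any) followed by the serialized <html> element.
function getFullHtml(doc) {
  const doctype = doctypeToString(doc.doctype);
  return (doctype ? doctype + "\n" : "") + doc.documentElement.outerHTML;
}

// In a page or content script, pass the global document.
if (typeof document !== "undefined") {
  console.log(getFullHtml(document));
}
```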

Just FYI, navigating to view-source:www.example.com also displays the entire markup (Chrome & Firefox). But I don't know whether you can work with it somehow.

window.addEventListener("load", windowLoaded, false);

function windowLoaded() {
    alert(document.documentElement.innerHTML);
}

You had a } with no purpose, and the }); should just be }. These are syntax errors.

Also, it's document.documentElement.innerHTML, since innerHTML is a property of elements, not of document.
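Since the question is about an extension, note that a script in the extension's own page sees that page's document, not the tab's. A rough sketch of reading the active tab's HTML from a Manifest V3 popup follows. It assumes the manifest grants the "scripting" permission plus "activeTab" or matching host permissions; readActiveTabHtml is a made-up name, and the chromeApi parameter exists only so the function can be tried with a stand-in object.

```javascript
// Sketch for a Manifest V3 extension page such as a popup.
// Assumption: "scripting" and "activeTab" (or host permissions) are declared.
async function readActiveTabHtml(chromeApi = chrome) {
  // Find the tab the user is currently looking at.
  const [tab] = await chromeApi.tabs.query({ active: true, currentWindow: true });
  const [injection] = await chromeApi.scripting.executeScript({
    target: { tabId: tab.id },
    // This function runs inside the page, so it sees that page's document.
    func: () => document.documentElement.outerHTML,
  });
  return injection.result;
}
```

From the popup script, something like readActiveTabHtml().then((html) => console.log(html.length)) would log the size of the page's markup.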
