Fetch multiple, external URLs with GM_xmlhttpRequest, add page <H1> to links?

2023-03-13 04:32

SOLVED thanks to Hellion's help!

Here is the code:

// ==UserScript==
// @name          Facebook Comment Moderation Links
// @description   Appends story titles to Facebook Comment Moderation "Visit Website" links
// @include       http*://developers.facebook.com/tools/*
// @grant         GM_xmlhttpRequest
// ==/UserScript==

var allLinks, thisLink, expr, pageTitle, myURL, myPage, pageContent, title;

// grabbing URLs
function fetchPage(myPage, targetLink) {
        GM_xmlhttpRequest({
            method: 'GET',
            url: myPage,
            onload: function(response){

                // get the HTML content of the page
                pageContent = response.responseText;

                // use regex to extract its h1 tag
                pageTitle = pageContent.match(/<h1.*?>(.*?)<\/h1>/g)[0];

                // strip html tags from the result
                pageTitle = pageTitle.replace(/<.*?>/g, '');

                // append headline to Visit Website link
                title = document.createElement('div');
                title.style.backgroundColor = "yellow";
                title.style.color = "#000";
                title.appendChild(document.createTextNode(pageTitle));
                targetLink.parentNode.insertBefore(title, targetLink.nextSibling);

            }
        }); 
}


function processLinks() {

    // define which links to look for
    expr = "//a[contains (string(), 'Visit Website')]";
    allLinks = document.evaluate(
        expr,
        document,
        null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE,
        null);

    // loop through the links
    for (var i = 0; i < allLinks.snapshotLength; i++) {
        thisLink = allLinks.snapshotItem(i);    
        myURL = thisLink.getAttribute('href');

        // follow Visit Website link and attach corresponding headline
        fetchPage(myURL, thisLink);
    }
}

// get the ball rolling
processLinks();
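
One caveat with the regex step in the script above: match() returns null when a page has no h1, so the [0] lookup throws a TypeError, and "." does not match line breaks, so a headline that spans several lines is missed. A minimal sketch of a more defensive extraction follows. The helper name extractH1 is hypothetical and not part of the original script, and the function does not decode HTML entities.

```javascript
// Hypothetical helper, not part of the original script.
// Returns the text of the first <h1> in an HTML string, or null if none is found.
function extractH1(html) {
    // [\s\S] matches across line breaks, which "." does not.
    var match = /<h1\b[^>]*>([\s\S]*?)<\/h1>/i.exec(html);
    if (!match) {
        return null; // no <h1> in the response
    }
    // Strip nested tags, then collapse runs of whitespace.
    return match[1].replace(/<[^>]*>/g, '').replace(/\s+/g, ' ').trim();
}
```

Inside onload, the caller could then skip the link when extractH1(response.responseText) returns null instead of letting the callback throw.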

--- EARLIER STUFF BELOW ---

I am trying to make a Greasemonkey script that fetches the URL from each of a set of links and appends the contents of the page's h1 tag to the end of the link.

So far, I can get it to show the URL itself, which doesn't require a page request, but not the page's h1 tag contents, which does.

I understand from other questions on this site that GM_xmlhttpRequest is asynchronous and I am pretty sure this is at least part of the cause. However I cannot find the solution to this specific problem.

Below is the code I have so far. It is for Facebook's website comment moderation tool -- in the Moderator View, each comment has a link, "Visit Website," that takes you to the article the comment is on.

As it is written right now, it would append the HTTP status code, not the page title, and then the URL to each "Visit Website" link. The status code part is just a placeholder. I plan on adding the HTML parsing, etc. to get the h1 tag later.

Right now I am just trying to get the GM_xmlhttpRequest and the content insertion to match up.

Any help in sorting this out would be greatly appreciated. Thank you!

var allLinks, thisLink, expr, pageTitle, myURL, pageContent, title;

// define which links to process
    expr = "//a[contains (string(), 'Visit Website')]";
    allLinks = document.evaluate(
        expr,
        document,
        null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE,
        null);

// cycle through links
for (var i = 0; i < allLinks.snapshotLength; i++) {

    thisLink = allLinks.snapshotItem(i);    
    myURL = thisLink.getAttribute('href');

    GM_xmlhttpRequest({
        method: 'GET',
        url: myURL,
        onload: function(responseDetails){

            pageTitle = responseDetails.status;

        }
    });

    // append info to end of each link 
    title = document.createElement('div');
    title.style.backgroundColor = "yellow";
    title.style.color = "#000";
    title.appendChild(document.createTextNode(
        ' [' + pageTitle + ' - ' + thisLink.getAttribute('href') + ']'));
    thisLink.parentNode.insertBefore(title, thisLink.nextSibling);  

}

As it's written, yes, you are running into the asynchronous nature of the GM_xmlhttpRequest() call. The loop fires off every request but continues immediately, without waiting for any of them to complete, so pageTitle has not been assigned yet (it is still undefined) when you use it for the text node.
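
The ordering problem can be reproduced without a browser. The sketch below swaps GM_xmlhttpRequest for a stand-in, fakeRequest, which is an assumption made purely for illustration: it only queues the callback, mimicking how the real call returns before any response has arrived.

```javascript
// Stand-in for GM_xmlhttpRequest (an assumption for illustration only):
// it queues the callback instead of performing a network request.
var pending = [];
function fakeRequest(details) {
    pending.push(details.onload);
}

var pageTitle;        // shared variable, as in the question's code
var seenInLoop = [];  // what the loop body observes on each pass

['a', 'b'].forEach(function (url) {
    fakeRequest({
        url: url,
        onload: function (response) { pageTitle = response.status; }
    });
    // Executes immediately, before any onload callback has run.
    seenInLoop.push(pageTitle);
});

// Only now do the "responses" arrive and the callbacks fire.
pending.forEach(function (onload) { onload({ status: 200 }); });
```

Every entry recorded in seenInLoop is undefined, because pageTitle is only assigned once the queued callbacks run after the loop has finished.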

The first step to rectify the situation is to move everything that currently follows the GM_xmlhttpRequest() call inside the onload function. That way, each link is modified only after its page has been retrieved. (There may be other issues with needing to pass in or reacquire the thisLink value too, I'm not sure.)
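
A rough sketch of that restructuring, under some assumptions: annotateLinks is a hypothetical name, the request function is passed in as a parameter (in a userscript it would be GM_xmlhttpRequest) so the logic can be exercised outside the browser, and render stands for whatever DOM insertion you want to perform. Because each link is a function parameter of the forEach callback, every onload keeps a reference to its own link, which addresses the thisLink concern.

```javascript
// Hypothetical helper: `request` would be GM_xmlhttpRequest in a userscript;
// `render` is whatever code appends the result next to the link.
function annotateLinks(links, request, render) {
    links.forEach(function (link) {
        request({
            method: 'GET',
            url: link.href,
            onload: function (response) {
                // Runs later, once this particular request has finished.
                // `link` is captured per iteration, so it still refers to
                // the element this request was started for.
                render(link, response.status);
            }
        });
    });
}
```

Responses may arrive in any order; since render is called from inside each callback with its own link, the result still lands next to the right element.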

If jQuery is loaded in the script (for example through an @require line in the metadata block), you can replace the following three statements:

            // get the HTML content of the page
            pageContent = response.responseText;

            // use regex to extract its h1 tag
            pageTitle = pageContent.match(/<h1.*?>(.*?)<\/h1>/g)[0];

            // strip html tags from the result
            pageTitle = pageTitle.replace(/<.*?>/g, '');

with a single line:

            pageTitle = $('h1', response.response).text();
