Generate a list of all the pages contained in a website programmatically, using PHP

How is it possible to generate a list of all the pages of a given website programmatically using PHP?

What I'm basically trying to achieve is to generate something like a sitemap: a nested unordered list with links to all the pages contained in a website.


If all pages are linked to one another, then you can use a crawler or spider to do this.
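As a rough illustration of the crawler idea, here is a minimal breadth-first sketch. The names `crawlSite`, `resolveLink`, and the `$fetch` callable are made up for this example. The link extraction uses a simple regular expression, and relative links are resolved naively (no `../` normalization, ports ignored), so treat it as a starting point rather than a finished spider.

```php
<?php
// Resolve an href found on $baseUrl into an absolute URL, or null if it
// should be skipped. Simplified: "../" segments and ports are not handled.
function resolveLink(string $baseUrl, string $href): ?string
{
    $href = trim($href);
    if (preg_match('#^https?://#i', $href)) {
        return $href; // already absolute
    }
    if ($href === '' || preg_match('#^[a-z][a-z0-9+.-]*:#i', $href) || strpos($href, '//') === 0) {
        return null; // mailto:, javascript:, protocol-relative links, etc.
    }
    $parts = parse_url($baseUrl);
    $root = $parts['scheme'] . '://' . $parts['host'];
    if ($href[0] === '/') {
        return $root . $href; // root-relative link
    }
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $href; // relative to the current directory
}

// Breadth-first crawl starting at $startUrl, staying on the same host.
// $fetch is a hypothetical callable: it takes a URL and returns the page
// HTML as a string, or null if the page could not be loaded.
function crawlSite(string $startUrl, callable $fetch, int $maxPages = 500): array
{
    $host = parse_url($startUrl, PHP_URL_HOST);
    $queue = [$startUrl];
    $seen = [$startUrl => true];

    while ($queue !== []) {
        $url = array_shift($queue);
        $html = $fetch($url);
        if ($html === null) {
            continue;
        }
        // Simplification: grab quoted href values from <a> tags, minus fragments.
        preg_match_all('/<a\s[^>]*href=["\']([^"\'#]+)/i', $html, $matches);
        foreach ($matches[1] as $href) {
            $link = resolveLink($url, $href);
            if ($link === null || parse_url($link, PHP_URL_HOST) !== $host) {
                continue; // unusable or external link
            }
            if (!isset($seen[$link]) && count($seen) < $maxPages) {
                $seen[$link] = true;
                $queue[] = $link;
            }
        }
    }
    return array_keys($seen);
}
```

For a real site, `$fetch` could be something like `fn($url) => @file_get_contents($url) ?: null`, or a small wrapper around cURL. Passing the fetcher in as a callable also makes it easy to try the crawler against an in-memory array of pages first.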

If some pages are not linked from anywhere, you will need to come up with another method. You can try this:

  1. Add an "image bug" (also called a web beacon or web bug) to each page you want tracked, i.e. a tiny image whose URL points at /scripts/logger.php,
    OR
    alternatively add a JavaScript function to each page that makes a call to /scripts/logger.php. Any JavaScript library such as jQuery, MooTools, or YUI makes this super simple.
  2. Create the logger.php script and have it save the request's originating URL somewhere, such as a file or a database.
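A minimal sketch of what step 2 could look like, assuming the log is a plain text file and that the browser sends the tracked page's URL in the Referer header (it usually does for image and script requests, but it can be stripped). The function names `logPageVisit` and `listLoggedPages` and the log file location are invented for this example.

```php
<?php
// Append the URL of the page that triggered the beacon request to a log file.
// $server is expected to look like $_SERVER; returns false if nothing was logged.
function logPageVisit(array $server, string $logFile): bool
{
    // Browsers normally send the embedding page's URL as the Referer header.
    $page = $server['HTTP_REFERER'] ?? '';
    if ($page === '') {
        return false;
    }
    // Strip line breaks so each visit occupies exactly one log line.
    $page = str_replace(["\r", "\n"], '', $page);
    return file_put_contents($logFile, $page . PHP_EOL, FILE_APPEND | LOCK_EX) !== false;
}

// Read the log back as a sorted list of distinct page URLs.
function listLoggedPages(string $logFile): array
{
    if (!is_file($logFile)) {
        return [];
    }
    $lines = file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $pages = array_values(array_unique($lines));
    sort($pages);
    return $pages;
}

// In logger.php itself you would call something like:
// logPageVisit($_SERVER, __DIR__ . '/pages.log');
```

The log grows by one line per page view, so for a busy site a database table with a unique index on the URL is probably a better fit than a flat file.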

Pros:

  • Fairly simple

Cons:

  • Requires edits to each page
  • Pages that aren't visited don't get logged

Some other techniques that don't really fit your need to do it programmatically, but may be worth considering, include:

  • Create a spider or crawler
  • Use a ripper such as cURL or Teleport Plus
  • Use Google Analytics (similar to the image bug technique)
  • Use a log analyzer like Webstats or a freeware UNIX webstats analyzer
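If you would rather stay in PHP than install a log analyzer, the same idea can be sketched by reading the web server's access log directly. This assumes the Apache common/combined log format, where the request appears as `"GET /path HTTP/1.1" 200`. The function name `pagesFromAccessLog` is made up for this example.

```php
<?php
// Collect the distinct paths that were requested successfully (status 200)
// from access log lines in Apache common/combined log format.
function pagesFromAccessLog(iterable $lines): array
{
    $pages = [];
    foreach ($lines as $line) {
        // Captures the request target and the status code that follows it.
        if (preg_match('/"(?:GET|HEAD) (\S+) HTTP\/[\d.]+" (\d{3})/', $line, $m) && $m[2] === '200') {
            // Drop the query string so /about?x=1 and /about count as one page.
            $path = parse_url($m[1], PHP_URL_PATH);
            if (is_string($path)) {
                $pages[$path] = true;
            }
        }
    }
    $paths = array_keys($pages);
    sort($paths);
    return $paths;
}
```

You could feed it `file('/path/to/access.log')`, with the path depending on your server setup. Like the image bug approach, it only finds pages that somebody actually requested, and it will also pick up images and stylesheets unless you filter by extension.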


You can easily list the files with the glob function. But if the pages use includes/requires and other mechanisms to combine multiple files into "one page", you'll need to import the Google "site:mysite.com" search results, or just create a table with the URL of every page :P
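A rough sketch of the glob approach, producing the nested unordered list the question asks for. It assumes every page is a file under one directory and that file paths map directly to URLs. The names `listPages` and `renderList` and the default extension list are invented for this example.

```php
<?php
// Recursively collect page files under $dir as a nested array:
// directories map to sub-arrays, page files map to null.
function listPages(string $dir, array $extensions = ['php', 'html', 'htm']): array
{
    $tree = [];
    // glob() returns entries sorted alphabetically; false or empty becomes [].
    $entries = glob(rtrim($dir, '/') . '/*') ?: [];
    foreach ($entries as $entry) {
        $name = basename($entry);
        if (is_dir($entry)) {
            $children = listPages($entry, $extensions);
            if ($children !== []) {
                $tree[$name] = $children; // keep only directories that contain pages
            }
        } elseif (in_array(strtolower(pathinfo($name, PATHINFO_EXTENSION)), $extensions, true)) {
            $tree[$name] = null;
        }
    }
    return $tree;
}

// Render the tree from listPages() as nested <ul> markup with links.
function renderList(array $tree, string $baseUrl = ''): string
{
    $html = '<ul>';
    foreach ($tree as $name => $children) {
        $name = (string) $name; // numeric names become int array keys
        $url = $baseUrl . '/' . rawurlencode($name);
        if (is_array($children)) {
            $html .= '<li>' . htmlspecialchars($name) . renderList($children, $url) . '</li>';
        } else {
            $html .= '<li><a href="' . htmlspecialchars($url) . '">' . htmlspecialchars($name) . '</a></li>';
        }
    }
    return $html . '</ul>';
}
```

Usage would be along the lines of `echo renderList(listPages($_SERVER['DOCUMENT_ROOT']));`. Note that this also lists include files and templates that are not meant to be requested directly, so you may need to exclude some directories.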

Maybe this can help: http://www.xml-sitemaps.com/ (SiteMap Generator)
