
How to track links within a website

I am transferring an old website to a new server and attempting some cleanup in the process.

What I am looking for is some script or free software that can:

a) show the paths through the website (following hyperlinks, etc), so I can see what links to what

and b) some software that can see which HTML files are orphans (not linked to) in the folder structure.

Any help with either or both of these would be greatly appreciated :)


http://haveamint.com/ says it all: beautiful GUI, simple integration, lightweight, database storage, JavaScript tracking.

Have a mint (y)

Or you can just use Google Analytics, which is used by pretty much every site these days.


a) show the paths through the website (following hyperlinks, etc), so I can see what links to what

So basically a crawler? You could whisk something together with an HTTP library, an HTML parser, and any scripting language. I don't know of any off-the-shelf scripts, though.
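As a rough illustration of the "HTTP library plus HTML parser" idea, here is a minimal sketch using only the Python standard library. The names `LinkCollector`, `fetch_html`, and `crawl` are made up for this example, and it assumes a small static site where every page is reachable through plain `<a href>` links on a single host. It returns a mapping of "what links to what".

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Download a page and decode it as text (bad bytes are replaced)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html):
    """Breadth-first crawl of one host.

    Returns {page_url: set of same-host URLs that page links to}.
    `fetch` is injectable so the crawler can be tried without a network.
    """
    host = urlparse(start_url).netloc
    graph = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in graph:
            continue  # already visited
        try:
            html = fetch(url)
        except OSError:
            graph[url] = set()  # unreachable page: keep it, with no outgoing links
            continue
        parser = LinkCollector()
        parser.feed(html)
        links = set()
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" parts.
            absolute = urldefrag(urljoin(url, href)).url
            if urlparse(absolute).netloc == host:
                links.add(absolute)
        graph[url] = links
        queue.extend(link for link in links if link not in graph)
    return graph
```

Calling `crawl("http://localhost/")` (the URL is a placeholder) would give a dictionary you can print or dump to a file. This sketch ignores robots.txt, redirects to other hosts, and links generated by JavaScript, so treat the result as a starting point rather than a complete map.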

and b) some software that can see which HTML files are orphans (not linked to) in the folder structure.

Does your site consist of plain HTML files, or is there some sort of server-side technology, such as PHP? If it is server-side, there is no way of automatically detecting said orphans, since the pages are generated by the server-side application and aren't actual files, even though they may appear as such in a browser.
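For the plain-HTML case, one possible approach is a set difference: list the `.html` files on disk and subtract everything a crawl found links to. The sketch below assumes the document root maps directly onto the base URL and that `index.html` is served for a directory URL. Both are assumptions about the server setup, and `find_orphans` is a hypothetical helper name.

```python
from pathlib import Path


def find_orphans(doc_root, base_url, linked_urls):
    """Return the .html files under doc_root that no crawled page links to.

    linked_urls is a set of absolute URLs collected by a crawler.
    Paths are returned relative to doc_root, with forward slashes.
    """
    root = Path(doc_root)
    base = base_url.rstrip("/") + "/"
    orphans = []
    for path in sorted(root.rglob("*.html")):
        rel = path.relative_to(root).as_posix()
        candidates = {base + rel}
        if path.name == "index.html":
            # Assumption: "dir/index.html" is also served as "dir/".
            candidates.add(base + rel[: -len("index.html")])
        if not candidates & linked_urls:
            orphans.append(rel)
    return orphans
```

The start page of the crawl should be included in `linked_urls`, otherwise the home page itself would be reported as an orphan.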


a) Depending on the complexity of your site and how dynamic the content is, you can download any spider, restrict it to your website, and check the results ("Burp Suite" contains a pretty good spider and is altogether a tool that everyone should know).

b) After the spider has done its work, check the access time of all the files in your website's directory. Any file with an access time older than the spider's execution time is probably an orphan.

(Both solutions will be less effective on a website that uses user input to refer to pages.)
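The access-time check in b) could be sketched like this. `not_accessed_since` is a hypothetical helper, and the approach only works if the filesystem actually records access times: mounts using `noatime` will not update them at all, and `relatime` updates them only some of the time, so check this before trusting the output.

```python
from pathlib import Path


def not_accessed_since(doc_root, since_timestamp, pattern="*.html"):
    """List files whose last access time is older than since_timestamp.

    since_timestamp is seconds since the epoch, e.g. time.time() recorded
    just before the spider was started.
    """
    root = Path(doc_root)
    stale = []
    for path in sorted(root.rglob(pattern)):
        if path.is_file() and path.stat().st_atime < since_timestamp:
            stale.append(path.relative_to(root).as_posix())
    return stale
```

A typical run would record `start = time.time()`, let the spider finish, and then call `not_accessed_since("/path/to/docroot", start)`, where the path is a placeholder for your own document root.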


Xenu's Link Sleuth (home.snafu.de/tilman/xenulink.html) provides link spidering and, with FTP access, orphan file checking.
