How can I find files that aren't needed on my site so I can delete them?
I'm developing a website, and after testing different ways to do things, I know that I have many files on my site that are not being used, including HTML/PHP files, images, stylesheets, and external scripts. Is there some program I can use or something so I can find all of the files that I don't need so I can delete them?
I need to find all files that are safe to delete, don't have anything to do with the site anymore, and that deleting them won't have any effect on how my site works.
I've tried finding orphaned files in Dreamweaver, but it lists a lot of files that I do actually need.
Here's one idea: Crawl the site and create a list of every file you can find, then check anything that's not on that list. Wikipedia has a list of crawlers including some open source ones.
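The comparison step could look roughly like the sketch below. It assumes you already have the crawler's output as a list of URLs and a local copy of the site; the names `crawled_to_paths` and `find_uncrawled` are made up for illustration, and treating a directory URL as `index.html` is an assumption about how your server is configured.

```python
import os
from urllib.parse import urlparse, unquote


def crawled_to_paths(urls):
    """Turn crawled URLs into site-relative file paths."""
    paths = set()
    for url in urls:
        path = unquote(urlparse(url).path).lstrip("/")
        # Assumption: a directory URL is served from index.html.
        if path == "" or path.endswith("/"):
            path += "index.html"
        paths.add(path)
    return paths


def find_uncrawled(site_root, urls):
    """Return files under site_root that the crawler never reached."""
    seen = crawled_to_paths(urls)
    candidates = []
    for dirpath, _dirs, files in os.walk(site_root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), site_root)
            rel = rel.replace(os.sep, "/")  # normalise Windows separators
            if rel not in seen:
                candidates.append(rel)
    return sorted(candidates)
```

Anything it returns is only a candidate: a crawler cannot see files that are pulled in server-side (for example PHP includes) or requested by scripts at runtime, so review the list before deleting.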
Xenu's Link Sleuth is the easiest way I've found.
http://home.snafu.de/tilman/xenulink.html
After you do the scan you have the option to put in your FTP info. If you do so, it will also generate a list of files that are not accessible (orphans).
How would you qualify unnecessary? That's something you need to be sure of before beginning this. I guess one way to garbage collect your site is to delete files not being referenced by any other files.
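As a rough illustration of "not referenced by any other file", here is a minimal sketch that lists files whose name never appears in the text of another file. It is a crude name match, not a real parser, and the function name `unreferenced_files` and the extension list are just assumptions for the example.

```python
import os

# Assumption: these are the file types that can reference other files.
TEXT_EXTENSIONS = {".html", ".htm", ".php", ".css", ".js"}


def unreferenced_files(site_root):
    """List files whose basename is not mentioned in any other text file."""
    all_files = []
    for dirpath, _dirs, files in os.walk(site_root):
        for name in files:
            all_files.append(os.path.join(dirpath, name))

    # Read every text-like file once so each one can be searched repeatedly.
    sources = {}
    for path in all_files:
        if os.path.splitext(path)[1].lower() in TEXT_EXTENSIONS:
            with open(path, encoding="utf-8", errors="ignore") as handle:
                sources[path] = handle.read()

    unreferenced = []
    for path in all_files:
        name = os.path.basename(path)
        mentioned = any(
            name in text for other, text in sources.items() if other != path
        )
        if not mentioned:
            rel = os.path.relpath(path, site_root).replace(os.sep, "/")
            unreferenced.append(rel)
    return sorted(unreferenced)
```

Entry pages such as `index.html` will show up in the result because nothing links to them by name, and file names that are built dynamically in code will be missed, so treat the output as a list to review rather than a list to delete.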
The crawler idea from @Brendan, building a list of every file that is actually used, is a very nice one.
Once you have deleted files from your website, run a link checker such as Xenu or LinkTiger (or whichever one you prefer) to find any broken links.
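If you keep a local copy of the site, a quick check after deleting could be sketched like this. It is not how Xenu or LinkTiger work; it only looks at `href` and `src` attributes in one HTML file and checks that local targets still exist on disk. `LinkCollector` and `broken_local_links` are hypothetical names for the example.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key in ("href", "src") and value:
                self.links.append(value)


def broken_local_links(site_root, html_file):
    """Return links in html_file that point at missing local files."""
    full_path = os.path.join(site_root, html_file)
    with open(full_path, encoding="utf-8", errors="ignore") as handle:
        parser = LinkCollector()
        parser.feed(handle.read())

    base_dir = os.path.dirname(full_path)
    broken = []
    for link in parser.links:
        parsed = urlparse(link)
        if parsed.scheme or parsed.netloc:
            continue  # external link (http:, mailto:, //host/...): not checked
        path = unquote(parsed.path)
        if not path:
            continue  # fragment-only link such as "#top"
        if path.startswith("/"):
            target = os.path.join(site_root, path.lstrip("/"))
        else:
            target = os.path.join(base_dir, path)
        if not os.path.exists(target):
            broken.append(link)
    return broken
```

Run it over each page before uploading the cleaned-up copy; links generated by PHP or JavaScript at runtime are outside what this can see.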
You can connect with an FTP client and delete files manually. This is the safest way, because scripts and programs don't know what is needed and what isn't.
This did not exist at the time this question was asked, but there is a Python script called weborphans designed for this purpose.
Here's a blog entry by the author with some more info: Finding orphaned files on websites