
PHP file got executed by the Alexa crawler and caused problems!

I've written a script that releases new pages automatically at a particular time. It shows a countdown timer, and when the timer reaches 0 it renames the current index.php to index-modified.php and renames a particular file to index.php.

There's no problem with this. But at some point my customer told me that the site was not loading. I found that index.php had been renamed to index-modified.php, while all other pages were working fine. Without index.php, the site was showing a 404 error.

Then I analyzed the access log and found that the Alexa crawler had accessed the release script, and that caused the problem.

I want to know how the Alexa crawler found my internal script file and crawled it. Will this happen to all my internal admin files? I don't have links to that script on any of my pages.

I wonder how it could find the files that are present inside my server?


I wonder how it could find the files that are present inside my server?

Probably because someone who accessed those files used the Alexa Toolbar, which reports the URLs a user visits back to Alexa.

It only managed to do this because there are two things wrong with the script.

  1. It is not protected with an authentication/authorization layer.

  2. It makes a significant change on the server in response to a GET request. The HTTP spec provides GET for "safe" requests and POST for requests which do something.
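Both points can be addressed with a small guard at the top of the release script. The following is only a minimal sketch, not the original script: the function name `release_guard_status` and the idea of a shared secret token are assumptions made for illustration, and a real deployment would more likely rely on a proper login or HTTP authentication.

```php
<?php
// Hypothetical guard for the release script; all names are illustrative.

/**
 * Returns the HTTP status code the release script should answer with:
 * 200 when the rename may proceed, 405 for any non-POST method,
 * 403 when the supplied token does not match the expected one.
 */
function release_guard_status(string $method, string $suppliedToken, string $expectedToken): int
{
    // Crawlers and toolbars issue GET/HEAD requests, so they stop here.
    if (strtoupper($method) !== 'POST') {
        return 405;
    }

    // Refuse an unset secret, and compare in constant time.
    if ($expectedToken === '' || !hash_equals($expectedToken, $suppliedToken)) {
        return 403;
    }

    return 200;
}
```

In the release script, one would call it with `$_SERVER['REQUEST_METHOD']` and the posted token (for example `$_POST['token'] ?? ''`), pass any result other than 200 to `http_response_code()`, and exit before the rename code runs.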


index.php is the default PHP script name in a directory. It will be executed when you navigate to the directory without giving a filename.

To solve this, use POST to invoke the modifications. If you can't do that, then at least give the script a name that is unlikely to be guessed.


You should use robots.txt to disallow spiders from crawling:

User-agent: *
Disallow: /index.php


If your script is located within the htdocs folder (for Apache), chances are that crawlers will find it and try to crawl it. What you can do is:

1) Put a rule in robots.txt. You can learn more about it here: http://www.javascriptkit.com/howto/robots.shtml

This will advise crawlers not to request the script, but it won't prevent them from doing so.

2) Put the script in a subfolder and protect it with a password. This is the best option in your case: what you REALLY don't want is random visitors or spiders disabling your web site. More about how to do that easily with .htaccess here:

http://www.javascriptkit.com/howto/htaccess3.shtml
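As a rough illustration of option 2, a .htaccess file placed in the protected subfolder might look like the fragment below. This is a sketch that assumes Apache with Basic authentication enabled and .htaccess overrides allowed; the realm name and the password file path are placeholders, not values from the original setup.

```apache
# Hypothetical .htaccess for the subfolder that holds the release script.
# Requests without valid credentials get a 401 before PHP ever runs.
AuthType Basic
AuthName "Release tools"
# Placeholder path: keep the password file outside the web root.
AuthUserFile /path/to/.htpasswd
Require valid-user
```

The password file can be created with Apache's `htpasswd` utility, for example `htpasswd -c /path/to/.htpasswd someuser`, where `someuser` is again a placeholder.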

Wish you best of luck, Marin
