开发者

how do I block my rails app from being hit by bots?

I'm not even sure I'm using the right terminology, whether this is actually bots or not. I didn't want to use the word 'spam' because it's not like I have comments or posts that are being created/spammed. It looks more like something is making the same repeated request to my domain, which is what made me think it was some kind of bot.

I've opened my first rails app to the 'public', which is a really a small group of users, <50 currently. That was last Friday. I started having performance issues today, so I looked at the log and I see tons of these RoutingErrors

ActionController::RoutingError (No route matches "/portalApp/APF/pages/business/util/whichServer.jsp" with {:method=>:get}):

They are filling up the log and I'm assuming this is causing the slowdown. Note the .jsp on the end and this is a rails app, so I've got no urls remotely like this in my app. I mean, the /portalApp I don't even have, so I don't know where this is coming from.

This is hosted at Dreamhost and I chatted with one of their support people, and he suggested a couple sites that detail using htaccess to block things. But that looks like you need to know the IP or domain that the requests are coming from, which I don't.

How can I block this? How can I find the IP or domain from the request? Any other suggestions?


Follow up info:

After looking at the access logs, it looks like it's not a bot. Maybe I'm not reading the logs right, but there are valid url requests (generated from within my Flex app) coming from the same IP. So now I'm wondering if it's some kind of plugin generating the requests, but I really don't know. Now I'm wondering if it's possible to block a certain 开发者_JAVA技巧url request, based on a pattern, but I suppose that's a separate question.


Old question, but for people who are still looking for alternatives I suggest checking out Kickstarter's rack-attack gem. Allows not only blacklisting and whitelisting, but also throttling.


These page seems to offer some good advice: Here

The section on blocking by user agent may be something you could look at implementing. Is there anyway you can get the useragent from the bot from your logs? If so look for the unique aspect of the useragent that probably identifies the bot and add the following to .htaccess replacing the relevant bits

BrowserMatchNoCase SpammerRobot bad_bot
Order Deny,Allow
Deny from env=bad_bot

Its detail on that link in more detail and of course, if you can't get the useragent from your logs then this will be of no use to you!


You can also update your public/robots.txt file to allow/disallow robots.

http://www.robotstxt.org/wc/robots.html

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜