Web client classification using an Artificial Neural Network
I have a high-traffic web site.
I want to create software which analyses client requests on the fly and decides whether each one comes from a real user or a botnet bot. To train the neural network to recognise legitimate ("good") users, I can use logs from periods with no DDoS activity. Once trained, the network would distinguish real users from bots.

What I have:
- request URI (and order)
- cookie
- user agent
- request frequency.
Any ideas on how to best design ANN for this task and how to tune it?
Edit: [in response to comments about the overly broad scope of this question]
I currently have a working C# program which blocks clients on the basis of the frequency of identical requests. Now I'd like to improve its "intelligence" with a classifier based on a neural network.
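For context, the kind of frequency-based blocking described above can be sketched as a sliding-window counter. This is a hypothetical illustration (class name, window size, and limit are my assumptions, not the actual C# program):

```python
import time
from collections import defaultdict, deque

class FrequencyBlocker:
    """Hypothetical sketch: block a client once it repeats the same
    request more than `limit` times within a `window`-second span."""

    def __init__(self, window=10.0, limit=20):
        self.window = window
        self.limit = limit
        # (client_ip, uri) -> deque of request timestamps
        self.history = defaultdict(deque)

    def allow(self, client_ip, uri, now=None):
        now = time.time() if now is None else now
        q = self.history[(client_ip, uri)]
        # Drop timestamps that have fallen out of the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()
        q.append(now)
        return len(q) <= self.limit
```

The point of the question is to replace (or augment) this hard threshold with a learned classifier.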
I don't know how to normalize these inputs for an ANN, and I need suggestions in this specific area.

This isn't really suited to neural networks. Neural networks are great provided (as a rough guide):
- you can spare the processing power,
- the data is not temporal, and
- the input data is finite.
I don't think your problem really satisfies any of these.
Re: normalizing the inputs: you either map your input data to a set of symbols (which are then turned into numbers), or you map each input to a floating-point number representing some degree of intensity. You can map any kind of data to any kind of scheme, but you really only want to use ANNs when the problem is not linearly separable (i.e. the data points for one class CAN'T all be clustered on one side of a line, with all the data points for the other class on the other side). In either case you end up with a vector of inputs associated with an output ([BOT, HUMAN], or [BOT, HUMAN, UNKNOWN], or [BOT, PROBABLY-BOT, PROBABLY-HUMAN, HUMAN], etc.).
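As a concrete illustration of both mappings, here is a minimal sketch that turns the asker's raw features (user agent, cookie, request frequency) into a fixed-length vector of floats in [0, 1]. The bucket count and the frequency saturation point are illustrative assumptions, not tuned values:

```python
import hashlib

UA_BUCKETS = 8      # assumption: hash user agents into a small one-hot vector
MAX_FREQ = 50.0     # assumption: requests/minute treated as saturation point

def ua_one_hot(user_agent):
    """Symbolic mapping: hash the user-agent string to one of UA_BUCKETS
    symbols, encoded as a one-hot vector of floats."""
    digest = hashlib.md5(user_agent.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % UA_BUCKETS
    return [1.0 if i == bucket else 0.0 for i in range(UA_BUCKETS)]

def normalize(user_agent, has_cookie, req_per_min):
    """Intensity mapping: clip frequency at MAX_FREQ and scale to [0, 1];
    encode the cookie as a 0/1 flag; concatenate with the one-hot UA."""
    freq = min(req_per_min, MAX_FREQ) / MAX_FREQ
    return ua_one_hot(user_agent) + [1.0 if has_cookie else 0.0, freq]
```

The resulting vector can be fed to any standard feed-forward classifier, with the output layer sized to the chosen label set (e.g. two units for [BOT, HUMAN]).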
How do you distinguish between two users who coincidentally submit the exact same book request sequentially in time (let's assume you are selling books)?