"Learning" filter engines
Are there any "intelligent" or "learning" engines out there that can identify "evil" phrases in text (something like a learning spam filter, e.g. the one used in Thunderbird)?
For example, if I want to filter texts containing email addresses:
asdasd asd as d dgfdgfdgfdg sadasd(at)asfsdf.com
At first the tool wouldn't recognize this as an email address. But if the user has "taught" the tool several times (by clicking a "text contains an email address" button, for example) that text containing phrases like "xxxxx(at)xxxxx.xx" is suspicious, it "learns" to mark such texts automatically in the future.
Question: Is there anything like this on the market? I found some libraries (like SpamAssassin, etc.), but these are specialized for email.
The general idea you are talking about is a Bayesian filter. Maybe that will help you in your searches.
Edit: A few other examples:
- Python
- Java
- .NET
- Ruby
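To make the idea concrete, here is a minimal sketch of a naive Bayes filter that learns from user feedback, in plain Python with no external library. The names `FeedbackFilter`, `teach`, `is_suspicious` and `shapes` are made up for this illustration. Reducing each token to a coarse "shape" (so that "sadasd(at)asfsdf.com" becomes "x(x)x.x") is my own assumption about how to generalize to patterns like "xxxxx(at)xxxxx.xx"; a plain Bayesian spam filter would normally count the words themselves.

```python
import math
import re
from collections import Counter


def shapes(text):
    """Map each whitespace-separated token to a coarse shape,
    e.g. 'bob(at)site.com' -> 'x(x)x.x' (assumed feature scheme)."""
    result = []
    for token in text.split():
        shape = re.sub(r"[A-Za-z]+", "x", token)  # any run of letters -> x
        shape = re.sub(r"[0-9]+", "9", shape)     # any run of digits  -> 9
        result.append(shape)
    return result


class FeedbackFilter:
    """Hypothetical two-class naive Bayes filter trained by user clicks."""

    def __init__(self):
        self.token_counts = {"flagged": Counter(), "clean": Counter()}
        self.doc_counts = {"flagged": 0, "clean": 0}

    def teach(self, text, flagged):
        """Record one piece of user feedback for a text."""
        label = "flagged" if flagged else "clean"
        self.doc_counts[label] += 1
        self.token_counts[label].update(shapes(text))

    def _log_score(self, text, label):
        counts = self.token_counts[label]
        total = sum(counts.values())
        vocab = len(set(self.token_counts["flagged"]) | set(self.token_counts["clean"]))
        # Class prior with add-one smoothing.
        score = math.log((self.doc_counts[label] + 1) / (sum(self.doc_counts.values()) + 2))
        for shape in shapes(text):
            # Add-one smoothing; the extra +1 leaves room for never-seen shapes.
            score += math.log((counts[shape] + 1) / (total + vocab + 1))
        return score

    def is_suspicious(self, text):
        """True if the text looks more like the flagged examples than the clean ones."""
        return self._log_score(text, "flagged") > self._log_score(text, "clean")
```

Each click on the "text contains an email address" button would correspond to `teach(text, True)`, and texts the user leaves alone could be fed in with `teach(text, False)`. After a handful of examples of both kinds, `is_suspicious` starts flagging new texts that contain the same token shape.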
Yeah, this seems to be a good start: http://nbayes.codeplex.com/ (a C# implementation of the Bayesian algorithm)