What's the best, easiest, free way to check in Java if a piece of text is spam? [closed]
What's the best, easiest, free way to check in Java if a piece of text is spam?
It's not easy at all if you do it yourself, and it requires some background in probability and statistics. One well-known method is Bayesian filtering; it's just one of several approaches, but it works great.
Wikipedia has an introduction and some background, and the topic is covered extensively elsewhere on the internet, so just search around (here on Stack Overflow too, I think).
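To give a feel for what Bayesian filtering means in practice, here is a minimal sketch of a naive Bayes classifier with add-one (Laplace) smoothing. The class name NaiveBayesSpamFilter, its methods, and the simple tokenizer are made up for illustration; a real filter would need a large labelled corpus and better tokenization.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical class for illustration; not taken from any library.
final class NaiveBayesSpamFilter {
    private final Map<String, Integer> spamCounts = new HashMap<>();
    private final Map<String, Integer> hamCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int spamTokens, hamTokens, spamDocs, hamDocs;

    /** Adds one labelled example (spam or not spam) to the training data. */
    void train(String text, boolean spam) {
        Map<String, Integer> counts = spam ? spamCounts : hamCounts;
        int seen = 0;
        for (String token : tokenize(text)) {
            counts.merge(token, 1, Integer::sum);
            vocabulary.add(token);
            seen++;
        }
        if (spam) {
            spamTokens += seen;
            spamDocs++;
        } else {
            hamTokens += seen;
            hamDocs++;
        }
    }

    /** Returns the estimated probability, between 0 and 1, that the text is spam. */
    double spamProbability(String text) {
        if (spamDocs == 0 || hamDocs == 0) {
            throw new IllegalStateException("Train with at least one spam and one non-spam example first");
        }
        double totalDocs = spamDocs + hamDocs;
        // Work in log space so that many small probabilities do not underflow.
        double logSpam = Math.log(spamDocs / totalDocs);
        double logHam = Math.log(hamDocs / totalDocs);
        int vocabularySize = vocabulary.size();
        for (String token : tokenize(text)) {
            // Add-one smoothing keeps unseen words from zeroing out a class.
            logSpam += Math.log((spamCounts.getOrDefault(token, 0) + 1.0) / (spamTokens + vocabularySize));
            logHam += Math.log((hamCounts.getOrDefault(token, 0) + 1.0) / (hamTokens + vocabularySize));
        }
        // Convert the two log scores back into a normalized probability.
        return 1.0 / (1.0 + Math.exp(logHam - logSpam));
    }

    boolean isSpam(String text) {
        return spamProbability(text) > 0.5;
    }

    /** Lower-cases the text and splits it on anything that is not a letter or digit. */
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}
```

You would call train(text, true) for known spam and train(text, false) for known good text, then ask isSpam(newText). The 0.5 cutoff is an arbitrary choice here; raising it trades missed spam for fewer false positives.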
Probably the easiest way is to leverage an existing API for that. Akismet has bindings for Java, and it's what WordPress uses on its blogs by default. Oh, and it's free for personal use; the WordPress plugin is open source, though the service itself is hosted and needs an API key.
You could pipe it through SpamAssassin and see what the return value is.
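A rough sketch of that idea from Java, assuming SpamAssassin's spamc client is installed and the spamd daemon is running. In check-only mode (spamc -c) the exit code is 1 for spam and 0 otherwise. The class SpamAssassinCheck and its method are hypothetical names, and the one-line Subject header is only there so the input looks like a mail message.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper for illustration; requires Java 9+ for Redirect.DISCARD.
final class SpamAssassinCheck {

    /** spamc in check-only mode: exit code 1 means spam, 0 means not spam. */
    static final List<String> SPAMC_CHECK = List.of("spamc", "-c");

    /** Pipes the text to the given command and interprets its exit code. */
    static boolean isSpam(String text, List<String> command) {
        // Minimal header block followed by the text as the message body.
        String message = "Subject: check\r\n\r\n" + text + "\r\n";
        try {
            ProcessBuilder builder = new ProcessBuilder(command);
            // Only the exit code matters here, so throw away anything printed.
            builder.redirectErrorStream(true);
            builder.redirectOutput(ProcessBuilder.Redirect.DISCARD);
            Process process = builder.start();
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(message.getBytes(StandardCharsets.UTF_8));
            }
            int exitCode = process.waitFor();
            if (exitCode == 0) {
                return false;
            }
            if (exitCode == 1) {
                return true;
            }
            throw new IllegalStateException("Unexpected exit code: " + exitCode);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the spam check", e);
        }
    }
}
```

Typical use would be SpamAssassinCheck.isSpam(commentText, SpamAssassinCheck.SPAMC_CHECK). Note that spamc -c also exits with 0 when processing fails, so a "not spam" result is only as trustworthy as your spamd setup.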
Here's a wacky idea: send the text as an email to a Gmail account. Then use IMAP to see whether it ended up in the Inbox or the Spam folder.
Akismet does all that math and logic for you; I think it's the best way to avoid spam.
You only need to request an API key for your website. There's a free (voluntarily paid) plan.
A normal call through its Java API looks like this; I use commentCheck for the piece of text you're checking.
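    // AKISMET_KEY is your API key, SITE is the URL of your website
    Akismet akismet = new Akismet(AKISMET_KEY, SITE);
    return akismet.commentCheck(
        request.getRemoteAddr(),          // IP address of the submitter
        request.getHeader("User-Agent"),  // browser user agent
        request.getHeader("Referer"),     // referrer
        "",                               // permalink
        "comment",                        // comment type
        "",                               // author
        "",                               // email
        "",                               // author URL
        commentText,                      // text to check
        request.getParameterMap());       // any other request data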
If this call returns true, the text is considered spam.