Application/Script for checking "unseriousness" possible?
I have a classifieds website where users may sell/buy just about anything...
My issue, which costs the company a lot of money and time, is that ALL classifieds must be reviewed by an actual person (an employee) before being posted on the website.
So when you create a new classified entry, you get a message like "Your ad will be reviewed against our policy and then posted within two hours".
So a person must actually check for curses, discrimination, unseriousness, etc.
My question:
Do you think it is possible to write PHP code that checks all these things instead of hiring people to do it? For instance, how has eBay solved this?
Blacklisting words is easy, and so is checking for duplicate entries, but what about discrimination and "unseriousness"?
I don't think you can automate it 100%, but you can make the reviewers' job easier.
You could create an app that assigns a "rating" to each classified.
The more serious a classified appears, the higher the score. For each possible infraction (a word on the bad-word list, a message that is too short, bad grammar, bad formatting, sloppy typing) you lower the score.
You could then implement a rule that posts with too low a score get rejected automatically.
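As a rough illustration of the rating idea, here is a minimal sketch in Python (the same logic ports directly to PHP). The rule names, penalty values, starting score, rejection threshold and word list are all made-up assumptions, and the "shouting" and "no punctuation" checks are only crude stand-ins for bad typing and bad formatting.

```python
import re

# All values below are illustrative assumptions, not tuned numbers.
BLACKLIST = {"damn", "scam"}
PENALTIES = {
    "blacklisted_word": 2.0,
    "too_short": 1.0,
    "shouting": 0.5,
    "no_punctuation": 0.5,
}
START_SCORE = 5.0
REJECT_BELOW = 2.0


def score_ad(text):
    """Return (score, triggered_rules) for one classified ad."""
    words = re.findall(r"[A-Za-z']+", text)
    triggered = []

    if any(w.lower() in BLACKLIST for w in words):
        triggered.append("blacklisted_word")
    if len(words) < 5:
        triggered.append("too_short")

    # More than 60% upper-case letters is treated as "shouting".
    letters = [c for c in text if c.isalpha()]
    if letters and sum(c.isupper() for c in letters) / len(letters) > 0.6:
        triggered.append("shouting")

    if not re.search(r"[.!?]", text):
        triggered.append("no_punctuation")

    score = START_SCORE - sum(PENALTIES[rule] for rule in triggered)
    return score, triggered


def auto_reject(text):
    """True if the ad scores below the automatic rejection threshold."""
    return score_ad(text)[0] < REJECT_BELOW
```

Returning the list of triggered rules alongside the score means the same function can also tell a human reviewer why a post scored low.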
You could give the reviewers a queue that presents the higher-scoring items first (do take the posting date into account too, so that low-scoring posts are sure to be evaluated, just later). This will improve their efficiency.
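One way to combine score and posting date is to add an age bonus to the score, so that an old low-scoring post eventually outranks a fresh high-scoring one. This is only a sketch: the `AGE_BONUS_PER_HOUR` weight and the `(ad_id, score, posted_at)` tuple layout are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Assumed weight: each hour of waiting is worth one score point.
AGE_BONUS_PER_HOUR = 1.0


def priority(score, posted_at, now):
    """Higher means 'review sooner': the score plus a bonus for waiting time."""
    age_hours = (now - posted_at).total_seconds() / 3600
    return score + AGE_BONUS_PER_HOUR * age_hours


def review_order(ads, now):
    """ads is a list of (ad_id, score, posted_at); returns ad ids, most urgent first."""
    ranked = sorted(ads, key=lambda ad: priority(ad[1], ad[2], now), reverse=True)
    return [ad_id for ad_id, _score, _posted_at in ranked]


now = datetime(2024, 1, 1, 12, 0)
ads = [
    ("fresh_good", 5.0, now - timedelta(minutes=10)),
    ("fresh_bad", 1.0, now - timedelta(minutes=10)),
    ("old_bad", 1.0, now - timedelta(hours=6)),
]
order = review_order(ads, now)
```

With these numbers the six-hour-old low scorer comes first, then the fresh high scorer, then the fresh low scorer.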
Show the reviewers which rules a post violated ("this post probably has bad grammar", highlighted blacklisted words, ...). Maybe allow them to add bad words (with a penalty modifier, e.g. -0.5).
But look at how professional sites do it: there is a "flag" button under each post, so the community helps you out. Users flag a post, and a moderator goes to check it.
My suggestion: do away with manually checking the posts. Once you regulate content, you become liable for all the content on the website. Accepting posts without any form of moderation will eliminate a great deal of liability. To maintain quality and keep out undesirable content, though, you can let other users "flag" content as inappropriate. That way you manually review only the subset of content that has been flagged, rather than all of it.
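A flagging mechanism can be very small. The sketch below counts distinct users per post and hands a post to the moderators once it reaches a threshold. The class name `FlagTracker` and the threshold of 3 are hypothetical choices, not something any particular site is known to use.

```python
from collections import defaultdict

FLAG_THRESHOLD = 3  # assumed number of distinct users needed to trigger a review


class FlagTracker:
    def __init__(self, threshold=FLAG_THRESHOLD):
        self.threshold = threshold
        self.flags = defaultdict(set)  # ad_id -> set of user ids who flagged it

    def flag(self, ad_id, user_id):
        """Record a flag; return True only when this flag reaches the threshold."""
        before = len(self.flags[ad_id])
        self.flags[ad_id].add(user_id)  # a repeated flag by the same user is ignored
        after = len(self.flags[ad_id])
        return before < self.threshold <= after

    def needs_review(self):
        """Ids of all ads that have reached the threshold, sorted."""
        return sorted(ad for ad, users in self.flags.items()
                      if len(users) >= self.threshold)
```

Counting distinct users rather than raw clicks keeps a single annoyed visitor from pushing a legitimate post into the review queue.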
Now, to answer your actual question: you can automate filtering using machine learning techniques. Do not expect this automated filtering to be 100% accurate, however. You will have to experiment with different types of features and different ML algorithms, but I'd aim for accuracy in the 90% range and expect at least 80%. That said, I wouldn't even bother with that, because unless you have a very low false positive rate, you will annoy people by blocking legitimate posts, and allowing users to flag inappropriate content is typically sufficient. You could also provide a way for users to rate each other's posts. Crowdsourcing is a fairly effective technique for this kind of thing.
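To give an idea of what the machine learning route looks like, here is a toy naive Bayes text classifier using only the Python standard library. It is one possible algorithm among many, the four training ads and the labels "ok" and "bad" are invented for illustration, and a real system would need thousands of labeled posts and proper evaluation.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training posts
        self.vocab = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        counts = self.word_counts.setdefault(label, Counter())
        for word in tokens(text):
            counts[word] += 1
            self.vocab.add(word)

    def predict(self, text):
        """Return the label with the highest log-probability for this text."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_logp = None, -math.inf
        for label, counts in self.word_counts.items():
            logp = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in tokens(text):
                # Laplace (add-one) smoothing so unseen words do not zero out a label.
                logp += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Invented training data, far too small for real use.
clf = NaiveBayes()
clf.train("Selling a used mountain bike in good condition", "ok")
clf.train("Two bedroom apartment for rent near the station", "ok")
clf.train("lol free money click here", "bad")
clf.train("selling my annoying brother lol", "bad")
```

Calling `clf.predict(...)` on a new post returns one of the two labels. Measuring how often that label is wrong on held-out posts is what tells you whether the false positive rate is low enough to be usable.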
One last thing: if you still want people to review posts manually, or if you want a large number of posts rated by hand so that you have a sufficiently large labeled dataset for training your machine learning algorithm, you might be interested in Mechanical Turk, which lets you put a very large number of people to work really, really cheaply.
No.
A computer cannot understand free text in a reliable way - you need a human eye.
There are tools to filter and identify spam (e.g. Akismet), but not jokes, hate-speech, off-topic posts and such.
 