Whitelist for player to player chat in kids game
We're developing an educational multiplayer game for kids and want to allow players to chat with each other using a whitelist system. When using whitelist chat, players will be able to type only words which appear in the whitelist.
We're aware of the limitations of whitelists in 开发者_开发技巧general, but we think a whitelist chat system is something that would allow our players to express themselves better in the game, while allowing a higher level of security than moderated or blacklist chat.
While the system is easy enough to implement, we haven't been able to find a sample whitelist of "safe" words online. Does anyone know of where we can find such a list, preferably with a license that allows us to use it in a commercial project?
Thanks.
I do not believe that a simple whitelist of words will cut it. There are quite a few euphemisms for a lot of stuff out there, that a whitelist would never block (e.g. "he is growing like a weed" is fine, "he is growing weed" is NOT). And let's not mention the basic "would you like to meet?" which would be fine if the meeting were to happen in-game, but very dangerous if it were to happen out of it. Then there is also the issue of blocking rare, foreign or mistyped words, that might make your chat system frustrating enough that it would not be used.
In my opinion, there is absolutely no way you could ever match the security offered by an active and competent human moderator. Of course, depending on the volume of chat traffic and any real-time requirements there are quite a few practical issues with using humans for this. Considering that your application is targeted at children, however, human moderation might be quite acceptable, despite its much higer cost.
A second choise, but one very far from the abilities of human moderation, is to use some statistical filter such as Bogofilter, which will happily sort arbitrary text if you train it well. A blacklist would also help to immediately cut down messages with words that little kids should not (but usually do) know. You would also need a bunch of filters that would cut down messages with stuff like telephone numbers, email and street addresses and web links.
Perhaps the method with the best effectiveness/cost ratio would be to use human moderators assisted by multiple statistical filters to better make use of their time. Keep in mind, however, that if there are malicious users (i.e. anything else than same-age kids in a classroom) there is no way to make sure that nothing questionable or dangerous ever goes through.
You can try the standard unix dictionary. /usr/share/dict/words. But you'll have to modify it to remove the naughty words.
http://en.wikipedia.org/wiki/Words_%28Unix%29
http://www.openwall.com/wordlists/
While this doesn't exactly answer your question, Runescape uses a white list of phrases, rather than words.
The implementation in Runescape is awkward, because there are so many phrases to choose from. You have to go through 3 or 4 menus sometimes to get to the phrase you want.
If you can come up with a better organization of phrases, then this might work for you.
精彩评论