How to "Validate" Human Names in CakePHP?

I have a PHP script that is supposed to check for "valid" human names. It recently broke on a name containing a space, so we added spaces to our validator.

Rather than doing this, is there a way to add a blacklist to CakePHP's validator to block all "invalid" characters, rather than allowing "valid" ones?

NOTE: I know how to do this in PHP (generally), but CakePHP's validator syntax is different.


I agree with the other comments that validating a name is probably a bad idea.

For virtually everything you can think of to validate, there will be someone with a name that breaks your rule. If you're happy with the idea that you're going to be blocking real people from entering their names, then you can validate it as much as you like. But the more validation rules you put in, the more likely you are to find a real person who can't sign in.

Here's a link to a page which describes some of the obvious (and not so obvious) things which people try to validate, which can trip them up:

http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

If you want to allow anybody onto your site, then the best you can really hope for is to force a maximum field length to fit the space you've allocated in your database. Even then you're going to annoy someone.
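A length-only check might look like the sketch below, written in the same `var $validate` style used elsewhere in this thread. The model name `User` and the limit of 255 are assumptions for illustration; use whatever width your database column actually has.

```php
<?php
// Hypothetical model; only the length of the name is checked, not its characters.
class User extends AppModel {
    var $validate = array(
        'name' => array(
            // 255 is an assumed column width, not a recommendation.
            'rule' => array('maxLength', 255),
            'message' => 'Sorry, this field can hold at most 255 characters'
        )
    );
}
```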


There is no way to "validate" a name. How would you stop someone who really is called:

Robert'); DROP TABLE Students; --

http://xkcd.com/327/


EDIT: What I really mean is that people in some countries have names written in other scripts (say Japanese, Chinese, or Korean), and a name may even contain symbols. How would you feel if a site told you your name was "INVALID" when you entered your real name?


Don't make any assumptions about how a name may be spelled. Accept any input (yes, any), and do proper escaping when displaying it, so you don't get XSS vulnerabilities.

I'd suggest you do this escaping in the model on afterFind(), so you don't forget it somewhere. Keep the original data in a separate field of the model, like ['unescaped_name'], if you need to access the plain data.
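As a rough sketch of the escape-on-output idea in plain PHP: `escape_name` is a hypothetical helper name invented for this example. In a CakePHP view, the built-in `h()` function does much the same job.

```php
<?php
// Hypothetical helper: escape a name for safe output in HTML.
// The stored value is left untouched; only the displayed copy is escaped.
function escape_name($name) {
    return htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
}

$raw = "<b>O'Brien</b>";
echo escape_name($raw), "\n";     // markup characters become HTML entities
echo escape_name("山田太郎"), "\n"; // non-Latin names pass through unchanged
```

The point of this design is that the name is accepted as entered and only made safe at the moment it is displayed.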


Custom Regular Expression Validation

var $validate = array(
    'name' => array(
        'rule' => '/^[^%#\/*@!...other characters you don\'t want...]+$/',
        'message' => 'The name contains characters that are not allowed'
    )
);

This is too naïve an approach though, as you would have to blacklist almost the entire range of Unicode characters. You can pretty much only whitelist basic Latin characters plus common quirks like spaces and apostrophes. Any more than that and you'll fight an uphill battle you can't win. You may be able to create a reasonably good algorithm over time, but it will never be 100% foolproof. So either restrict your users to basic Latin names (and hope not to alienate your audience) or skip the validation entirely*.

* Or invest a few years into developing an algorithm covering <100% of human names, working 99.9% of the time.
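To make the trade-off concrete, here is a minimal sketch of such a basic Latin whitelist in plain PHP. The pattern, the constant `BASIC_LATIN_NAME`, and the function `is_basic_latin_name` are all assumptions made up for this example, not anything CakePHP ships. The same pattern string could be used as the 'rule' value in `$validate`.

```php
<?php
// Hypothetical whitelist: ASCII letters, spaces, apostrophes and hyphens only.
// The D modifier stops "$" from matching before a trailing newline.
define('BASIC_LATIN_NAME', '/^[A-Za-z][A-Za-z \'-]*$/D');

function is_basic_latin_name($name) {
    return preg_match(BASIC_LATIN_NAME, $name) === 1;
}

var_dump(is_basic_latin_name("Mary Ann O'Brien")); // accepted
var_dump(is_basic_latin_name("José"));             // rejected: accented letter
var_dump(is_basic_latin_name("山田太郎"));          // rejected: non-Latin script
```

The last two calls show the cost: perfectly real names are turned away as soon as they step outside ASCII.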
