Is FILTER_SANITIZE_EMAIL pointless if already using FILTER_VALIDATE_EMAIL?

2023-04-01 22:14 问答作者：

I am just creating a registration form, and I am looking only to insert valid and safe emails into the database.

Several sites (including w3schools) recommend running FILTER_SANITIZE_EMAIL before running FILTER_VALIDATE_EMAIL to be safe; however, this could change the submitted email from an invalid into a valid email, which could not be what the user wanted, for example:

The user has the email address jeff!@gmail.com, but accidentally inserts jeff"@gmail.com.

FILTER_SANITIZE_EMAIL would remove the " making the email jeff@gmail.com which FILTER_VALIDATE_EMAIL would say is valid even though it's not the user's actual email address.

To avoid this problem, I plan only to run FILTER_VALIDATE_EMAIL. (assuming I don't intend to output/process any emails declared invalid)

This will tell me whether or not the email is valid. If it is then there should be no need to pass it through FILTER_SANITIZE_EMAIL because any illegal/unsafe characters, would've already caused the email to be returned invalid, correct?

I also don't know of any email approved as valid by FILTER_VALIDATE_EMAIL that could be used for injection/xss due to the fact that white spaces, parentheses () and semicolons would invalidate the email. Or am I wrong?

(note: I wi开发者_如何学运维ll be using prepared statements to insert the data in addition to this, I just wanted to clear this up)

Here's how to insert only valid emails.

<?php
$original_email = 'jeff"@gmail.com';

$clean_email = filter_var($original_email,FILTER_SANITIZE_EMAIL);

if ($original_email == $clean_email && filter_var($original_email,FILTER_VALIDATE_EMAIL)){
   // now you know the original email was safe to insert.
   // insert into database code go here. 
}

FILTER_VALIDATE_EMAIL and FILTER_SANITIZE_EMAIL are both valuable functions and have different uses.

Validation is testing if the email is a valid format. Sanitizing is to clean the bad characters out of the email.

<?php
$email = "test@hostname.com"; 
$clean_email = "";

if (filter_var($email,FILTER_VALIDATE_EMAIL)){
    $clean_email =  filter_var($email,FILTER_SANITIZE_EMAIL);
} 

// another implementation by request. Which is the way I would suggest
// using the filters. Clean the content and then make sure it's valid 
// before you use it. 

$email = "test@hostname.com"; 
$clean_email = filter_var($email,FILTER_SANITIZE_EMAIL);

if (filter_var($clean_email,FILTER_VALIDATE_EMAIL)){
    // email is valid and ready for use
} else {
    // email is invalid and should be rejected
}

PHP is open source, so these questions are easily answered by just using it.

Source for FILTER_SANITIZE_EMAIL:

/* {{{ php_filter_email */
#define SAFE        "$-_.+"
#define EXTRA       "!*'(),"
#define NATIONAL    "{}|\\^~[]`"
#define PUNCTUATION "<>#%\""
#define RESERVED    ";/?:@&="

void php_filter_email(PHP_INPUT_FILTER_PARAM_DECL)
{
    /* Check section 6 of rfc 822 http://www.faqs.org/rfcs/rfc822.html */
    const unsigned char allowed_list[] = LOWALPHA HIALPHA DIGIT "!#$%&'*+-=?^_`{|}~@.[]";
    filter_map     map;

    filter_map_init(&map);
    filter_map_update(&map, 1, allowed_list);
    filter_map_apply(value, &map);
}

Source for FILTER_VALIDATE_EMAIL:

void php_filter_validate_email(PHP_INPUT_FILTER_PARAM_DECL) /* {{{ */
{
const char regexp[] = "/^(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){255,})(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){65,}@)(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22))(?:\\.(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-+[a-z0-9]+)*\\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-+[a-z0-9]+)*)|(?:\\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\\]))$/iD";

pcre       *re = NULL;
pcre_extra *pcre_extra = NULL;
int preg_options = 0;
int         ovector[150]; /* Needs to be a multiple of 3 */
int         matches;


/* The maximum length of an e-mail address is 320 octets, per RFC 2821. */
if (Z_STRLEN_P(value) > 320) {
    RETURN_VALIDATION_FAILED
}

re = pcre_get_compiled_regex((char *)regexp, &pcre_extra, &preg_options TSRMLS_CC);
if (!re) {
    RETURN_VALIDATION_FAILED
}
matches = pcre_exec(re, NULL, Z_STRVAL_P(value), Z_STRLEN_P(value), 0, 0, ovector, 3);

/* 0 means that the vector is too small to hold all the captured substring offsets */
if (matches < 0) {
    RETURN_VALIDATION_FAILED
}

}

I read the same article and thought the same thing: Simply changing an invalid variable is not good enough. We need to actually tell the user that there was a problem, instead of just ignoring it. The solution, I think, is to compare the original to the sanitized version. I.e. to use the w3schools example, just add:

$cleanfield = filter_var($field, FILTER_SANITIZE_EMAIL);

if ($cleanfield != $field){
    return FALSE;
}

The "proper" way of doing this is asking for the user's email two times (which is common/good practice). But to answer your question, FILTER_SANITIZE_EMAIL is not pointless. It's a filter that sanitizes emails and it does its job well.

You need to understand that a filter that validates either returns true or false whereas a filter that sanitizes actually modifies the given variable. The two do not serve the same purpose.

always use validation filters early at moment of input, while sanitization is better used late in output as it clean the value before it reach the user

继续阅读：php xss

Is FILTER_SANITIZE_EMAIL pointless if already using FILTER_VALIDATE_EMAIL?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？