I need a PHP regular expression to validate string format of 5 digits, one comma

I have a huge input box on a PHP webpage. This input should only take 5-digit strings separated by commas:

00100,00247,90277,97030,00657

Notice the last one has no comma at the end.

Is there a regular expression that can do this? Since the input box is very large and can take 100+ of these items, I want to validate it on the PHP server side before the database is queried, and thus avoid any SQL injection attempts.

The query should only run if the input consists of 5-digit numbers, each followed by a comma except for the last one. These are a state's public water system IDs, by the way.


I believe this will get the result you're looking for, though explode may be the better option.

/^(?:\d{5},)*\d{5}$/

This will only match 1 or more 5-digit numbers that are comma delimited with no spaces.
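
As a minimal sketch of how that pattern could be wired into a check, the helper below wraps it in preg_match. The function name isValidIdList is made up for illustration. One assumption differs from the pattern above: it uses \z instead of $, because in PCRE a plain $ also matches just before a single trailing newline.

```php
<?php
// Hypothetical helper: true only for comma-separated 5-digit groups and nothing else.
function isValidIdList(string $value): bool
{
    // \z anchors at the very end of the string; a plain $ would also
    // accept one trailing "\n".
    return preg_match('/^(?:\d{5},)*\d{5}\z/', $value) === 1;
}

var_dump(isValidIdList('00100,00247,90277'));  // bool(true)
var_dump(isValidIdList('00100,00247,'));       // bool(false), trailing comma
var_dump(isValidIdList("00100,00247\n"));      // bool(false), trailing newline
```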


Since this is user submitted data, your validation should be more flexible. What if the user accidentally puts a space after one of the commas? Or a line break gets inserted?

I realize you are looking for a regex solution but may I suggest using explode to create an array and apply a rule to each element. Having them separated into elements allows more flexibility when validating and storing:

$nums = explode(',', '00100,00247,90277,97030,00657');
foreach ($nums as $num) {
    if (!preg_match('/^\d{5}$/', trim($num))) {
        // error!
    }
}
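
To illustrate the flexibility this buys, here is a rough sketch that sorts the trimmed items into accepted and rejected lists, so the rejected ones can be reported back to the user. The function name splitIdList and the shape of the returned array are assumptions, not part of the answer above.

```php
<?php
// Hypothetical helper: splits on commas, trims each item, and separates
// well-formed 5-digit IDs from everything else.
function splitIdList(string $input): array
{
    $valid = [];
    $invalid = [];
    foreach (explode(',', $input) as $item) {
        $item = trim($item); // tolerate spaces and line breaks around items
        if (preg_match('/^\d{5}\z/', $item) === 1) {
            $valid[] = $item;
        } else {
            $invalid[] = $item;
        }
    }
    return ['valid' => $valid, 'invalid' => $invalid];
}

$result = splitIdList("00100, 00247,\n90277,9703,abcde");
print_r($result['valid']);   // 00100, 00247, 90277
print_r($result['invalid']); // 9703, abcde
```

Only run the query when the invalid list is empty.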


I'd explode it and validate each string individually:

$input = '00100,00247,90277,97030,00657';
$input_array = explode(',', $input);
$is_valid = true;

foreach ($input_array as $number) {
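  // preg_match() returns 1 on a match and 0 otherwise (not a character count),
  // so test the whole trimmed item against the 5-digit pattern
  if (preg_match('/^\d{5}$/', trim($number)) !== 1) {
    $is_valid = false;
  }
}

var_dump($is_valid);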


I think you rather need str_getcsv:

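// str_getcsv() parses a string; to read rows from a file handle, use fgetcsv() instead.
// $input is assumed to hold the submitted string.
$row = str_getcsv($input, ',', '"', '\\');
// $row is an array containing your digits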


Simple. This regex matches a value consisting of one or more comma-separated 5-digit numbers:

if (preg_match('/^\d{5}(\s*,\s*\d{5})*$/', $value)) {
    // Good value
}

It also allows whitespace around the commas.
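
Once a value passes the whitespace-tolerant pattern, the individual IDs still have to be pulled out of it. One possible way, sketched here with a made-up helper name extractIds, is to collect every run of five digits with preg_match_all. This is safe only because the format has already been confirmed.

```php
<?php
// Hypothetical helper: returns the list of IDs, or null if the value is malformed.
function extractIds(string $value): ?array
{
    if (preg_match('/^\d{5}(\s*,\s*\d{5})*$/', $value) !== 1) {
        return null;
    }
    // The overall format is already confirmed, so each run of 5 digits is one ID.
    preg_match_all('/\d{5}/', $value, $matches);
    return $matches[0];
}

print_r(extractIds('00100 , 00247,90277')); // 00100, 00247, 90277
var_dump(extractIds('00100 00247'));        // NULL
```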


This might work:

/^\d{5}(?:,\d{5})*$/

Edit 1: I noticed ridgerunner has the same answer, so disregard this.


Edit 2: some notes on performance.

Failure analysis

What backtracking gives back on failure:

^\d{5}(?:,\d{5})*$ gives back ,\d{5}
^(?:\d{5},)*\d{5}$ gives back \d{5},

Checks repeated after each give-back:
(After backtracking gives something back, the checks are the ones to the right of the sub-expression that gave back)

^\d{5}(?:,\d{5})*$ checks for $
^(?:\d{5},)*\d{5}$ checks for \d{5}$

Winner: ^\d{5}(?:,\d{5})*$

Non-backtracking regexes (using the possessive quantifier *+):

^\d{5}(?:,\d{5})*+$ gives nothing back, fails immediately
^(?:\d{5},)*+\d{5}$ gives nothing back, fails immediately

Benchmarks

Using a string of 50 blocks of \d{5},.
The sample string is matched against each regex in a loop of 100,000 times.
Failure was induced at the end of the string, and removed for the success test.

Success:
All took 1 second to complete a successful run.

Failure, backtracking:
^\d{5}(?:,\d{5})*$ took 1.2 seconds (best)
^(?:\d{5},)*\d{5}$ took 1.6 seconds

Failure, non-backtracking:
^\d{5}(?:,\d{5})*+$ took 0.9 seconds
^(?:\d{5},)*+\d{5}$ took 0.9 seconds

Conclusions

Backtracking - Put the smallest post-backtracking check
after the backtracking sub-expression. In this case, the
smallest is $.
In general, put the required expressions ahead of the optional ones.
Best: ^\d{5}(?:,\d{5})*$

NON-Backtracking - It doesn't matter.
^\d{5}(?:,\d{5})*+$ or ^(?:\d{5},)*+\d{5}$
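
For anyone who wants to repeat the comparison in PHP, a rough timing sketch follows. The subject size and loop count follow the description above. The exact test strings are assumptions: the failure case is induced here with a trailing comma. Absolute timings will vary with the PHP and PCRE build and with whether PCRE JIT is enabled, so treat the numbers as relative only.

```php
<?php
// Sketch only: 50 five-digit blocks; the failing subject has a trailing comma (assumed).
$valid   = implode(',', array_fill(0, 50, '12345'));
$invalid = $valid . ',';

$patterns = [
    '/^\d{5}(?:,\d{5})*$/',   // backtracking, required part first
    '/^(?:\d{5},)*\d{5}$/',   // backtracking, optional part first
    '/^\d{5}(?:,\d{5})*+$/',  // possessive
    '/^(?:\d{5},)*+\d{5}$/',  // possessive
];

$results = [];
foreach ($patterns as $pattern) {
    $start = microtime(true);
    for ($i = 0; $i < 100000; $i++) {
        $failed = preg_match($pattern, $invalid);
    }
    $elapsed = microtime(true) - $start;

    // Record the match results so the patterns can be checked for agreement.
    $results[$pattern] = [preg_match($pattern, $valid), $failed];
    printf("%-26s failure run: %.3f s\n", $pattern, $elapsed);
}
```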
