Regex help - how to extract a year from a string

I have a year listed in my string

$s = "Acquired by the University in 1988";

In practice, the year could be anywhere in this single-line string. How do I extract it using a regex? I tried \d and that didn't work; it just came up with an error.

Jason

I'm using preg_match in LAMP 5.2


You need a regex that matches four digits, and those four digits must form a whole word (for example, a string of 10 digits contains four digits but is not a year). The regex therefore needs word boundaries, like so:

if (preg_match('/\b\d{4}\b/', $s, $matches)) {
    $year = $matches[0];
}
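
To see what the word boundaries buy you, here is a small sketch that runs the same pattern over a few sample strings. The helper name extract_year and the sample inputs are made up for illustration; they are not part of the question.

```php
<?php
// Hypothetical helper: returns the first standalone four-digit number, or null.
function extract_year($s)
{
    // \b on both sides means the four digits cannot be part of a longer
    // run of digits, letters, or underscores.
    if (preg_match('/\b\d{4}\b/', $s, $matches)) {
        return $matches[0];
    }
    return null;
}

var_dump(extract_year("Acquired by the University in 1988")); // string(4) "1988"
var_dump(extract_year("Call 5551234567 for details"));        // NULL
var_dump(extract_year("1988 was the year of acquisition"));   // string(4) "1988"
```

Note that \b treats letters and underscores as word characters too, so something like "x1988" or "_1988_" would not match.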


Try this code:

<?php
  $s = "Acquired by the University in 1988 year.";
  $yr = preg_replace('/^[^\d]*(\d{4}).*$/', '\1', $s);
  var_dump($yr);
?>

OUTPUT:

string(4) "1988"

However, this regex assumes that the first digits to appear in the line are the four-digit year. If the pattern does not match, preg_replace returns the string unchanged.
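
A quick sketch of that limitation, using an invented second string that has a shorter number before the year. Because preg_replace returns the subject untouched when nothing matches, you get the whole sentence back rather than the year.

```php
<?php
$pattern = '/^[^\d]*(\d{4}).*$/';

// The first digits in the string are the year, so this works.
$ok = preg_replace($pattern, '\1', "Acquired by the University in 1988 year.");

// An earlier, shorter number stops [^\d]* before the year, so the pattern
// fails and preg_replace hands back the original string.
$input = "Room 12, acquired by the University in 1988";
$unchanged = preg_replace($pattern, '\1', $input);

var_dump($ok);                   // string(4) "1988"
var_dump($unchanged === $input); // bool(true)
```

If that case matters, checking the return value of preg_match and reading the capture group is probably safer than relying on the replacement.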


Well, you could use \d{4}, but that will break if there's anything else in the string with four digits.

Edit:

The problem is that, other than being four numeric characters, the year has no other identifying information (according to your requirements, it can be anywhere in the string). Based on what you've written, this is probably the best you can do, short of range checking the returned value.

$str = "the year is 1988";
preg_match('/\d{4}/', $str, $matches);

var_dump($matches);
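
The range check mentioned above could look something like this. It is only a sketch: the function name find_year_in_range and the 1800 to 2100 bounds are assumptions picked for the example, so adjust them to whatever counts as a plausible year in your data.

```php
<?php
// Hypothetical helper: first four-digit run whose value lies in [$min, $max].
function find_year_in_range($str, $min, $max)
{
    if (preg_match_all('/\d{4}/', $str, $matches)) {
        // $matches[0] holds every full match, in order of appearance.
        foreach ($matches[0] as $candidate) {
            $value = (int) $candidate;
            if ($value >= $min && $value <= $max) {
                return $value;
            }
        }
    }
    return null;
}

var_dump(find_year_in_range("Lot 0042, the year is 1988", 1800, 2100)); // int(1988)
var_dump(find_year_in_range("Member ID 45678", 1800, 2100));            // NULL
```

The range check narrows things down but does not remove every false positive: a longer number such as 19885 still yields 1988, because \d{4} on its own does not look at the neighbouring digits.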


/(^|\s)(\d{4})(\s|$)/gm

Matches

Acquired by the University in 1988
The 1945 vintage was superb
1492 columbus sailed the ocean blue

Ignores

There were nearly 10000 people there!
Member ID 45678
Phone Number 951-555-2563

See it in action at http://refiddle.com/10k
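
Since the question is about preg_match, note that PHP's PCRE functions have no g modifier; preg_match_all does the "find all" part, and the m modifier can stay. A rough PHP equivalent, run against the sample lines above (the variable names are just for this sketch):

```php
<?php
$text = implode("\n", array(
    "Acquired by the University in 1988",
    "The 1945 vintage was superb",
    "1492 columbus sailed the ocean blue",
    "There were nearly 10000 people there!",
    "Member ID 45678",
    "Phone Number 951-555-2563",
));

// m makes ^ and $ match at line starts and ends; group 2 is the year itself.
preg_match_all('/(^|\s)(\d{4})(\s|$)/m', $text, $matches);
$years = $matches[2];

print_r($years); // 1988, 1945, 1492
```

One thing to be aware of: the trailing (\s|$) consumes the whitespace, so in "1988 1989" the second year is skipped because its leading space has already been used up by the first match.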


preg_match('/(\d{4})/', $string, $matches);


For a basic year match, assuming only one year

$year = false;
if(preg_match("/\d{4}/", $string, $match)) {
  $year = $match[0];
}

If you need to handle the possibility of multiple years in the same string

$years = array();
if(preg_match_all("/\d{4}/", $string, $matches, PREG_SET_ORDER)) {
  foreach($matches as $match) {
    // collect every match instead of overwriting a single variable
    $years[] = $match[0];
  }
}


/(?<!\d)\d{4}(?!\d)/ will match only 4-digit numbers that do not have digits before or after them.

(?<!\d) and (?!\d) are look-behind and look-ahead (respectively) assertions that ensure that a \d does not occur before or after the main part of the RE.

It may in practice be more sensible to use \b instead of the assertions; this will ensure that the beginning and end of the year occur at a "word boundary". So then "1337hx0r" would be appropriately ignored.
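
Here is a small side-by-side sketch of the two approaches. The helper first_match and the test strings other than "1337hx0r" are invented for the comparison.

```php
<?php
$lookaround = '/(?<!\d)\d{4}(?!\d)/';
$boundary   = '/\b\d{4}\b/';

// Hypothetical helper: first match of $pattern in $s, or null.
function first_match($pattern, $s)
{
    return preg_match($pattern, $s, $m) ? $m[0] : null;
}

// The look-around version only cares about neighbouring digits,
// so the letters after "1337" do not stop it.
var_dump(first_match($lookaround, "1337hx0r")); // string(4) "1337"
var_dump(first_match($boundary, "1337hx0r"));   // NULL

// Both agree on an ordinary sentence and on a longer number.
var_dump(first_match($lookaround, "in 1988.")); // string(4) "1988"
var_dump(first_match($boundary, "in 1988."));   // string(4) "1988"
var_dump(first_match($lookaround, "ID 45678")); // NULL
var_dump(first_match($boundary, "ID 45678"));   // NULL
```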

If you are only looking for years within the past century or so, you could use

/\b(19|20)\d{2}\b/


Also, if your string is something like this:

$date = "20044Q";

You can use the code below to extract the year from such a string.

preg_match('/(?:(?:19|20)[0-9]{2})/', $date, $matches);
echo $matches[0];
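
For comparison, a sketch of how this unanchored pattern behaves next to a word-boundary variant of the same 19xx/20xx idea. The helper year_or_null and the "Order 119885" input are made up for the example.

```php
<?php
$loose  = '/(?:19|20)[0-9]{2}/';   // unanchored, as in the snippet above
$strict = '/\b(?:19|20)\d{2}\b/';  // same idea with word boundaries

// Hypothetical helper: first match of $pattern in $s, or null.
function year_or_null($pattern, $s)
{
    return preg_match($pattern, $s, $m) ? $m[0] : null;
}

// The unanchored pattern pulls a year out of a fused token...
var_dump(year_or_null($loose, "20044Q"));        // string(4) "2004"
var_dump(year_or_null($strict, "20044Q"));       // NULL

// ...but it will also find a "year" inside any longer number.
var_dump(year_or_null($loose, "Order 119885"));  // string(4) "1988"
var_dump(year_or_null($strict, "Order 119885")); // NULL
```

Which one you want depends on the data: the unanchored form suits strings like "20044Q", while the bounded form is the safer choice when unrelated numbers can appear.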
