
How to solve €25.99 vs 25,99€ preg_match problem?

If I have these strings:

$string1 = "This book costs €25.99 in our shop.";

and on the other hand

$string2 = "This book costs 25,99€ in our shop.";

How do I get the "€25.99" or "25,99€" using preg_match? What would the code look like?

Please note that there are two ways of writing the euro symbol. The correct way in the EU is to write the symbol after the number, like 25,99€, with a comma as the decimal separator. However, a lot of people in the US stick to the dollar convention (€25.99), with a dot as the decimal separator.

How do I check for both cases and get the value with its symbol in the cleanest and most efficient way?


Here's the raw regex: €\d+(?:[,.]\d+)?|\d+(?:[,.]\d+)?€

preg_match("/€\d+(?:[,.]\d+)?|\d+(?:[,.]\d+)?€/", $string1, $matches);

If you want to consider optional spaces between euro and the value, use this:

preg_match("/€ ?\d+(?:[,.]\d+)?|\d+(?:[,.]\d+)? ?€/", $string1, $matches);
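To see both forms go through the same call, here is a minimal sketch. The helper name findEuroPrice is made up for illustration, and the u modifier is my addition; it assumes the source file and the input are UTF-8.

```php
<?php
// Hypothetical helper (not from the answer): returns the first euro amount
// found in $text, symbol included, or null when nothing matches.
function findEuroPrice(string $text): ?string
{
    // Pattern from the answer above, with an optional space next to the symbol.
    // The u modifier is an assumption: it treats pattern and subject as UTF-8.
    $pattern = '/€ ?\d+(?:[,.]\d+)?|\d+(?:[,.]\d+)? ?€/u';

    return preg_match($pattern, $text, $matches) === 1 ? $matches[0] : null;
}

var_dump(findEuroPrice("This book costs €25.99 in our shop."));  // "€25.99"
var_dump(findEuroPrice("This book costs 25,99€ in our shop."));  // "25,99€"
var_dump(findEuroPrice("This book costs 25,99 € in our shop.")); // "25,99 €"
var_dump(findEuroPrice("This book is free."));                   // NULL
```

$matches[0] holds the whole match, so no capturing group is needed to get the value together with its symbol.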


agent-j's pattern is on the right track, but I would do something slightly more restrictive:

/€\d+(?:[.,]\d{2})?|\d+(?:[.,]\d{2})?€/

The only difference is that the decimal part is limited to 2 places, if it exists. I don't think you want to allow something like 99,999€, especially since that could mean "99 thousand, 999 euros" if written in the American style.
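One thing worth checking: preg_match is unanchored, so limiting the decimals to two places does not reject a three-digit tail outright. The match just starts later or stops earlier. The sketch below shows what comes back. The $guarded variant and the firstMatch helper are my own assumptions, not part of the answer; the guards are one possible way to refuse a match that begins or ends in the middle of a number.

```php
<?php
// Pattern from the answer, with the u modifier added (assumes UTF-8 input).
$strict = '/€\d+(?:[.,]\d{2})?|\d+(?:[.,]\d{2})?€/u';

// Hypothetical guarded variant: the lookahead and lookbehinds stop the match
// from ending or starting inside a longer run of digits and separators.
$guarded = '/€\d+(?:[.,]\d{2})?(?![.,]?\d)|(?<!\d[.,])(?<!\d)\d+(?:[.,]\d{2})?€/u';

// Hypothetical helper: first full match or null.
function firstMatch(string $pattern, string $text): ?string
{
    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

var_dump(firstMatch($strict, "costs 25,99€ here"));   // "25,99€"
var_dump(firstMatch($strict, "costs 99,999€ here"));  // "999€"   (partial match)
var_dump(firstMatch($strict, "costs €99.999 here"));  // "€99.99" (truncated)
var_dump(firstMatch($guarded, "costs 99,999€ here")); // NULL
var_dump(firstMatch($guarded, "costs €99.999 here")); // NULL
var_dump(firstMatch($guarded, "costs €25.99."));      // "€25.99"
```

Whether a partial match or no match at all is the better outcome depends on what you do with the value afterwards.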

I think what you're getting at with "cleanest and most efficient" is that the pattern above looks awkward and redundant. It's basically the \d+(?:[.,]\d{2})? portion repeated twice, with the € symbol switching sides. This feels wrong, but it isn't. You can't really get around it without bringing in just as much complexity, if not more. Even if you try to get around it with fancy lookarounds, it's going to look something like this:

/^(?=.*€)€?\d+(?:[.,]\d{2})?((?<!€.*)€)?$/

Clearly not an improvement, and PCRE would reject the unbounded lookbehind (?<!€.*) anyway. Sometimes the most obvious solution is the best one, even if it makes you feel dirty.

Note: If you want to get really crazy with it, you can try a variation (caution: untested, and I haven't done much PHP in a while):

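// Single quotes keep the backslashes literal; \g{-1} refers back to the nearest preceding group (the thousands separator), so it still works when $inner is used twice.
$inner = '(?:\d{1,3}(?:([.,])\d{3})*(?:(?!\g{-1})[.,]\d{2})?|\d*(?:[.,]\d{2})?)';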

Usage:

preg_match("/€" . $inner . "|" . $inner . "€/", $string1, $matches);

That should also accept things like 99,999.99; 999999,99; 9.999.999,99; .99; etc.
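Here is a rough way to try those inputs. It is a sketch, not a tested production pattern. Three things are my own adjustments rather than the answer's: $inner is written in single quotes with (?: and a relative backreference \g{-1}, so the "other separator" check works in both copies; the u modifier is added; and extractAmount is a hypothetical helper.

```php
<?php
// Assumed corrected form of $inner: \g{-1} points at the nearest preceding
// capturing group, i.e. the thousands separator inside the same copy.
$inner = '(?:\d{1,3}(?:([.,])\d{3})*(?:(?!\g{-1})[.,]\d{2})?|\d*(?:[.,]\d{2})?)';
$pattern = '/€' . $inner . '|' . $inner . '€/u';

// Hypothetical helper: first full match or null.
function extractAmount(string $pattern, string $text): ?string
{
    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

$samples = ["99,999.99€", "999999,99€", "9.999.999,99€", ".99€", "€99,999.99"];
foreach ($samples as $sample) {
    // Each of these comes back unchanged, i.e. the whole amount is matched.
    var_dump(extractAmount($pattern, "Price: " . $sample));
}

// Caveat: with the symbol in front, the first branch of $inner is satisfied
// after three digits, so an ungrouped number is cut short.
var_dump(extractAmount($pattern, "Price: €999999,99")); // "€999"
```

Because the second branch of $inner can match nothing at all, a lone € also counts as a match, so you may want to check that the result contains a digit.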


Check for both cases:

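/([$€]?[\d.,]+[$€]?)/u

The ? makes each [$€] optional (literally "0 or 1 of..."), so you'd have to check for the degenerate case where there's just a bare number with no currency symbol at all. The dot is in the class so that 25.99 is matched whole, and the u modifier is needed so that the multibyte € counts as a single character inside the character class.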
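That degenerate-case check could look roughly like this. It is a sketch under a few assumptions: the pattern used here includes a dot in the class and the u modifier, the function name findPricesWithSymbol is invented, and arrow functions need PHP 7.4 or later.

```php
<?php
// Hypothetical helper: collect every candidate, then keep only those that
// contain at least one digit and at least one currency symbol.
function findPricesWithSymbol(string $text): array
{
    preg_match_all('/[$€]?[\d.,]+[$€]?/u', $text, $matches);

    return array_values(array_filter(
        $matches[0],
        fn (string $m): bool => preg_match('/\d/', $m) === 1
            && preg_match('/[$€]/u', $m) === 1
    ));
}

// Raw candidates here are ".", "3,", "25,99€" and "€30"; the first two are dropped.
var_dump(findPricesWithSymbol("Vol. 3, now 25,99€ instead of €30 each"));
// → ["25,99€", "€30"]
```

Note that the loose character class also picks up stray punctuation, which is why the digit check is there in addition to the symbol check.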
