Nesting [ ] with PHP RegExp
I'm trying to ensure that a string in PHP has only letters, hyphens or apostrophes. To accomplish this I wanted to make a range of valid characters using [ ]. So my idea was to do this:
[[A-Za-z]-'] // Weird syntax highlighting here
Will this work? Is it possible to nest brackets like that? This is meant to match a single character that is either a letter, a hyphen, or an apostrophe. I may be approaching the problem naively and that's OK, I just wanted to know if putting brackets within brackets like this is legal in PHP. Thanks!
I'm assuming you're using this in one of the regular expression matching functions (like preg_match("[[A-Za-z]-']*", ...)), and in that case it's a question not of PHP syntax but of regular expression syntax. And the answer is no, you can't nest brackets like that. If you want a regular expression that matches only a letter, hyphen, or apostrophe, use [A-Za-z'-]. (The hyphen goes last so that the regex engine knows it isn't part of a range of characters like A-Z. Alternatively, you can escape the hyphen with a backslash, and then you can put it anywhere: [A-Za-z\-'].)
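A quick way to see the difference is to run both patterns through preg_match. This is only a sketch: the helper name is_name_char is made up for illustration, and the comments on the nested pattern assume PCRE's usual rule that a [ inside a character class is a literal character.

```php
<?php
// Hypothetical helper: true if $ch is exactly one letter, hyphen, or apostrophe.
function is_name_char(string $ch): bool
{
    // The hyphen is placed last in the class, so it is a literal, not a range operator.
    return preg_match('/^[A-Za-z\'-]$/', $ch) === 1;
}

var_dump(is_name_char('a'));  // bool(true)
var_dump(is_name_char('-'));  // bool(true)
var_dump(is_name_char("'"));  // bool(true)
var_dump(is_name_char('3'));  // bool(false)

// The "nested" attempt is read as the class [[A-Za-z] (a literal "[" or a letter)
// followed by the three literal characters -'] so a lone hyphen does not match.
$nested = '/^[[A-Za-z]-\']$/';
var_dump(preg_match($nested, '-'));     // int(0)
var_dump(preg_match($nested, "a-']"));  // int(1)
```

So the nested form is not a syntax error, but it silently means something other than what was intended.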
I don't understand. What's wrong with [A-Za-z'-]?
[\pL\p{Pd}'ʹ’]
That matches:
- any Letter character
- any Dash Punctuation character
- U+0027 APOSTROPHE (which is not the preferred form)
- U+02B9 MODIFIER LETTER PRIME
- U+2019 RIGHT SINGLE QUOTATION MARK
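In PHP, a class like this needs the u modifier so that the pattern and the subject are treated as UTF-8. The following is a minimal sketch under that assumption; the function name is_unicode_name is hypothetical, and the two non-ASCII characters are written as \x{...} escapes so the example does not depend on the source file's encoding.

```php
<?php
// Hypothetical helper: true if every character of $text is a Unicode letter,
// a dash punctuation character, or one of the three apostrophe-like characters
// listed above (U+0027, U+02B9, U+2019).
function is_unicode_name(string $text): bool
{
    // \A and \z anchor to the absolute start and end of the subject;
    // the u modifier enables UTF-8 mode and Unicode property matching.
    return preg_match('/\A[\pL\p{Pd}\'\x{02B9}\x{2019}]+\z/u', $text) === 1;
}

var_dump(is_unicode_name("O'Brien"));        // bool(true)
var_dump(is_unicode_name("Jean-Luc"));       // bool(true)
var_dump(is_unicode_name("Zo\u{00EB}"));     // bool(true), U+00EB is a letter
var_dump(is_unicode_name("D\u{2019}Arcy"));  // bool(true)
var_dump(is_unicode_name("R2-D2"));          // bool(false), digits are not letters
```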
To ensure that a string contains only the desired characters, you can check it in two ways:
- You know it's good if all chars in the string are valid.
- You know it's bad if any one char in the string is invalid.
Here is a PHP snippet that demonstrates both methods:
// Method 1: Good if all chars in the string are valid.
$re_all_valid = '/^[A-Za-z\-\']*$/';
if (preg_match($re_all_valid, $text)) {
    echo("GOOD: String contains all valid characters.\n");
} else {
    echo("BAD: String does NOT contain all valid characters.\n");
}

// Method 2: Bad if any one char in the string is invalid.
$re_one_invalid = '/[^A-Za-z\-\']/';
if (preg_match($re_one_invalid, $text)) {
    echo("BAD: String contains one invalid character.\n");
} else {
    echo("GOOD: String does NOT contain one invalid character.\n");
}
Notes: Method 1 requires anchors at both ends of the string and a quantifier applied to the positive character class. Method 2 uses a negated character class and only needs to match one character in the string. Method 2 is likely more efficient.
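One detail worth checking with Method 1: in PCRE, $ also matches just before a final newline, and the * quantifier accepts an empty string. If either of those matters for your input, a stricter variant using \A, \z and + could look like the sketch below. The $strict pattern is an assumed alternative, not part of the original snippet.

```php
<?php
$loose  = '/^[A-Za-z\-\']*$/';    // Method 1 pattern as written above
$strict = '/\A[A-Za-z\-\']+\z/';  // assumed variant: absolute anchors, at least one char

var_dump(preg_match($loose, "O'Neil\n"));   // int(1): $ matches before the final newline
var_dump(preg_match($strict, "O'Neil\n"));  // int(0)
var_dump(preg_match($loose, ''));           // int(1): * allows the empty string
var_dump(preg_match($strict, ''));          // int(0)
var_dump(preg_match($strict, "O'Neil"));    // int(1)
```

Method 2 is not affected by the newline issue, because a trailing newline is itself a character outside the class and is reported as invalid.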