Cyrillic alphabet validation
I came across an interesting defect today. I have a deployment of my web application in Russia, and the name value "Наталья" is not returning true as alphanumeric in the method below. I'm curious how people would approach a problem like this. - Duncan
private boolean isAlphaNumeric(String str) {
return str.matches("[\\w-']+");
}
You have to use a Unicode-aware regex. By default, \w in Java matches only [a-zA-Z_0-9], so Cyrillic letters never match. Use, for example, \p{L}+
for any Unicode letter. For more, see the Javadoc for java.util.regex.Pattern;
there is a section on Unicode support. Also, there are details here: link
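As a rough sketch of that suggestion, the check from the question could be rewritten in either of the two ways below. The class name UnicodeNameCheck and the method names are made up for illustration; the character class assumes you want letters, digits, underscore, apostrophe and hyphen, which mirrors the original `[\w-']` pattern.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the original post.
class UnicodeNameCheck {

    // Same behaviour as the question's pattern: \w is ASCII-only by default.
    private static final Pattern ASCII = Pattern.compile("[\\w'-]+");

    // \p{L} = any Unicode letter, \p{N} = any Unicode number.
    private static final Pattern UNICODE = Pattern.compile("[\\p{L}\\p{N}_'-]+");

    // Alternative: keep \w but make it Unicode-aware with a flag (Java 7+).
    private static final Pattern UNICODE_FLAG =
            Pattern.compile("[\\w'-]+", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean isAsciiAlphaNumeric(String str) {
        return ASCII.matcher(str).matches();
    }

    static boolean isAlphaNumeric(String str) {
        return UNICODE.matcher(str).matches();
    }

    static boolean isAlphaNumericWithFlag(String str) {
        return UNICODE_FLAG.matcher(str).matches();
    }

    public static void main(String[] args) {
        String name = "Наталья";
        System.out.println(isAsciiAlphaNumeric(name));    // false
        System.out.println(isAlphaNumeric(name));         // true
        System.out.println(isAlphaNumericWithFlag(name)); // true
    }
}
```

The explicit \p{L} version accepts letters from every script, which may or may not be what you want; if only Russian names should pass, a narrower character class is the safer choice.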
In my case I have to check whether it's a name written in Russian.
I've ended up with this:
private static final String ruNameRegEx = "[А-ЯЁ][-А-яЁё]+";
and for the full name:
private static final String ruNamePart = "[А-яЁё][-А-яЁё]+";
private static final String ruFullNameRegEx = "\\s*[А-ЯЁ][-А-яЁё]+\\s+(" + ruNamePart + "\\s+){1,5}" + ruNamePart + "\\s*";
The last one covers some complex cases:
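// JUnit 4 imports are an assumption; the original post does not say which JUnit version is used.
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.junit.Test;

// The class cannot be called Test, because that name would shadow the @Test annotation.
public class RuFullNameTest {
// Assumes ruNamePart and ruFullNameRegEx from above are declared in this class.
private final Pattern ruFullNamePattern = Pattern.compile(ruFullNameRegEx);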
@Test
public void test1() {
assertTrue(isRuFullName("Иванов Василий Иванович"));
}
@Test
public void test2() {
assertTrue(isRuFullName(" Иванов Василий Акимович "));
}
@Test
public void test3() {
assertTrue(isRuFullName("Ёлкин Василий Иванович"));
}
@Test
public void test4() {
assertTrue(isRuFullName("Иванов Василий Аксёнович"));
}
@Test
public void test5() {
assertFalse(isRuFullName("иванов василий акимович"));
}
@Test
public void test6() {
assertFalse(isRuFullName("Иванов С.В."));
}
@Test
public void test7() {
assertTrue(isRuFullName("Мамин-Сибиряк Анна-Мария Иоановна"));
}
@Test
public void test8() {
assertTrue(isRuFullName("Хаджа Насредин Махмуд-Азгы-Бек"));
}
@Test
public void test9() {
assertTrue(isRuFullName("Хаджа Насредин ибн Махмуд"));
}
private boolean isRuFullName(String testString) {
Matcher m = ruFullNamePattern.matcher(testString);
return m.matches();
}
}
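The character classes above list Ё and ё explicitly because those two letters sit outside the contiguous А-я block in Unicode: А is U+0410 and я is U+044F, while Ё is U+0401 and ё is U+0451. A small sketch of the difference follows; the class and method names are hypothetical.

```java
// Hypothetical demo class; not part of the original answer.
class CyrillicRangeDemo {

    // Matches only U+0410..U+044F, so it misses Ё and ё.
    static boolean inBasicRange(String s) {
        return s.matches("[А-я]+");
    }

    // Adds the two letters that fall outside the range.
    static boolean inRangeWithYo(String s) {
        return s.matches("[А-яЁё]+");
    }

    public static void main(String[] args) {
        System.out.println(inBasicRange("Ёлкин"));      // false: Ё is U+0401
        System.out.println(inRangeWithYo("Ёлкин"));     // true
        System.out.println(inBasicRange("Аксёнович"));  // false: ё is U+0451
        System.out.println(inRangeWithYo("Аксёнович")); // true
    }
}
```

This is why test3 and test4 above would fail if the regex used a bare [А-я] range.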