Regex for replacing à,Á,Ä etc. -> a, Õ,ò, etc. -> o
The Western Latin character set contains characters such as À Á Â Ã Ä Å, which all share the same standard character 'a' as their 'radix'. The same applies to e, i, o, etc. Is there a regex for replacing these variants with their 'radix' characters?
This would be used (among other things) to create an SEO-friendly URL from a text:
Example: La cena è pronta => La cena e pronta
Try this:
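// Requires: using System.Text; using System.Text.RegularExpressions;
string str = "La cena è pronta àèéìòùçæÀÈÉÌÒÙÇÆ";
// Decompose each accented letter into its base letter plus combining mark(s).
str = str.Normalize(NormalizationForm.FormD); // Or use NormalizationForm.FormKD
// Remove the combining marks (Unicode category Mn, nonspacing mark).
str = Regex.Replace(str, @"\p{Mn}", "");
// Result: La cena e pronta aeeioucæAEEIOUCÆ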
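But note that Æ remains Æ (and æ remains æ): these ligatures have no Unicode decomposition into a base letter plus a combining mark, so normalization leaves them untouched.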
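If letters such as Æ need to be handled as well, one option is to follow the normalization step with a small manual replacement table. The following is only a sketch under stated assumptions: the `SlugHelper` class, its `Fallbacks` table and the `ToSlug` method are hypothetical names that are not part of the answer above, and the table is deliberately incomplete, so extend it for the languages you care about.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original answer.
public static class SlugHelper
{
    // Assumption: a hand-picked, incomplete table of letters that have no
    // Unicode decomposition and therefore survive FormD + \p{Mn} removal.
    private static readonly Dictionary<char, string> Fallbacks =
        new Dictionary<char, string>
        {
            { 'Æ', "AE" }, { 'æ', "ae" },
            { 'Ø', "O" },  { 'ø', "o" },
            { 'ß', "ss" },
        };

    public static string RemoveDiacritics(string text)
    {
        // Same two steps as the answer: decompose, then drop combining marks.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        string stripped = Regex.Replace(decomposed, @"\p{Mn}", "");

        // Then map the leftovers that normalization cannot split.
        var sb = new StringBuilder(stripped.Length);
        foreach (char c in stripped)
        {
            if (Fallbacks.TryGetValue(c, out string replacement))
                sb.Append(replacement);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    public static string ToSlug(string text)
    {
        string ascii = RemoveDiacritics(text).ToLowerInvariant();
        // Collapse every run of characters other than a-z and 0-9 into one hyphen.
        string slug = Regex.Replace(ascii, @"[^a-z0-9]+", "-");
        return slug.Trim('-');
    }
}
```

With this sketch, `SlugHelper.ToSlug("La cena è pronta")` yields `la-cena-e-pronta`, and `SlugHelper.RemoveDiacritics("ÀÈÉÌÒÙÇÆ")` yields `AEEIOUCAE`.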