Convert French to ASCII (French speakers are wanted)
I need to convert French text to its closest ASCII equivalent. Let me explain: in German you convert ä to ae, which is not simply removing diacritics but finding the most accurate equivalent. Please help me with French. I found no programmatic way to do this, so I am creating a Dictionary<char, string>.
Characters to convert (plus their capitals): é, à, è, ù, â, ê, î, ô, û, ë, ï, ü, ÿ, ç, and any others you suggest. Please write your suggested ASCII substitutions.
Thanks, Andrey.
PS: Please don't point me to How do I remove diacritics (accents) from a string in .NET?. That method is great but language-agnostic: it just strips diacritics. I plan to use it as a fallback when I don't have a good equivalent.
PPS: Please don't close the question; it is related to programming, since I am implementing a multilingual app.
As far as I know, when accents aren't available in French (i.e., when converting to ASCII) you simply type the unaccented base letter (unlike German, where you can add an e after the vowel that carries the umlaut). Of the characters you listed, I've never seen ÿ used in French. Don't forget æ and œ.
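A minimal sketch of that rule, in Python rather than .NET (the same table would map onto a Dictionary<char, string>). It assumes the usual two-letter spellings ae and oe for the ligatures, which the answer above does not spell out, and the names LIGATURES and french_to_ascii are made up for illustration. Everything else is handled by Unicode decomposition, so the accented letters from the question need no table entries.

```python
import unicodedata

# Ligatures have no combining-mark decomposition, so they need explicit entries.
# Assumption: "ae"/"oe" are the intended spellings; "Oe" may read better than
# "OE" at the start of a capitalised word.
LIGATURES = {"æ": "ae", "Æ": "AE", "œ": "oe", "Œ": "OE"}


def french_to_ascii(text: str) -> str:
    """Replace ligatures, then drop accents from the remaining letters."""
    out = []
    for ch in text:
        if ch in LIGATURES:
            out.append(LIGATURES[ch])
            continue
        # NFD splits e.g. "é" into "e" + U+0301; keep only the non-combining part.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    # Characters with no decomposition and no table entry pass through unchanged.
    return "".join(out)


print(french_to_ascii("Les œuvres de l'été"))  # Les oeuvres de l'ete
```

Keeping the ligatures in a table and leaving the rest to decomposition means the table only holds the cases where plain accent stripping would give the wrong result.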
Normally, when accents aren't available, we simply don't write them.
If you want to retain the information, you need some kind of encoding that indicates which character set is in use and goes beyond ASCII (that is, uses characters 128 to 255 of the charset).
Alternatively, you could devise an encoding of your own. Sparcstations had a way of entering accented characters:
à = \a`
â = \a^
ç = \c,
é = \e'
ë = \e"
etc.
But that is an encoding method for storing the data, not a transliteration method for writing it down for French readers. I'm afraid we haven't adopted an alternative to accents yet.
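The backslash notation listed above could be sketched roughly like this in Python. The mark table is generalised from the five examples given (grave, circumflex, cedilla, acute, diaeresis), so treat anything beyond those as an assumption. The function names are hypothetical, and the sketch assumes the input contains no literal backslashes.

```python
import re
import unicodedata

# Combining mark -> ASCII stand-in, generalised from the five examples above.
MARKS = {
    "\u0300": "`",   # grave:      à -> \a`
    "\u0302": "^",   # circumflex: â -> \a^
    "\u0327": ",",   # cedilla:    ç -> \c,
    "\u0301": "'",   # acute:      é -> \e'
    "\u0308": '"',   # diaeresis:  ë -> \e"
}
ASCII_TO_MARK = {ascii_ch: mark for mark, ascii_ch in MARKS.items()}

# Backslash, one ASCII letter, one of the stand-in characters.
_ENCODED = re.compile(
    r"\\([A-Za-z])([" + re.escape("".join(ASCII_TO_MARK)) + "])"
)


def encode(text: str) -> str:
    """Rewrite accented ASCII-based letters as backslash + letter + mark."""
    out = []
    for ch in text:
        d = unicodedata.normalize("NFD", ch)
        if len(d) == 2 and d[0].isascii() and d[0].isalpha() and d[1] in MARKS:
            out.append("\\" + d[0] + MARKS[d[1]])
        else:
            out.append(ch)  # anything else is left untouched
    return "".join(out)


def decode(text: str) -> str:
    """Turn backslash sequences back into precomposed accented letters."""
    def restore(m):
        return unicodedata.normalize("NFC", m.group(1) + ASCII_TO_MARK[m.group(2)])
    return _ENCODED.sub(restore, text)
```

Because decode reverses encode for the letters it covers, the stored text stays pure ASCII without losing the accents, which is what separates this from transliteration.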