Recognizing Tamil strings and processing them in C or C++ using Unicode

2023-04-02 21:44 Q&A author:

The input is given in a language whose script is not the Roman alphabet. A program in C or C++ must recognize it.

How do I take input in Tamil and split it into letters so that I can recognize each Tamil letter?

How do I use wchar_t and locale?

Neither the C++ standard library nor the C standard library handles Unicode completely; you would be better off using a cross-platform library such as Boost.

Including windows.h and using the WinAPI allows you to use Unicode, but only in Win32 programs.

See here for a previous rant of mine on this subject.

Assuming that your platform is capable of handling Tamil characters, I suggest the following sequence of events:

I. Get the input string into a wide string:

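#include <clocale>
#include <cstdlib>

int main()
{
  setlocale(LC_CTYPE, ""); // adopt the user's locale so mbstowcs knows the input encoding
  const char * s = getInputString(); // e.g. from the command line

  const size_t wl = mbstowcs(NULL, s, 0); // length in wide characters, without the terminator
  if (wl == (size_t)-1) return 1; // s is not valid in the current locale's encoding
  wchar_t * ws = new wchar_t[wl + 1]; // +1 for the terminating null
  mbstowcs(ws, s, wl + 1);
  //...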
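
The fragment above can also be wrapped into a small helper that owns its buffer. The following is a minimal sketch, not part of the original answer: the name to_wide is hypothetical, and it assumes the caller has already called setlocale(LC_CTYPE, "") so that mbstowcs decodes the bytes according to the user's locale (for example a UTF-8 locale). It also relies on the POSIX-specified behaviour that mbstowcs returns the required length when the destination is a null pointer.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a multibyte string, encoded according to the
// current LC_CTYPE locale, into a wide string.
std::wstring to_wide(const char* s)
{
    assert(s != nullptr); // precondition: a valid C string

    // First pass: count the wide characters, not including the terminator.
    // (Returning the count for a null destination is specified by POSIX.)
    const std::size_t n = std::mbstowcs(nullptr, s, 0);
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("input is not valid in the current locale");

    // Second pass: convert. The string owns the storage, so nothing leaks.
    std::wstring ws(n, L'\0');
    if (n != 0)
        std::mbstowcs(&ws[0], s, n);
    return ws;
}
```

A typical use would be to call setlocale(LC_CTYPE, "") once at startup and then write std::wstring ws = to_wide(argv[1]); the exception covers input whose bytes do not match the locale's encoding.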

II. Convert the wide string into a string with definite encoding:

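#include <iconv.h>
#include <cstdint>

// ...

iconv_t cd = iconv_open("UTF-32", "WCHAR_T"); // returns (iconv_t)-1 on failure
size_t nout = wl + 1; // +1: some implementations emit a byte order mark first
uint32_t * us = new uint32_t[nout];
char * in = reinterpret_cast<char*>(ws);
char * out = reinterpret_cast<char*>(us);
size_t iin = wl * sizeof(wchar_t); // iconv counts bytes, not characters
size_t iout = nout * sizeof(uint32_t);
iconv(cd, &in, &iin, &out, &iout); // takes pointers to the buffer pointers and advances them
iconv_close(cd);
const size_t n = nout - iout / sizeof(uint32_t); // number of 32-bit units actually written

// ...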

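Finally, the array us holds the Unicode code points that made up your input text (possibly preceded by a byte order mark, U+FEFF, which some iconv implementations emit for the UTF-32 target; skip it if present). You can now process this array, e.g. by checking whether each code point falls in the Tamil block, U+0B80 to U+0BFF, and do with it whatever you see fit. Note that a single Tamil letter as the reader perceives it may span several code points, for example a consonant followed by a vowel sign.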
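
The lookup described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the original answer's code: the names is_tamil, is_tamil_dependent_sign and split_letters are hypothetical, the input is assumed to be a sequence of UTF-32 code points, and the grouping rule (a base code point followed by any Tamil vowel signs, virama, anusvara or au length mark) is a simplification of the full Unicode grapheme cluster rules in UAX #29.

```cpp
#include <cassert>
#include <vector>

// True if cp lies in the Unicode Tamil block, U+0B80..U+0BFF.
constexpr bool is_tamil(char32_t cp)
{
    return cp >= 0x0B80 && cp <= 0x0BFF;
}

// Approximation: signs that attach to a preceding base letter, namely the
// anusvara (U+0B82), the dependent vowel signs and virama (U+0BBE..U+0BCD),
// and the au length mark (U+0BD7).
constexpr bool is_tamil_dependent_sign(char32_t cp)
{
    return cp == 0x0B82 || (cp >= 0x0BBE && cp <= 0x0BCD) || cp == 0x0BD7;
}

// Hypothetical helper: group code points into letters, where each letter is
// one base code point followed by any dependent signs.
std::vector<std::vector<char32_t>> split_letters(const std::vector<char32_t>& cps)
{
    std::vector<std::vector<char32_t>> letters;
    for (char32_t cp : cps) {
        assert(cp <= 0x10FFFF); // precondition: a valid Unicode code point
        if (letters.empty() || !is_tamil_dependent_sign(cp))
            letters.emplace_back(); // start a new letter
        letters.back().push_back(cp);
    }
    return letters;
}
```

For the word formed by U+0BA4, U+0BAE, U+0BBF, U+0BB4, U+0BCD (TA, MA, vowel sign I, LLLA, virama), split_letters yields three groups of one, two and two code points. Conjuncts that some readers treat as a single letter, such as KA + virama + SSA, are still split at the second consonant by this rule; a library implementing UAX #29 is the safer choice when that matters.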
