Removing accent characters from Arabic text

2023-01-06 04:10

I have a database that uses utf8 with the utf8_bin collation. It contains Arabic text with accent characters (kasra etc.). I want to search the text while ignoring the accent characters.

What I did was write a LIKE query:

WHERE replace(field1, 0x[CODE], '') like '%[arabic text]%' --- where [CODE] is the accent char.

This works fine when I put a static UTF-8 Arabic string in place of field1, but nothing gets replaced when the query runs against the database field.

Maybe the database field is actually latin1-encoded. How can I check and fix this?

Thanks.
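
One way to reason about the latin1 suspicion is to look at the raw bytes. The sketch below is not from the original posts; it assumes the failure mode is "double encoding" (UTF-8 bytes read as latin1 and re-encoded as UTF-8 on insert), and it uses Python's latin-1 codec as a stand-in for MySQL's latin1, which is closer to Windows-1252, so the exact bytes on a real server may differ. The sample word and the repair() helper are hypothetical. On the MySQL side, SELECT HEX(field1) shows the stored bytes for comparison.

```python
# Sketch: why replacing the UTF-8 bytes of a diacritic can silently miss
# when the stored text went through a latin1 connection (assumed scenario).

KASRA = "\u0650"                              # Arabic kasra, UTF-8 bytes D9 90
word = "\u0643\u0650\u062A\u0627\u0628"       # sample word with one kasra

# Correctly stored UTF-8: the kasra is present as the byte pair D9 90.
good = word.encode("utf-8")

# Assumed failure mode: the UTF-8 bytes were treated as latin1 characters
# and encoded to UTF-8 a second time.
double = good.decode("latin-1").encode("utf-8")

# The byte pattern a hex literal such as 0xD990 would search for.
needle = KASRA.encode("utf-8")

print(good.hex())         # Arabic letters appear as d8xx / d9xx pairs
print(double.hex())       # double-encoded data shows c398 / c399 prefixes
print(needle in good)     # the replace would find the kasra here
print(needle in double)   # ...but not here


def repair(raw: bytes) -> bytes:
    """Undo one round of latin1 double encoding (hypothetical helper)."""
    return raw.decode("utf-8").encode("latin-1")
```

If the hex dump of the column looks like the second line rather than the first, the replace cannot match until the data is converted back to real UTF-8.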

utf8_bin is a binary collation: it compares characters by their exact binary value, so an accented and an unaccented letter never match.

utf8_general_ci is more lenient and normalizes umlauts and accents to their "basic" version:

Ä => A

Ü => U etc.

I have never worked with Arabic, so I don't know whether this also applies to its accents, but I would expect so.

You should be able to temporarily use the lenient collation like this:

WHERE field1 like '%[arabic text]%' COLLATE utf8_general_ci;

I used the query below in PostgreSQL (via Navicat) to create a new accent-free column from the original accented text:

WITH remove_accent_1 AS (
  SELECT
    index, text as original_text, replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace("text", 'ٍ',''), 'ِ',''), 'ً',''), 'َ',''), 'ّ',''), 'ْ',''), 'ۖ',''), 'ٌ',''), 'ُ',''), 'آ','ا'), 'ٰ ',''), ' ۚ ',''), 'ۗ ','') as new_text
  FROM "arabic_text"
  ORDER BY "index" ASC
)
SELECT index, original_text, new_text
FROM remove_accent_1
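
The same stripping can also be done in application code with a single character class instead of a chain of nested replace() calls. This is only a minimal Python sketch, not part of the original answer: the function name and sample string are made up for illustration, the Quranic range U+06D6 to U+06DC is slightly wider than the three marks listed in the query, and surrounding spaces are left untouched.

```python
import re

# Marks covered (based on the replace() chain in the query above):
#   U+064B-U+0652  tanwin, short vowels, shadda, sukun
#   U+0670         superscript alef
#   U+06D6-U+06DC  small high Quranic annotation marks (assumption: the
#                  whole range is stripped, not only the three in the query)
DIACRITICS = re.compile("[\u064B-\u0652\u0670\u06D6-\u06DC]")


def strip_arabic_diacritics(text: str) -> str:
    """Remove the marks above and fold alef-with-madda to plain alef."""
    return DIACRITICS.sub("", text).replace("\u0622", "\u0627")


# Hypothetical sample: a word carrying kasra, fatha and dammatan.
sample = "\u0643\u0650\u062A\u064E\u0627\u0628\u064C"
print(strip_arabic_diacritics(sample))
```

Applying the same function to both the stored text and the search term keeps the two sides comparable.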
