ORA-29275: partial multibyte character

2022-12-13 04:49 问答作者：

I have inp开发者_JAVA百科ut data coming from a flat file which has english, japanese, chinese characters in one column. I am loading these values in a staging table column whose schema definition is VARCHAR2(250 CHAR), the main table column has definition VARCHAR2(250) WHICH i can not change. So, i am doing a SUBSTR on this column. After loading the table when i did a

SELECT * FROM TABLE

...I get this error :

ORA-29275: partial multibyte character

If i select other columns then no issues.

you should use SUBSTRB when you copy your data from your 250 CHAR column to your 250 byte column. This function will only output whole characters (you won't get incomplete unicode characters):

SQL> select substrb('中华人', 1, 9) ch9,
  2         substrb('中华人', 1, 8) ch8,
  3         substrb('中华人', 1, 7) ch7,
  4         substrb('中华人', 1, 6) ch6,
  5         substrb('中华人', 1, 5) ch5
  6    FROM dual;

CH9       CH8      CH7     CH6    CH5
--------- -------- ------- ------ -----
中华人       中华       中华      中华     中

Edit:

@mwardm made an interesting comment concerning the actual length of the resulting string and whether the resulting string could contain an invalid sequence of bytes. Consider the following on an AL32UTF8 DB:

SQL> select lengthb('ÏÏÏ'),
  2         lengthb(substrb('ÏÏÏÏÏÏ', 1, 5)),
  3         dump('ÏÏÏ'),
  4         dump(substrb('ÏÏÏÏÏÏ', 1, 5))
  5    FROM dual;

LE LE DUMP('ÏÏÏ')                           DUMP(SUBSTRB('ÏÏÏÏÏÏ',1,5))
-- -- ------------------------------------- -------------------------------
 6  5 Typ=96 Len=6: 195,143,195,143,195,143 Typ=1 Len=5: 195,143,195,143,32

As you can see the last byte of the substrb string is not the truncated first byte of the special character but encodes a legit character (The first 128 characters in this character set are the same as the ASCII7US character set so this encodes the ' ' space character, using RTRIM as suggested in another answer will remove the last character).

Furthermore, I also got this interesting result using the character set AL16UTF16:

SQL> select lengthb(N'ĈĈ') le,
  2         dump(N'ĈĈ') dump,
  3         lengthb(substrb(N'Ĉ', 1, 3)) length_substr,
  4         dump(substrb(N'ĈĈ', 1, 3)) dump_substr
  5    from dual;

        LE DUMP                    LENGTH_SUBSTR DUMP_SUBSTR
---------- ----------------------- ------------- -----------------
         4 Typ=96 Len=4: 1,8,1,8               2 Typ=1 Len=2: 1,8

In this case Oracle has choosen to cut the string after the second byte because in the AL16UTF16 character set there is no legit one-byte character. The resulting string is only 2 bytes instead of 3.

This would need further testing and is by no mean a rigorous demonstration but I still stand by my first hunch that substrb will return a valid sequence of bytes that encodes a valid string of characters.

I think i might have found a good way to do it if you do rtrim(substrb('中华人', 1, 8)) you get '中华' and a byte length of the expected 6

please try

Using substr will behave differently depending on the database character set. I assume from your description that your DB character set is not one of the Unicode variants, and you must truncate the varchar2(250 char) data to 250 BYTES or less. This is dangerous because it can stop in the middle of a 2-byte character, resulting in the message you got. You should look at the documentation for substrc(), which will calculate its length based on characters and not bytes.

It might help if you explain more why you are required to throw away part of the data.

继续阅读：oracle sql

ORA-29275: partial multibyte character

Edit:

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

Edit:

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？