C# ASCII GetBytes: how to set which character is used for unrecognized characters?

I am porting some code from native C++ to C# and I need to do the following:

When ASCII.GetBytes encounters a Unicode character it does not recognize, it returns the character with decimal value 63 (question mark). My C++ code, which uses WideCharToMultiByte(CP_ACP, ..., instead produces the character with decimal value 37 (% sign) when it encounters a character it doesn't know.

My question is: how can I make ASCII.GetBytes return #37 instead of #63 for unknown characters?


In C#, you can use the DecoderFallback/EncoderFallback of an encoding to decide how it will behave. You can't change the fallback of Encoding.ASCII itself, but you can clone it and then set the fallback. Here's an example:

using System;
using System.Text;

class Test
{    
    static void Main()
    {
        Encoding asciiClone = (Encoding) Encoding.ASCII.Clone();
        asciiClone.DecoderFallback = new DecoderReplacementFallback("%");
        asciiClone.EncoderFallback = new EncoderReplacementFallback("%");

        byte[] bytes = { 65, 200, 66 };
        string text = asciiClone.GetString(bytes);
        Console.WriteLine(text); // Prints A%B
        bytes = asciiClone.GetBytes("A\u00ffB");
        Console.WriteLine(bytes[1]); // Prints 37
    }
}
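
As an alternative to cloning, the fallbacks can also be supplied when the encoding is created, through the Encoding.GetEncoding(name, encoderFallback, decoderFallback) overload. The following is only a minimal sketch of that route; the PercentAscii class and its method names are hypothetical and exist purely for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the original answer): an ASCII encoding
// that substitutes '%' for anything it cannot encode or decode.
static class PercentAscii
{
    public static Encoding Create()
    {
        // "us-ascii" is the registered name of the ASCII encoding.
        return Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback("%"),
            new DecoderReplacementFallback("%"));
    }

    // Encodes text to ASCII bytes, using '%' (37) for unknown characters.
    public static byte[] Encode(string text)
    {
        return Create().GetBytes(text);
    }

    // Decodes ASCII bytes, using '%' for bytes outside the 0-127 range.
    public static string Decode(byte[] bytes)
    {
        return Create().GetString(bytes);
    }
}
```

With this helper, PercentAscii.Encode("A\u00ffB") yields the bytes 65, 37, 66, and PercentAscii.Decode(new byte[] { 65, 200, 66 }) yields "A%B", matching the cloned encoding above.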


Presumably the C++ code calls WideCharToMultiByte with lpDefaultChar = "%".

There's no way to pass this into the Encoding.GetBytes call, but you could call WideCharToMultiByte using P/Invoke.
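
A rough sketch of the P/Invoke route is shown here, assuming the C++ code passes CP_ACP with no flags and a single-byte default character. The NativeAnsi class and ToAnsi method are hypothetical names, and the code only works on Windows. Note that CP_ACP converts to the system ANSI code page rather than to pure ASCII, so characters that exist in that code page are converted instead of being replaced.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical wrapper (Windows only) around the Win32 WideCharToMultiByte call.
static class NativeAnsi
{
    private const uint CP_ACP = 0; // system default ANSI code page

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern int WideCharToMultiByte(
        uint codePage, uint flags,
        [MarshalAs(UnmanagedType.LPWStr)] string wideStr, int wideLen,
        [Out] byte[] multiByteStr, int multiByteLen,
        byte[] defaultChar, out bool usedDefaultChar);

    // Converts text to the ANSI code page, substituting defaultChar
    // (for example 37 for '%') for characters that cannot be represented.
    public static byte[] ToAnsi(string text, byte defaultChar)
    {
        if (text.Length == 0) return new byte[0];

        byte[] fallback = { defaultChar, 0 }; // null-terminated default char
        bool usedDefault;

        // First call: a zero-length output buffer asks for the required size.
        int size = WideCharToMultiByte(CP_ACP, 0, text, text.Length,
                                       null, 0, fallback, out usedDefault);
        if (size == 0) throw new Win32Exception(Marshal.GetLastWin32Error());

        // Second call: perform the actual conversion.
        byte[] buffer = new byte[size];
        int written = WideCharToMultiByte(CP_ACP, 0, text, text.Length,
                                          buffer, size, fallback, out usedDefault);
        if (written == 0) throw new Win32Exception(Marshal.GetLastWin32Error());

        return buffer;
    }
}
```

A call such as NativeAnsi.ToAnsi(text, 37) would then mirror the native behaviour, including whatever the current ANSI code page happens to support.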
