How big should my nvarchar column be to store a maximum of 255 characters?
I need to store 255 characters in a database column of type nvarchar. The characters are UTF-8 and can be multibyte. I am not the best with character encodings, so I'm not sure if that makes sense. I want to hold 255 characters that can be in any language, etc.
You can find some simple-to-understand background information about different Unicode encodings in this, which is a chapter I wrote in a manual for an open-source project. That background information will help you to understand some of the details in my answer.
The link to documentation about nvarchar provided by Simmo states that nvarchar is stored in UCS-2 format. Because of this, you will need to convert the UTF-8 strings into UCS-2 strings before storing them in the database. You can find C++ code to do that conversion here.
A subtle but important point is that the conversion code will actually convert into UTF-16, which is a superset of UCS-2 (UTF-16 supports the use of surrogate pairs, while UCS-2 doesn't). I don't use SQL Server so I don't know if it will complain if you try to insert some surrogate pairs into it. (Perhaps somebody else here can confirm whether or not it will).
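The difference between the two encodings only shows up for characters outside the Basic Multilingual Plane. A small Python sketch (the helper name `utf16_code_units` is hypothetical) lists the 16-bit code units that UTF-16 produces, so you can see where a surrogate pair appears:

```python
import struct


def utf16_code_units(text: str) -> list:
    """Return the UTF-16 code units of text as a list of integers."""
    raw = text.encode("utf-16-le")
    # "<" = little-endian, "H" = unsigned 16-bit integer, one per code unit.
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))


# A character in the Basic Multilingual Plane: one code unit, same in UCS-2.
print([hex(u) for u in utf16_code_units("A")])

# U+1D11E (musical G clef) lies outside the BMP: UTF-16 needs two code units,
# a high surrogate followed by a low surrogate. UCS-2 cannot represent it.
print([hex(u) for u in utf16_code_units("\U0001D11E")])
```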
If SQL Server disallows surrogate pairs, then there will be a limit on the range of languages your application can support, but at least you know that nvarchar(255) is sufficient for your needs.
On the other hand, if SQL Server allows the use of surrogate pairs, then you might want to use nvarchar(510) to allow for the (remote) possibility that every single character will be encoded as a surrogate pair.
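The arithmetic behind those two sizes can be checked with a short Python sketch. It assumes the declared nvarchar length is counted in 16-bit code units, which is what the reasoning above takes for granted; the helper names `utf16_length` and `fits_in_column` are made up for illustration.

```python
def utf16_length(text: str) -> int:
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def fits_in_column(text: str, declared_length: int) -> bool:
    """True if text fits a column whose length is counted in code units (assumed)."""
    return utf16_length(text) <= declared_length


bmp_only = "a" * 255              # 255 characters, one code unit each
all_pairs = "\U0001D11E" * 255    # 255 characters, two code units each

print(utf16_length(bmp_only), fits_in_column(bmp_only, 255))
print(utf16_length(all_pairs), fits_in_column(all_pairs, 255),
      fits_in_column(all_pairs, 510))
```

The worst case is 255 × 2 = 510 code units, which is where the nvarchar(510) figure comes from.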
http://msdn.microsoft.com/en-us/library/ms186939.aspx
255 characters.