Difference between different string types in SQL Server?
What is the difference between char
, nchar
, ntext
, nvarchar
, text
and varchar
in SQL?
Is there really an application case for each of these types, or are some of them just开发者_C百科 deprecated?
text
and ntext
are deprecated, so lets omit them for a moment. For what is left, there are 3 dimensions:
- Unicode (UCS-2) vs. non-unicode:
N
in front of the name denotes Unicode - Fixed length vs. variable length:
var
denotes variable, otherwise fixed - In-row vs. BLOB:
(max)
as length denotes a BLOB, otherwise is an in-row value
So with this, you can read any type's meaning:
CHAR(10)
: is an in-row fixed length non-Unicode of size 10NVARCHAR(256)
: is an in-row variable length Unicode of size up-to 256VARCHAR(MAX)
: is a BLOB variable length non-Unicode
The deprecated types text
and ntext
correspond to the new types varchar(max)
and nvarchar(max)
respectively.
When you go to details, the meaning of in-row
vs. BLOB
blurs for small lengths as the engine may optimize the storage and pull a BLOB in-row or push an in-row value into the 'small BLOB' allocation unit, but this is just an implementation detail. See Table and Index Organization.
From a programming point of view, all types: CHAR
, VARCHAR
, NCHAR
, NVARCHAR
, VARCHAR(MAX)
and NVARCHAR(MAX)
, support an uniform string API: String Functions. The old, deprecated, types TEXT
and NTEXT
do not support this API, they have a separate, deperated, TEXT API to manipulate. You should not use the deprecated types.
BLOB types support efficient in-place updates by using the UPDATE table SET column.WRITE(@value, @offset)
syntax.
The difference between fixed-length and variable length types vanishes when row-compression on a table. With row-compression enabled, fixed lenght types and variable length are stored in the same format and trailing spaces are not stored on disk, see Row Compression Implementation. Note that page-compression implies row-compression.
- 'n' represents support for unicode characters.
- char - specifies string with fixed length storage. Space allocated with or without data present.
- varchar - Varying length storage. Space is allocated as much as length of data in column.
- text - To store huge data. The space allocated is 16 bytes for column storage.
Additionally - text and ntext have been deprecated for varchar(max) and nvarchar(max)
text
and ntext
are deprecated in favor of varchar(max)
and nvarchar(max)
The n prefix simply means Unicode. They "n" types work similarly to the plain versions except they work with Unicode text.
char is a fixed length field. Thus char(10) filled with "Yes" will still take 10 bytes of storage.
varchar is a variable length field. char(10) filled with "Yes" will take 5 bytes of storage (there is a 2 byte overhead for using var data types).
char(n) holding string of length x. Storage = n bytes. varchar(n) holding string of length x. Storage = x+2 bytes.
vchar and nvarchar are similar except it is 2 bytes per character.
Generally speaking you should only use char & char (over varchar & nvarchar) when working with fixed or semi-fixed strings. A good example would be a product_code or user_type which is always n characters long.
You shouldn't use text (or ntext) as it has been deprecated. varchar(max) & nvarchar(max) provides the same functionality.
N
prefix indicates unicode support and takes up twice the bytes per character of non-unicode.
Varchar
is variable length. You use an extra 2 bytes per field to store the length.
Char
is fixed length. If you know how long your data will be, use char
as you will save bytes!
Text
is mostly deprecated in my experience.
Be wary of using Varchar(max)
and NVarchar(max)
as these fields cannot be indexed.
I only know between "char" and "varchar".
char: it can allocate memory of specified size whether or not it is filled
varchar: it will allocate memory based on the number of characters in it but it should have some size called maximum size.
Text is meant for very large amounts of text, and is in general not meant to be searchable (but can be in some circumstances. It will be slow anyway).
The char/nchar datatypes are of fixed lenghts, and are padded if entered stuff is shorter, as opposed to the varchar/nvarchar types, which are variable length.
The n types have unicode support, where the non-n types don't.
Text is deprecated.
Char is a set value. When you say char(10), you are reserving 10 characters for every single row, whether they are used or not. Use this for something that shouldn't change lengths (For example, Zip Code or SSN)
varchar is variable. When you say varchar(10), 2 bytes is set aside to store the size of the data, as well as the actual data (which might be only say, four bytes).
The N represents uni-code. Twice the space.
n-prefix: unicode. var*: variable length, the rest is fixed length.
All data types are properly and nicely... documented.
Like here:
http://msdn.microsoft.com/en-us/library/ms187752.aspx
Is there really an application case for each of these types, or are some of them just deprecated?
No, there is a good case for ANY of them.
精彩评论