SQL Server nvarchar N prefix for varchar fields

I am using a SQL Server 2005/2008 Express database. Are there any problems with using the N string prefix (used for nvarchar fields) for varchar fields?

e.g. if I have a database field:

CREATE TABLE [dbo].[posts](
    post_title varchar(30)
)

And then I insert just ASCII data but with an N prefix:

INSERT INTO [dbo].[posts] ([post_title]) VALUES (N'My Title');

The problem arises because I want to save UTF-8 characters from a PHP application and I can't currently differentiate whether the field it is being saved to is varchar or nvarchar. So I just want to assume that all are nvarchar given that I will only ever try to save ASCII characters to varchar fields.


The accepted answer is misleading, but that is due, in part, to the question itself being ambiguous (though probably not intentionally).

Yes, any Unicode string (i.e. a literal prefixed with N, XML, or an NCHAR / NVARCHAR variable) will be implicitly converted to the 8-bit Code Page of the target when stored into a CHAR / VARCHAR / TEXT (don't use this one!) field. BUT, and this can be a rather important distinction in many cases, only Unicode code points in the range U+0000 to U+007F (i.e. ASCII values 0 - 127) are guaranteed to convert correctly. Everything from U+0080 (i.e. value 128) on up may or may not convert, depending on the Code Page implied by the Collation of the field being inserted into. If the Code Page of that Collation does not have a mapping for that character, then you get a ? instead.
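A minimal sketch of that conversion, assuming a VARCHAR column with the Latin1_General_CI_AS Collation (Code Page 1252). The table variable and its column names are made up for illustration; the second and third strings are chosen so that one has a mapping in that Code Page and the other does not:

```sql
-- Hypothetical table variable: one NVARCHAR copy for reference, one VARCHAR copy
-- whose Collation implies Code Page 1252.
DECLARE @demo TABLE
(
  original NVARCHAR(30),
  stored   VARCHAR(30) COLLATE Latin1_General_CI_AS
);

-- Plain 7-bit ASCII: always survives the implicit conversion.
INSERT INTO @demo (original, stored) VALUES (N'My Title', N'My Title');

-- U+00E9 is above U+007F but has a mapping in Code Page 1252.
INSERT INTO @demo (original, stored) VALUES (N'Café', N'Café');

-- These characters have no mapping in Code Page 1252.
INSERT INTO @demo (original, stored) VALUES (N'日本語', N'日本語');

-- Compare what was sent with what was actually stored.
SELECT original, stored FROM @demo;
```

Comparing the two columns row by row shows which values were changed on the way into the VARCHAR field.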

To find out what the Code Page is exactly, first find the Collation of the field via either of the following two queries:

SELECT * FROM sys.columns WHERE [object_id] = OBJECT_ID(N'table_name');

-- OR:

EXEC sp_help N'table_name';

Then you can find the Code Page from the Collation, using:

SELECT COLLATIONPROPERTY('collation_name', 'CodePage');
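The two lookups can also be combined into one query. This is only a sketch, assuming the table is the dbo.posts table from the Question; substitute your own table name:

```sql
-- List each character column of the table with its Collation and the
-- Code Page implied by that Collation.
SELECT c.[name] AS column_name,
       c.collation_name,
       COLLATIONPROPERTY(c.collation_name, 'CodePage') AS code_page
FROM   sys.columns AS c
WHERE  c.[object_id] = OBJECT_ID(N'dbo.posts')
  AND  c.collation_name IS NOT NULL;  -- non-character columns have no Collation
```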

And then you can find a chart on any one of several sites, based on that code page number, that will show you what is mapped.

Collations are set per field, not per row. So whatever the Collation is for a field determines the character set for non-Unicode fields (i.e. CHAR / VARCHAR / TEXT).

So the question is: what is meant by the term "ASCII" in the Question? It technically refers to just the 7-bit values (the first 128; values 0 - 127), but people often use it to mean anything that can fit into a single byte, which also includes the Extended ASCII values (the second 128; values 128 - 255), which are dependent on the Code Page.
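If the intent is the strict 7-bit meaning, one way to test a value before saving it is a LIKE pattern evaluated under a binary Collation. This is a sketch rather than a definitive check: the variable name is made up, and the range used here (space through tilde) covers printable ASCII only, so control characters such as tabs and line breaks are flagged as well:

```sql
DECLARE @value NVARCHAR(100);
SET @value = N'Café';

-- [^ -~] matches any character outside the range space (0x20) to tilde (0x7E).
-- The binary Collation makes the range compare by code point instead of by
-- linguistic sort order.
SELECT CASE
         WHEN @value LIKE N'%[^ -~]%' COLLATE Latin1_General_BIN2
           THEN 'contains characters outside printable 7-bit ASCII'
         ELSE 'printable 7-bit ASCII only'
       END AS ascii_check;
```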


Regarding the potential issue(s) surrounding comparing a VARCHAR column to NVARCHAR variables and literals: indexes will not be ignored, but there is some negative impact, and it varies based on the Collation of the VARCHAR column.

If the column Collation is a SQL Server Collation (i.e. one that starts with SQL_, such as SQL_Latin1_General_CP1_CI_AS), then you can get an Index Scan, but not a Seek.

But, if the column Collation is a Windows Collation (i.e. one that does not start with SQL_, such as Latin1_General_100_CI_AS), then you can get an Index Seek.

The following test shows this behavior:

-- DROP TABLE dbo.VarcharColumnIndex;
CREATE TABLE dbo.VarcharColumnIndex
(
  ID INT IDENTITY(1, 1) NOT NULL CONSTRAINT [PK_VarcharColumnIndex] PRIMARY KEY CLUSTERED,
  SqlServerCollation VARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AS,
  WindowsCollation VARCHAR(50) COLLATE Latin1_General_100_CI_AS
);

CREATE NONCLUSTERED INDEX [IX_VarcharColumnIndex_SqlServerCollation]
  ON dbo.VarcharColumnIndex ([SqlServerCollation]);
CREATE NONCLUSTERED INDEX [IX_VarcharColumnIndex_WindowsCollation]
  ON dbo.VarcharColumnIndex ([WindowsCollation]);

INSERT INTO dbo.VarcharColumnIndex ([SqlServerCollation], [WindowsCollation])
  VALUES ('a', 'b');

DECLARE @a NVARCHAR(50) = N'a';
SELECT [SqlServerCollation] FROM dbo.VarcharColumnIndex WHERE [SqlServerCollation] = @a;
-- Index Scan

DECLARE @b NVARCHAR(50) = N'b';
SELECT [WindowsCollation] FROM dbo.VarcharColumnIndex WHERE [WindowsCollation] = @b;
-- Index Seek
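One way to sidestep the implicit conversion of the column is to convert the NVARCHAR value to VARCHAR explicitly before comparing. The sketch below reuses the test table above; the variable name @param is made up, and the approach assumes the value contains only characters that exist in the column's Code Page, since anything else becomes ? during the conversion:

```sql
DECLARE @param NVARCHAR(50);
SET @param = N'a';

-- Convert the parameter once, so the comparison is VARCHAR to VARCHAR and the
-- column itself does not need to be converted to NVARCHAR.
SELECT [SqlServerCollation]
FROM   dbo.VarcharColumnIndex
WHERE  [SqlServerCollation] = CONVERT(VARCHAR(50), @param);
```

Checking the execution plan for this query against the first query in the test is the way to confirm what the optimizer actually does on a given server.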


If you write strings with the N prefix into a varchar field, they will be implicitly converted. There is no other overhead, and you can safely assume "everything is nvarchar".

There may be a problem comparing nvarchar variables to varchar columns because of data type precedence. The varchar column will be converted and any indexes won't be used.
