开发者

SQL Server silently truncates varchar's in stored procedures

According to this forum discussion, SQL Server (I'm using 2005 but I gather this also applies to 2000 and 2008) silently truncates any varchars you specify as stored procedure parameters to the length of the varchar, even if inserting that string directly using an INSERT would actually cause an error. eg. If I create this table:

CREATE TABLE testTable(
    [testStringField] [nvarchar](5) NOT NULL
)

then when I execute the following:

INSERT INTO testTable(testStringField) VALUES(N'string which is too long')

I get an error:

String or binary data would be truncated.
The statement has been terminated.

Great. Data integrity preserved, and the caller knows about it. Now let's define a stored procedure to insert that:

CREATE PROCED开发者_如何学CURE spTestTableInsert
    @testStringField [nvarchar](5)
AS
    INSERT INTO testTable(testStringField) VALUES(@testStringField)
GO

and execute it:

EXEC spTestTableInsert @testStringField = N'string which is too long'

No errors, 1 row affected. A row is inserted into the table, with testStringField as 'strin'. SQL Server silently truncated the stored procedure's varchar parameter.

Now, this behaviour might be convenient at times but I gather there is NO WAY to turn it off. This is extremely annoying, as I want the thing to error if I pass too long a string to the stored procedure. There seem to be 2 ways to deal with this.

First, declare the stored proc's @testStringField parameter as size 6, and check whether its length is over 5. This seems like a bit of a hack and involves irritating amounts of boilerplate code.

Second, just declare ALL stored procedure varchar parameters to be varchar(max), and then let the INSERT statement within the stored procedure fail.

The latter seems to work fine, so my question is: is it a good idea to use varchar(max) ALWAYS for strings in SQL Server stored procedures, if I actually want the stored proc to fail when too long a string is passed? Could it even be best practice? The silent truncation that can't be disabled seems stupid to me.


It just is.

I've never noticed a problem though because one of my checks would be to ensure my parameters match my table column lengths. In the client code too. Personally, I'd expect SQL to never see data that is too long. If I did see truncated data, it'd be bleeding obvious what caused it.

If you do feel the need for varchar(max) beware a massive performance issue because of datatype precedence. varchar(max) has higher precedence than varchar(n) (longest is highest). So in this type of query you'll get a scan not a seek and every varchar(100) value is CAST to varchar(max)

UPDATE ...WHERE varchar100column = @varcharmaxvalue

Edit:

There is an open Microsoft Connect item regarding this issue.

And it's probably worthy of inclusion in Erland Sommarkog's Strict settings (and matching Connect item).

Edit 2, after Martins comment:

DECLARE @sql VARCHAR(MAX), @nsql nVARCHAR(MAX);
SELECT @sql = 'B', @nsql = 'B'; 
SELECT 
   LEN(@sql), 
   LEN(@nsql), 
   DATALENGTH(@sql), 
   DATALENGTH(@nsql)
;

DECLARE @t table(c varchar(8000));
INSERT INTO @t values (replicate('A', 7500));

SELECT LEN(c) from @t;
SELECT 
   LEN(@sql + c), 
   LEN(@nsql + c), 
   DATALENGTH(@sql + c), 
   DATALENGTH(@nsql + c) 
FROM @t;


Thanks, as always, to StackOverflow for eliciting this kind of in-depth discussion. I have recently been scouring through my Stored Procedures to make them more robust using a standard approach to transactions and try/catch blocks. I disagree with Joe Stefanelli that "My suggestion would be to make the application side responsible", and fully agree with Jez: "Having SQL Server verify the string length would be much preferable". The whole point for me of using stored procedures is that they are written in a language native to the database and should act as a last line of defence. On the application side the difference between 255 and 256 is just a meangingless number but within the database environment, a field with a maximum size of 255 will simply not accept 256 characters. The application validation mechanisms should reflect the backend db as best they can, but maintenance is hard so I want the database to give me good feedback if the application mistakenly allows unsuitable data. That's why I'm using a database instead of a bunch of text files with CSV or JSON or whatever.

I was puzzled why one of my SPs threw the 8152 error and another silently truncated. I finally twigged: The SP which threw the 8152 error had a parameter which allowed one character more than the related table column. The table column was set to nvarchar(255) but the parameter was nvarchar(256). So, wouldn't my "mistake" address gbn's concern: "massive performance issue"? Instead of using max, perhaps we could consistently set the table column size to, say, 255 and the SP parameter to just one character longer, say 256. This solves the silent truncation problem and doesn't incur any performance penalty. Presumably there is some other disadvantage that I haven't thought of, but it seems a good compromise to me.

Update: I'm afraid this technique is not consistent. Further testing reveals that I can sometimes trigger the 8152 error and sometimes the data is silently truncated. I would be very grateful if someone could help me find a more reliable way of dealing with this.

Update 2: Please see Pyitoechito's answer on this page.


The same behavior can be seen here:

declare @testStringField [nvarchar](5)
set @testStringField = N'string which is too long'
select @testStringField

My suggestion would be to make the application side responsible for validating the input before calling the stored procedure.


Update: I'm afraid this technique is not consistent. Further testing reveals that I can sometimes trigger the 8152 error and sometimes the data is silently truncated. I would be very grateful if someone could help me find a more reliable way of dealing with this.

This is probably occurring because the 256th character in the string is white-space. VARCHARs will truncate trailing white-space on insertion and just generate a warning. So your stored procedure is silently truncating your strings to 256 characters, and your insertion is truncating the trailing white-space (with a warning). It will produce an error when said character is not white-space.

Perhaps a solution would be to make the stored procedure's VARCHAR a suitable length to catch a non-white-space character. VARCHAR(512) would probably be safe enough.


One solution would be to:

  1. Change all incoming parameters to be varchar(max)
  2. Have sp private variable of the correct datalength (simply copy and paste all in parameters and add "int" at the end
  3. Declare a table variable with the column names the same as variable names
  4. Insert into the table a row where each variable goes into the column with the same name
  5. Select from the table into internal variables

This way your modifications to the existing code are going to be very minimal like in the sample below.

This is the original code:

create procedure spTest
(
    @p1 varchar(2),
    @p2 varchar(3)
)

This is the new code:

create procedure spTest
(
    @p1 varchar(max),
    @p2 varchar(max)
)
declare @p1Int varchar(2), @p2Int varchar(3)
declare @test table (p1 varchar(2), p2 varchar(3)
insert into @test (p1,p2) varlues (@p1, @p2)
select @p1Int=p1, @p2Int=p2 from @test

Note that if the length of the incoming parameters is going to be greater than the limit instead of silently chopping off the string SQL Server will throw off an error.


You could always throw an if statement into your sp's that check the length of them, and if they're greater than the specified length throw an error. This is rather time consuming though and would be a pain to update if you update the data size.


This isn't the Answer that'll solve your problem today, but it includes a Feature Suggestion for MSSQL to consider adding, that would resolve this issue.
It is important to call this out as a shortcoming of MSSQL, so we may help them resolve it by raising awareness of it.
Here's the formal Suggestion if you'd like to vote on it:
https://feedback.azure.com/forums/908035-sql-server/suggestions/38394241-request-for-new-rule-string-truncation-error-for

I share your frustration.
The whole point of setting Character-Size on Parameters is so other Developers will instantly know
    what the Size Limits are (via Intellisense) when passing in Data.
This is like having your documentation baked right into the Sproc's Signature.

Look, I get it, Implicit-Conversion during Variable Assignments is the culprit.
Still, there is no good reason to expend this amount of energy battling scenarios
    where you are forced to work around this feature.
If you ask me, Sprocs and Functions should have the same engine-rules in place,
    for Assigning Parameters, that are used when Populating Tables. Is this really too much to ask?

All these suggestions to use Larger Character-Limits
    and then adding Validation for EACH Parameter in EVERY Sproc is ridiculous.
I know it's the only way to ensure Truncation is avoided, but really MSSQL?
I don't care if it's ANSI/ISO Standard or whatever, it's dumb!

When Values are too long - I want my code to break - every time.
It should be: Do not pass go, and fix your code.
You could have multiple truncation bugs festering for years and never catch them.
What happened to ensuring your Data-Integrity?

It's dangerous to assume your SQL Code will only ever be called after all Parameters are Validated.
I try to add the same Validation to both my Website and in the Sproc it calls,
    and I still catch Errors in my Sproc that slipped past the website. It's a great sanity-check!
What if you want to re-use your Sproc for a WebSite/WebService and also have it called from other
    Sprocs/Jobs/Deployment/Ad-Hoc Scripts (where there is no front-end to Validate Parameters)?

MSSQL Needs a "NO_TRUNC" Option to Enforce this on any Non-Max String Variable
    (even those used as Parameters for Sprocs and Functions).
It could be Connection/Session-Scoped:
    (like how the "TRANSACTION ISOLATION LEVEL READ UNCOMMITTED" Option affects all Queries)
Or focused on a Single Variable:
    (like how "NOLOCK" is a Table Hint for just 1 Table).
Or a Trace-Flag or Database Property you turn on to apply this to All Sproc/Function Parameters in the Database.

I'm not asking to upend decades of Legacy Code.
Just asking MS for the option to better manage our Databases.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜