开发者

Performance implication on nvarchar(4000)?

I have a column that is declared as nvarchar(4000) and 4000 is SQL Server's limit on the length of nvarchar. it would not be used as a key to sort rows.

Are there any implication that i should be aware of before setting the length of the nvarchar field to 4000?

Update

I'm storing an XML, an XML Serialized Object to be exact. I know this isn't favorable but we will most likely never need to perform queries on it and this implementation dramatically decreases development time for certain features that we plan on extending. I expect the XML data to be 1500 characters long on average but there can be those exceptions where it can be longer than 4000. Could it be longer than 4000 characters? It could be but in 开发者_Python百科very rare occasions, if it ever happens. Is this application mission critical? Nope, not at all.


SQL Server has three types of storage: in-row, LOB and Row-Overflow, see Table and Index Organization. The in-row storage is fastest to access. LOB and Row-Overflow are similar to each other, both slightly slower than in-row.

If you have a column of NVARCHAR(4000) it will be stored in row if possible, if not it will be stored in the row-overflow storage. Having such a column does not necesarily indicate future performance problems, but it begs the question: why nvarchar(4000)? Is your data likely to be always near 4000 characters long? Can it be 4001, how will your applicaiton handle it in this case? Why not nvarchar(max)? Have you measured performance and found that nvarchar(max) is too slow for you?

My recommendation would be to either use a small nvarchar length, appropiate for the real data, or nvarchar(max) if is expected to be large. nvarchar(4000) smells like unjustified and not tested premature optimisation.

Update

For XML, use the XML data type. It has many advantages over varchar or nvarchar, like the fact that it supports XML indexes, it supports XML methods and can actually validate the XML for a compliance to a specific schema or at least for well-formed XML compliance.

XML will be stored in the LOB storage, outside the row.

Even if the data is not XML, I would still recommend LOB storage (nvarchar(max)) for something of a length of 1500. There is a cost associated with retrieving the LOB stored data, but the cost is more than compensated by macking the table narrower. The width of a table row is a primary factor of performance, because wider tables fit less rows per page, so any operation that has to scan a range of rows or the entire table needs to fetch more pages into memory, and this shows up in the query cost (is actualy the driving factor of the overall cost). A LOB stored column only expands the size of the row with the width of a 'page id', which is 8 bytes if I remember correctly, so you can get much better density of rows per page, hence faster queries.


Are you sure that you'll actually need as many as 4000 characters? If 1000 is the practical upper limit, why not set it to that? Conversely, if you're likely to get more than 4000 bytes, you'll want to look at nvarchar(max).

I like to "encourage" users not use storage space too freely. The more space required to store a given row, the less space you can store per page, which potentially results in more disk I/O when the table is read or written to. Even though only as many bytes of data are stored as are necessary (i.e. not the full 4000 per row), whenever you get a bit more than 2000 characters of nvarchar data, you'll only have one row per page, and performance can really suffer.

This of course assumes you need to store unicode (double-byte) data, and that you only have one such column per row. If you don't, drop down to varchar.


Do you certainly need nvarchar or can you go with varchar? The limitation applies mainly to sql server 2k. Are you using 2k5 / 2k8 ?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜