开发者

Clustered index dilemma - ID or sort?

I have a table with two very important fields:

id INT identity(1,1) PRIMARY KEY
identifiersortcode VARCHAR(900)

My app always sorts and pages search results in the UI based on identifiersortcode, but all table joins (and they are legion) are on the id field. (Aside: yes, the sort code really is that long. There's a strong BL reason.)

Also, due to O/RM use, most SELECT statements are going to pull almost every column.

Currently, the clustered index is on id, but I'm wondering if the TOP / ORDER BY portion of most queries would make identifiersortcode a more attractiv开发者_运维技巧e option as the clustered key, even considering all of the table joins going on.

Inserts on the table and changes to the identifiersortcode are limited enough that changing my clustered index would be a problem for insert/update operations.

Trying to make the sort code's non-clustered index a covering index (using INCLUDE) is not a good option. There are a number of large columns, and some of them have a lot of update activity.


Kimberly L. Tripp's criteria for a clustered index are that it be:

  • Unique
  • Narrow
  • Static
  • Ever Increasing

Based on that, I'd stick with your integer identity id column, which satisfies all of the above. Your identifiersortcode would fail most, if not all, of those requirements.


To correctly determine which field will benefit most from the clustered index, you need to do some homework. The first thing that you should consider is the selectivity of your joins. If your execution plans filter rows from this table FIRST, then join on the other tables, then you are not really benefiting from having the clustered index on the primary key, and it makes more sense to have it on the sort key.

If however, your joins are selective on other tables (they are filtered, then an index seek is performed to select rows from this table), then you need to compare the performance of the change manually versus the status quo.


Currently, the clustered index is on id, but I'm wondering if the TOP / ORDER BY portion of most queries would make identifiersortcode a more attractive option as the clustered key, even considering all of the table joins going on.

Making identifiersortcode a CLUSTERED KEY will only help if it is used both in filtering and ordering conditions.

This means that it is chosen a leading table in all your joins and uses Clustered Index Scan or Clustered Index Range Scan access path.

Otherwise, it will only make the things worse: first, all secondary indexes will be larger in size; second, inserts in non-increasing order will result in page splits which will make them run longer and result in a larger table.


Why, for God's sake, does your identifier sort code need to be 900 characters long? If you really need 900 characters to be distinct for sorting, it should probably be broken up into multiple fields.


Appart from repeating what Chris B. said, I think you should really stick to your current PK, since - as you said - all joins are on the Id.
I guess you already have indexed the identifiersortcode.... Nevertheless, IF you have performance issues, would reaaly think twice about this @#"%$£ identifiersortcode !-)

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜