
SQL Primary key sort order

I am using MS SQL Server 2005.

I have a table with 3 columns where I store the user-message mapping, like:

msg_for msg_from msg_id 
bob     bob      1 
bob     john     1 
bob     steve    1 
bob     bob      2 
bob     john     2 
bob     bob      3 
bob     john     3 
bob     steve    3

The PK is on all 3 columns, and msg_id is an FK to the messages table that stores the messages.

The above is the physical storage order I see, which follows the PK on the 3 columns.

Now my query MUST return the messages for a given user with the latest message at the top (ORDER BY msg_id DESC):

bob john  3
bob steve 3
bob john  2
bob steve 2
bob john  1
bob steve 1

This mapping table has millions of rows. I see 95% of the cost is to SORT the result.

Is it possible to have the PK, or some other way, store the data physically like this (to avoid the SORT)?

msg_for msg_from msg_id
bob     bob      3
bob     john     3
bob     steve    3
bob     bob      2
bob     john     2
bob     bob      1
bob     john     1
bob     steve    1

Thanks


Yes.

When you set up the primary key (or any index), you can define the sort direction of each key column:

ALTER TABLE dbo.[Messages] ADD CONSTRAINT [PK_Messages] PRIMARY KEY CLUSTERED 
(
    msg_for ASC, msg_from ASC, msg_id DESC
)

SQL Server can scan an index in either direction, so specifying ASC or DESC only makes sense if you want to control the combination of sort orders across multiple columns.
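
As a rough illustration of when the per-column direction matters, here is a minimal sketch. It assumes the mapping table is dbo.message_user (the table name appears in the query quoted further down; the dbo schema and the index name are assumptions), and the optimizer is of course free to pick a different plan.

```sql
-- Hypothetical index with mixed directions; the name is illustrative only.
CREATE NONCLUSTERED INDEX IX_message_user_mixed
    ON dbo.message_user (msg_for ASC, msg_id DESC);

-- Same order as the index keys: can be read with a forward scan, no sort.
SELECT msg_for, msg_id
FROM dbo.message_user
ORDER BY msg_for ASC, msg_id DESC;

-- Every column reversed: can be read with a backward scan, no sort.
SELECT msg_for, msg_id
FROM dbo.message_user
ORDER BY msg_for DESC, msg_id ASC;

-- Matches the index in neither direction, so a sort is expected here.
-- An index declared (msg_for ASC, msg_id ASC) would cover this one instead.
SELECT msg_for, msg_id
FROM dbo.message_user
ORDER BY msg_for ASC, msg_id ASC;
```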

Edit: You say in the comments that the problem query is

select top 10 msg_id 
from message_user 
where msg_for = @user_name 
and msg_from <> @user_name 
order by msg_id DESC

The issue here isn't one of ascending versus descending.

To give an analogy: phone books are listed in surname, forename order, but if you needed to find the lexicographically last 10 forenames in the directory you would have to scan the whole book. This would be unavoidable regardless of whether forenames were listed in ascending or descending order within each surname section.

Similarly, the composite index keys would need to be msg_for, msg_id, msg_from to satisfy this query optimally, not msg_for, msg_from, msg_id. With the latter order, SQL Server still needs to scan the whole section of the index that satisfies msg_for = @user_name, because it cannot know whether a later msg_id is still to come under a later msg_from. Additionally, regardless of the direction in which msg_id is sorted within each msg_from subsection, a sequential scan of the msg_for = @user_name part of the index still requires a sort, because the msg_id values are fragmented across those subsections.
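
The key order described above could be sketched as follows. This is only an illustration under assumptions: the dbo schema, the index name, and the varchar(50) type of @user_name are not given in the question.

```sql
-- Hypothetical index: equality column first, then the ORDER BY column,
-- then msg_from so the index covers the query.
CREATE NONCLUSTERED INDEX IX_message_user_for_id_from
    ON dbo.message_user (msg_for ASC, msg_id DESC, msg_from ASC);

-- SQL Server 2005 has no DECLARE ... = value syntax, hence the separate SET.
DECLARE @user_name varchar(50);   -- type and length are assumed
SET @user_name = 'bob';

-- The seek on msg_for lands on rows already ordered by msg_id DESC.
-- msg_from <> @user_name is checked row by row, and the scan can stop
-- as soon as 10 qualifying rows have been found.
SELECT TOP 10 msg_id
FROM dbo.message_user
WHERE msg_for = @user_name
  AND msg_from <> @user_name
ORDER BY msg_id DESC;
```

Comparing the actual execution plan before and after creating the index should show whether the Sort operator disappears for your data.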


The only way to guarantee order in the result set is by using an ORDER BY.

In SQL Server, a clustered index can help... assuming the optimizer sees the index as being useful.


Well, no wonder sorting takes forever. String types such as varchar are usually much more expensive to sort than integers, whether in SQL or in any programming language. Whenever possible, use integral types for such things.

I suggest you use integral values to identify members. Have a Members table (MemberId INT, MemberName VARCHAR, etc), then a Messages table (MessageId INT, MessageBody VARCHAR, etc) and then have a join table, say Correspondence with (SenderMemberId INT, RecipientMemberId INT, MessageId INT). Sorting on integral values will be way faster this way.
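
A minimal sketch of that structure might look like the following. The column lengths, IDENTITY settings, constraint name, and primary key column order are all assumptions made for illustration; the key order simply mirrors the original msg_for, msg_from, msg_id layout.

```sql
CREATE TABLE dbo.Members (
    MemberId   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    MemberName VARCHAR(50) NOT NULL UNIQUE      -- length is assumed
);

CREATE TABLE dbo.Messages (
    MessageId   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    MessageBody VARCHAR(MAX) NOT NULL
);

-- Join table: three integer keys instead of two string columns and an id.
CREATE TABLE dbo.Correspondence (
    SenderMemberId    INT NOT NULL REFERENCES dbo.Members (MemberId),
    RecipientMemberId INT NOT NULL REFERENCES dbo.Members (MemberId),
    MessageId         INT NOT NULL REFERENCES dbo.Messages (MessageId),
    CONSTRAINT PK_Correspondence
        PRIMARY KEY CLUSTERED (RecipientMemberId, SenderMemberId, MessageId)
);
```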

I think you can easily refactor your data to fit such a structure.


Depending on your DBMS, you can use a clustered index to achieve that.
