How to properly index a many-many association table?

In a typical many-many arrangement like this...

Movies       Actors       Movies_Actors
------       ------       -------------
movie_ID     actor_ID     FK_movie_ID
title        name         FK_actor_ID

... how should the association table ('Movies_Actors') be indexed for optimal read speed?

I usually see this done only with the composite primary key in the association table, like so:

CREATE TABLE Movies_Actors (
  FK_movie_ID INTEGER,
  FK_actor_ID INTEGER,
  PRIMARY KEY (FK_movie_ID, FK_actor_ID)
)

However, it seems like this index will only be useful when searching for both movie_ID and actor_ID (although I'm not certain on whether a composite index also works for the individual columns).

Since both "what actors are in Movie X" and "what movies has actor Y been in" will be the common queries for this table, it seems like there should be an individual index on each column to quickly locate actors and movies on their own. Does a composite index effectively do this? If not, having a composite index seems pointless on this table. And if a composite index is pointless, what to do about a primary key? The candidate key is clearly the composite of the two columns, but if the resulting composite index is pointless (it mustn't be?) it seems like a waste.

Also, this link adds some confusion and indicates that it might even be useful to actually specify two composite indices... one of them as (FK_movie_ID, FK_actor_ID), and the other in reverse as (FK_actor_ID, FK_movie_ID), with the choice of which is the primary key (and thus usually clustered) and which is 'just' a unique composite index being based on which direction is queried more.

What is the real story? Does a composite index automatically effectively index each column for searching on one or the other? Should the optimal (in read speed, not size) association table have a composite index in each direction and one on each column? What are the behind-the-scene mechanics?


EDIT: I found this related question that for some reason I didn't locate before posting... How to properly index a linking table for many-to-many connection in MySQL?


(although I'm not certain on whether a composite index also works for the individual columns).

Yes, it can, but only for a leading prefix of the indexed columns: http://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys

Also, this link adds some confusion and indicates that it might even be useful to actually specify two composite indices... one of them as (FK_movie_ID, FK_actor_ID), and the other in reverse as (FK_actor_ID, FK_movie_ID),

That's actually the thing to do.

Make one of them the clustered index and the other a nonclustered index. The nonclustered index includes the clustering key anyway, so there is no need to add that column to it again (thx to JNK).

CREATE CLUSTERED INDEX a on Movies_Actors (fk_movie_id, fk_actor_id);
CREATE NONCLUSTERED INDEX b on Movies_Actors (fk_actor_id);
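
The same layout can be tried outside SQL Server. Below is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in: a WITHOUT ROWID table stores its rows in primary-key order, which is roughly what a clustered index does, and a secondary index on such a table carries the primary-key columns as its row locator. The index name idx_actor and the plan() helper are made up for illustration, and the plan text is SQLite's, not what SQL Server would print.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail strings SQLite reports for the query plan of `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # 4th column is the human-readable detail

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Rows are stored in (FK_movie_ID, FK_actor_ID) order: the 'clustered' side.
    CREATE TABLE Movies_Actors (
        FK_movie_ID INTEGER NOT NULL,
        FK_actor_ID INTEGER NOT NULL,
        PRIMARY KEY (FK_movie_ID, FK_actor_ID)
    ) WITHOUT ROWID;
    -- Secondary index for the reverse direction (hypothetical name).
    CREATE INDEX idx_actor ON Movies_Actors (FK_actor_ID);
""")

# "What actors are in movie 1?" and "What movies has actor 1 been in?"
actors_in_movie = plan(conn, "SELECT FK_actor_ID FROM Movies_Actors WHERE FK_movie_ID = 1")
movies_of_actor = plan(conn, "SELECT FK_movie_ID FROM Movies_Actors WHERE FK_actor_ID = 1")

print(actors_in_movie)
print(movies_of_actor)
```

Both lookups should be reported as a SEARCH rather than a SCAN, meaning each direction is answered by an index seek instead of reading the whole table.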

What is the real story?

http://Use-The-Index-Luke.com/ :)

Does a composite index automatically effectively index each column for searching on one or the other?

No. Only the prefix of the index. If you have an index (a,b,c), the query a=? and b=? can use the index. However c=? can't, nor can b=? and c=?.
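
The prefix rule can be checked on a toy table. This is a sketch under assumptions: it uses SQLite via Python's sqlite3 module rather than SQL Server, the table t, its extra column d, and the index name idx_abc are invented for the example, and the table is empty with no statistics gathered. Other engines, or the same engine with statistics, may pick different plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d TEXT);
    CREATE INDEX idx_abc ON t (a, b, c);
""")

def first_step(where):
    """Return the first plan step SQLite reports for a lookup with this WHERE clause."""
    sql = "EXPLAIN QUERY PLAN SELECT d FROM t WHERE " + where
    return conn.execute(sql).fetchone()[3]  # 4th column is the detail text

# Leading-prefix predicates versus predicates that skip column a.
for where in ("a = 1", "a = 1 AND b = 2", "c = 3", "b = 2 AND c = 3"):
    print(where, "->", first_step(where))
```

The first two predicates should show up as a SEARCH using idx_abc, while the last two fall back to a SCAN because they do not start at the leftmost indexed column.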

Should the optimal (in read speed, not size) association table have a composite index in each direction and one on each column?

If you need to join in both directions, yes ("composite index in each direction") and no ("one on each column").

What are the behind-the-scene mechanics?

Well, same link again.

Speaking of SQL Server, you might eventually also consider an indexed view, which is a kind of pre-joining. Two indexes, as above, might also be fast enough.


In SQL Server, a composite index can be used for a single-column search only on its first column. That means you should have an additional single-column index on FK_actor_ID if there will be searches on that column without FK_movie_ID in the same query.
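
To see the effect of the extra index, here is a rough before/after comparison. It assumes SQLite through Python's sqlite3 module as a stand-in for SQL Server, and the function actor_lookup_plan and the index name idx_actor_only are hypothetical names chosen for this sketch.

```python
import sqlite3

def actor_lookup_plan(with_actor_index):
    """Plan step for 'movies of actor 42', with or without an index on FK_actor_ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE Movies_Actors (
            FK_movie_ID INTEGER,
            FK_actor_ID INTEGER,
            PRIMARY KEY (FK_movie_ID, FK_actor_ID)
        )
    """)
    if with_actor_index:
        # The additional single-column index suggested above (hypothetical name).
        conn.execute("CREATE INDEX idx_actor_only ON Movies_Actors (FK_actor_ID)")
    row = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT FK_movie_ID FROM Movies_Actors WHERE FK_actor_ID = 42"
    ).fetchone()
    conn.close()
    return row[3]  # 4th column is the detail text

print("composite PK only:", actor_lookup_plan(False))
print("plus actor index: ", actor_lookup_plan(True))
```

With only the composite primary key, the lookup by actor should be a SCAN; with the additional index it should become a SEARCH on idx_actor_only.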
