What is MySQL primary (key1, key2)
I am working on existing DB and try to optimize it. I see a table without a single primary key but with two foreign keys work as a primary key. I know it will work. however, is it better to have one single primary key with two foreign keys for better performance or primary (key1, key2) will just work as good as one?
For example:
CREATE TABLE ABC (
'xid' int(11),
'yid' int (11),
PRIMAY KEY (xid, yid)
)
does it perform the same (in terms of indexing) as:
CREATE TABLE ABC (
'id' int(11),
'yid' int (11),
'xid', int (11),
PRIMARY KEY (id),
KEY (xid, yid)
)
some updates there
so, I do some testing.. using simple queries on three different tables
Table myA, with 10,000+ records. only has userid as the primary index.
SELECT * FROM myA where userid=12345
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE myA const PRIMARY PRIMARY 4 const 1
Table myB, is a many-to-many table, with primary id, and userid as one of two foreign keys. with over 50,000 records
SELECT * FROM myB where userid=12345
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE myB ref userid userid 4 const 53
Table myC, is also a many-to-many table, but with composite primary key, userid is one of the two. with over 100,000 records
SELECT * FROM myC where userid=12345
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE myC ALL NULL NULL NULL NULL 101289 Using where
So, table C actually examine all 100,000 records!!! (the query only returns 50 records)
There are something that i don't u开发者_如何学Gonderstand... it seems to me composite primary isn't doing the job.
MORE..
While I'm doing more testing and "Explain", I conclude that (in MySQL), even you set composite keys as primary key. You still have to explicitly set index for all keys. Then you will enjoy indexing.
In the second example you show, the xid
, yid
columns are indexed but nothing prevents your application from entering the same pair of xid
, yid
on multiple rows of the ABC
table:
INSERT INTO ABC (xid, yid) VALUES (123, 456), (123, 456); -- NO ERROR
You can get unintentional duplicates this way, and it can cause strange effects when you do joins and counts. Also if you need to update a row to change the association between a given xid
and its yid
, you could update one row and not the other(s).
You should at least declare the key over (xid, yid)
as a UNIQUE KEY
to prevent duplicates.
The first example you show uses a compound primary key (some people say composite primary key). SQL supports multi-column indexes, and multi-column constraints. There's no downside to doing this, except that if you want to run a query to pick one row out, you need to use two columns instead of one in the condition that identifies the row.
DELETE FROM ABC WHERE xid = 123 AND yid = 456;
Likewise if another table contains a foreign key to reference the ABC
table, it would have to have both columns.
Enough programmers find using two columns to be so burdensome and confusing that they'd rather add a single-column surrogate key.
Insisting on a superfluous surrogate key when none is needed is something I consider to be an SQL Antipattern.
Re your updated question above: Are you aware that a compound index only helps when your search includes the left-most columns in the index? This is true of any composite index in any brand of RDBMS. Example:
CREATE TABLE myC (
somethingid INT,
userid INT,
PRIMARY KEY (somethingid, userid)
);
SELECT * FROM myC WHERE userid = 12345;
This query cannot use the primary key index.
The classic example to explain compound index usage is the telephone book analogy: If I ask you to search for everyone whose last name is "Thomas" you can use the fact that the phone book is ordered by last name to help make your search quick. But if I ask you to search for everyone whose first name is "Thomas," you have to search every page. The phone book is like a compound index on (last_name
, first_name
). So if your search doesn't include last_name
, you must resort to a brute-force search.
You can also create an extra index just for the other column, so you can do a search using that as the criterion. You don't need an extra single-column index for the first column. The compound index is adequate.
CREATE TABLE myC (
somethingid INT,
userid INT,
PRIMARY KEY (somethingid, userid),
KEY (userid)
);
Usually if that column is declared as a foreign key, the RDBMS should create an index automatically. However, in some versions of some RDBMS products, you have to create an index on a foreign key column yourself, as a separate action.
Adding the id
column is only worth doing if you are exposing the contents to the user. Either way, it's a typical many-to-many join table.
To change the primary key to a single column means adding a unique key constraint onto the two foreign key columns - there's no need because you get the uniqueness from defining the columns as the primary key, even if it is a composite key. A composite key is a combination of 2+ columns.
If you aren't selecting by that id
column, there's no need for it.
What you're describing is a composite primary key, which is acceptable and a sensible pattern. A lot of applications use an artificial primary key (usually an int or guid) when there is a perfectly acceptable candidate key already. This adds work for the database, but in some cases makes the application easier to write.
There is no real reason to add an extra primary key when there is already a candidate. I'd lean towards not doing so.
You also gain performance in some cases by not adding an unnecessary auto-generated primary key column, and it reduces the space usage of your database (albeit only slightly)
There's one other cool thing about primary keys which is that the data is actually ordered by the primary key on the disk. So there is a slight difference between even a unique index and a primary key when it comes to performance. It's probably not going to be much of a performance difference though, just how much time it takes to actually get the data off of the disk.
精彩评论