MySQL database performance tuning, schema optimisation
I have a MySQL database with a pretty simple schema. There are parent
, child
and access
tables.
parent
stores 51 fields that are all varchar
(ranging in length from 16 to 512) except for for 4 longtext
fields and the primary key which is a bigint
. Besides the primary key, there are indexes on 3 other fields. One so that the child
table can refer to it as a foreign key.
child
stores 23 fields that are mostly varchar
with some text
fields. A varchar(256)
field is used as the foreign key to link it to parent. The actual contents of the foreign key field are all expected to be shorter than 60 characters, though.
accesss
has a bigint
field and a varchar
field that together make up the primary key and the bigint
field is the foreign key that links it to parent
.
The access
table is used to specify which users have access to which records from parent
. There may be multiple users who should have access to any record.
There are around 2e6 rows in parent
(and therefore and around 2e7 rows in access
)child
. EDIT: Sorry, access
has 5329840 rows. i.e. there are one or more rows in access
for every row in parent
.
The above schema is based on an old FileMaker database that we're looking to migrate to MySQL. I am sure it is not ideal, but I don't know exactly why.
Queries are fast or pretty slow depending on the parameters. So, e.g. the following query will take a second or two if there are several records that bob
has access to. The query will take several minutes (e.g. 12 minutes), though, if there are no records that user bob
has access to (e.g. if there is no user called bob
):
SELECT
p."RecordID", p."SerialNumber", p."Folder", p."NoteType",
p."FirstName", p."LastName", p."DOB", p."Body", p."From",
p."DateTxt", a."UserName" AS Access
FROM parent AS p
INNER JOIN access AS a ON a."RecordID" = p."RecordID"
WHERE p."RecordID" > 123
AND a."UserName" = 'bob'
ORDER BY p."RecordID"
LIMIT 500;
Here's what EXPLAIN
says about the query:
+----+-------------+-------+--------+---------------+---------+---------+---------------------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+--------+--------------------------+
| 1 | SIMPLE | p | range | PRIMARY | PRIMARY | 8 | NULL | 731752 | Using where |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 74 | db.p.RecordID,const | 1 | Using where; Using index |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+--------+--------------------------+
2 rows in set (0.01 sec)
So, is there some way to speed up the non-matching queries? Could this be caused by using varchar/text fields for everything? Would using a 开发者_运维百科varchar(256) field as a foreign key cause problems (although it's not used in the above query)? Or is the query to blame?
EDIT: I've just realised that the PRIMARY KEY ("RecordID", "UserName")
on the access
table is not being used for SELECT ... FROM access WHERE UserName = 'blah'
. I've created an index on the UserName column and that appears to have fixed the problem. I'd still appreciate it if anyone has advice.
Current output of EXPLAIN
:
+----+-------------+-------+--------+---------------+---------+---------+---------------------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+--------+--------------------------+
| 1 | SIMPLE | p | range | PRIMARY | PRIMARY | 8 | NULL | 605020 | Using where |
| 1 | SIMPLE | a | eq_ref | PRIMARY,UNidx | PRIMARY | 74 | db.p.RecordID,const | 1 | Using where; Using index |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+--------+--------------------------+
2 rows in set (0.00 sec)
EDIT: @DRapp's suggestion does make a huge difference. The query is fast with or without the index on access.UserName. If I drop the index and try DRapp's query without the STRAIGHT_JOIN
, then it is again slow.
DRapp's query without index on access.UserName:
mysql> explain SELECT STRAIGHT_JOIN p."RecordID", p."SerialNumber", p."Folder", p."NoteType", p."FirstName", p."LastName", p."DOB", p."Body", p."From", p."DateTxt", a."UserName" AS Access FROM access as a, parent AS p where a."UserName" = 'bob' and a."RecordID" > 123 and a."RecordID" = p."RecordID" order by a."RecordID" limit 500;
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+--------------------------+
| 1 | SIMPLE | a | range | PRIMARY | PRIMARY | 8 | NULL | 2382668 | Using where; Using index |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 8 | bb.a.RecordID | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+--------------------------+
2 rows in set (0.00 sec)
Same query with the index on access.UserName
:
mysql> explain SELECT STRAIGHT_JOIN ...;
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+--------------------------+
| 1 | SIMPLE | a | ref | PRIMARY,UNidx | UNidx | 66 | const | 1209780 | Using where; Using index |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 8 | db.a.RecordID | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+--------------------------+
2 rows in set (0.00 sec)
Same query without the index on access.UserName
and without the STRAIGHT_JOIN
:
mysql> explain SELECT p."RecordID", p."SerialNumber", p."Folder", p."NoteType", p."FirstName", p."LastName", p."DOB", p."Body", p."From", p."DateTxt", a."UserName" AS Access FROM access as a, parent AS p where a."UserName" = 'zzz' and a."RecordID" > 123 and a."RecordID" = p."RecordID" order by a."RecordID" limit 500;
+----+-------------+-------+--------+---------------+---------+---------+---------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+--------+----------------------------------------------+
| 1 | SIMPLE | p | range | PRIMARY | PRIMARY | 8 | NULL | 484016 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 74 | db.p.RecordID,const | 1 | Using where; Using index |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+--------+----------------------------------------------+
2 rows in set (0.00 sec)
I've always had good luck using the "STRAIGHT_JOIN" clause in MySQL, and putting the primary table basis as first in the list of qualifiers. In this case, your "Access" table looking for Bob, THEN looking at the records Bob has access to see. If it fails at the ACCESS query, no need to look deeper. Additionally, since the join is based on same "RecordID", I've changed references to the "a." table. Since this query is based on username first, I would have a key on it too. I'd be interested in its Explain, and performance too.
SELECT STRAIGHT_JOIN
p."RecordID",
p."SerialNumber",
p."Folder",
p."NoteType",
p."FirstName",
p."LastName",
p."DOB",
p."Body",
p."From",
p."DateTxt",
a."UserName" AS Access
FROM
access as a,
parent AS p
where
a."UserName" = 'bob'
and a."RecordID" > 123
and a."RecordID" = p."RecordID"
order by
a."RecordID"
limit
500
精彩评论