Need basic advice regarding mysql indexes and query performance

2023-01-23 23:19

I'm a relative newbie when it comes to database design. I know how to define the tables, but I'm starting to run into performance issues as my database grows. Can someone give me some guidance on how to use MySQL indexes to improve query performance? My understanding of indexes is that they prevent the entire table from being searched row by row, while still returning valid results.

I have typically just defined a primary key for each table that gives a unique id for each row. Is the idea behind defining an "index" that you do it for each field that will be used in the WHERE clause of a MySQL statement? Not sure if that is too general a statement.

As an example, let's say we have the following three tables:

products: products_id, products_name

categories: categories_id, categories_name

products_to_categories: products_id, categories_id

and we want to run the following query:

SELECT p.products_name, c.categories_name 
  FROM categories as c 
  JOIN products as p 
  JOIN products_to_categories as p2c 
 WHERE p.products_id=12345

Would we define both fields in the products_to_categories table as indexes and then the products_id and categories_id as primary keys within their parent table?

Any advice or guidance on the general approach to indexes would be much appreciated!

"Would we define both fields in the products_to_categories table as indexes and then the products_id and categories_id as primary keys within their parent table?"

Yes, that's exactly what I would do.

Also, if any column holds unique data, such as the product name, you can put a unique key on it. The database engine can then stop as soon as it finds the matching row instead of searching the entire table.
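
As a minimal sketch of the unique-key suggestion, assuming product names really are unique in your data and that products_name is a short string column. The index name uq_products_name and the value 'Widget' are made up for illustration:

```sql
-- Assumption: no two rows share the same products_name.
-- The statement fails if duplicates already exist in the table.
ALTER TABLE products
  ADD UNIQUE KEY uq_products_name (products_name);

-- A lookup by name can now be answered from the unique index.
SELECT products_id
  FROM products
 WHERE products_name = 'Widget';
```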

Here's your query optimized:

SELECT p.products_name, c.categories_name 
  FROM categories as c
  INNER JOIN products_to_categories as p2c ON p2c.categories_id = c.categories_id
  INNER JOIN products as p ON p.products_id = p2c.products_id
 WHERE p.products_id=12345

The best way to think about indexes is in terms of how you expect to query the data.

Let's assume that products_id and categories_id are PRIMARY KEY in your database, which means they are indexed automatically. If not, start with that.

For a many-to-many join table, if you want to be paranoid, create two composite indexes so the IDs can be looked up in either direction, e.g.

CREATE TABLE products_to_categories (
    products_id integer unsigned NOT NULL,
    categories_id integer unsigned NOT NULL,
    INDEX p_to_c (products_id,categories_id),
    INDEX c_to_p (categories_id,products_id)
) ENGINE=MyISAM;

This takes a lot of space, but it will be really, really fast. Unless you query in both directions (from products to categories, and the reverse), it's probably overkill. Alternatively, by default, I do:

CREATE TABLE products_to_categories (
    products_id integer unsigned NOT NULL,
    categories_id integer unsigned NOT NULL,
    INDEX p (products_id),
    INDEX c (categories_id)
) ENGINE=MyISAM;

If you need some sort of constraint (many-to-one, one-to-many), change the relevant index type to UNIQUE.
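
For instance, if each product may belong to only one category, the change could look roughly like this. It is a sketch that assumes the single-column index p from the definition above already exists; the name p_unique is illustrative:

```sql
-- Assumption: each product belongs to at most one category,
-- so products_id must not repeat in the join table.
ALTER TABLE products_to_categories
  DROP INDEX p,
  ADD UNIQUE INDEX p_unique (products_id);
```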

In general, start with the latter definition, write your query, and run EXPLAIN on it. If the rows column shows more than 1 for any table other than the first, rework the indexes.
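
A sketch of that check against the example tables from the question. The join order and ON clauses here are one reasonable way to write the query, not necessarily the one you will end up with:

```sql
EXPLAIN
SELECT p.products_name, c.categories_name
  FROM products AS p
  INNER JOIN products_to_categories AS p2c ON p2c.products_id = p.products_id
  INNER JOIN categories AS c ON c.categories_id = p2c.categories_id
 WHERE p.products_id = 12345;

-- Columns worth reading in the output, one row per table:
--   key  : the index MySQL chose for that table (NULL means none was used)
--   rows : the estimated number of rows examined for that table
--   type : ALL means the whole table is scanned
```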

Database indexing is really more a matter of testing and diagnostics than many think. I didn't know how to do this for a while, until I actually had a problem. In short:

1. Create your indexes
2. Determine your queries
3. Run EXPLAIN on your queries, and run timing tests to determine query speed!
4. Adjust your indexes
5. Go back to 3

As one user commented below, EXPLAIN is a good starting point before running timing tests, but nothing beats actual timing tests in the wild.

Yes, define both fields in the products_to_categories table as indexes and then the products_id and categories_id as primary keys within their parent tables. And yes, the idea behind defining an index is that you do it for each field that will be used in the WHERE clause of a MySQL statement.

Indexes are automatically generated for fields with a unique constraint, including primary keys. But indexes should also exist on every field used in a join, in both tables. Generally, run EXPLAIN <SQL> to find out exactly which indexes are being used. Watch out for full table scans of joined tables when you're looking for specific values.
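
To see which indexes a table already has before adding more, something like the following may help. The index name idx_p2c_categories is hypothetical:

```sql
-- Lists every index on the table, including those created implicitly
-- by PRIMARY KEY or UNIQUE constraints.
SHOW INDEX FROM products_to_categories;

-- If a join column turns out to have no index, add one.
CREATE INDEX idx_p2c_categories ON products_to_categories (categories_id);
```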

Strange how most answers so far don't include a primary key on the products_to_categories table, so it would be possible to insert duplicates, e.g.

insert into products_to_categories (products_id, categories_id) values (1,1),(1,2),(1,1); -- oops
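
One way to rule that out is a composite primary key on the join table. This is a sketch using the column names from the question; the column order in the key is an assumption and should follow the direction you query most:

```sql
CREATE TABLE products_to_categories (
    products_id integer unsigned NOT NULL,
    categories_id integer unsigned NOT NULL,
    -- Each (product, category) pair may appear only once.
    PRIMARY KEY (products_id, categories_id)
);

-- The third row repeats the pair (1,1), so it triggers a duplicate-key error.
INSERT INTO products_to_categories (products_id, categories_id)
VALUES (1,1),(1,2),(1,1);
```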

You might want to check the following answer of mine which is optimised for selecting products by category without the need of secondary indexes.

Rewriting mysql select to reduce time and writing tmp to disk
