How to optimise join between two MySQL tables?
I have two tables: p_group.full_data, which is a large dataset I'm working on (100k rows, 200 columns) and p_group.full_data_aggregated, which I've produced to summarise a load of other tables.
Now, what I'd like to do is perform a join between full_data and full_data_aggregated to select out certain rows, averages, and so on. The query I have is as follows:
SELECT 'name', p.group_id, a.group_condition, p.event_index, AVG(p.value) FROM p_group.full_data p
JOIN p_group.full_data_aggregated as a on p.group_id = a.group_id AND p.event_index = a.event_index
WHERE (a.group_condition='open')
GROUP BY p.group_id, p.event_index
I have an index on: full_data.group_id, full_data.event_index and full_data_aggregated.group_id, full_data_aggregated.event_index, full_data_aggregated.group_condition.
Now, the problem is that this query simply won't finish: previously, I had my full_data split up into different tables (one for each group_id), and that worked fine. But now that I have joined the groups together, the query sits there running and so I can only assume I have done something stupid.
Is there anything else I can try to actually get this query to run at a decent speed? I'm sure I've messed up something with indices and the group by function, but I can't work out what. I've tried all sorts of variations of the above query. EXPLAIN indicates that it is "using where; using temporary; using filesort" but I'm not sure how to fix this.
Thanks!
I assume that your indexes are composite (multi-column) indexes, with group_id and event_index together in a single index. If you have a separate index on each field instead, MySQL will typically use only one of them per table, so the database engine has to go through significantly more data.
For example, if you only have a few unique group_id values but lots of event_index values, and you have two indexes, one on group_id only and the other on event_index only, then your query is going to run through a large number of rows for each group_id. If you instead have one index covering both fields, in that order, the query will run much faster.
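As a rough sketch of what that could look like for the tables in the question: the index names below are made up for illustration, and the column order in the second index is an assumption, so check the result with EXPLAIN on your own data rather than taking it as given.

```sql
-- See which indexes exist today and which columns each one covers.
SHOW INDEX FROM p_group.full_data;
SHOW INDEX FROM p_group.full_data_aggregated;

-- One composite index over both join columns of the large table.
-- (idx_fd_group_event is a hypothetical name.)
CREATE INDEX idx_fd_group_event
    ON p_group.full_data (group_id, event_index);

-- Composite index on the aggregated table covering the join columns
-- plus the filtered column. The column order is an assumption; if
-- group_condition is a TEXT column, MySQL needs a prefix length,
-- e.g. group_condition(16).
-- (idx_fda_group_event_cond is a hypothetical name.)
CREATE INDEX idx_fda_group_event_cond
    ON p_group.full_data_aggregated (group_id, event_index, group_condition);

-- Re-check the plan for the original query.
EXPLAIN
SELECT 'name', p.group_id, a.group_condition, p.event_index, AVG(p.value)
FROM p_group.full_data p
JOIN p_group.full_data_aggregated AS a
  ON p.group_id = a.group_id AND p.event_index = a.event_index
WHERE a.group_condition = 'open'
GROUP BY p.group_id, p.event_index;
```

In the EXPLAIN output, the key column shows which index MySQL actually picked for each table, and the rows column gives its estimate of how many rows it expects to examine.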