MySQL: Optimizing SELECT from 6 huge identical tables with different data split up by timestamp

Posted 2023-01-14 02:35

I have the same problem as the one described in "MySQL - Selecting data from multiple tables all with same structure but different data".

I have to select data from many MySQL tables with identical structure, but different data (split up into table_0, table_1, table_2 etc to table_5 to distribute millions of records of data).

The hardware generating the data records for each device moves from table to table according to timestamp field, which is NOT unique. e.g. 50 records in table_0 may have the same timestamp. When the data gets to the end of table_5, it goes back to table_0 to start overwriting the data there. I need to get the data on each device within a time range.

Each table's data columns (for table_0, table_1... up to table_5):

timestamp, robotGroupID, robotID, sensor1, sensor2, sensor3, ... (many of them)

However, the tables are HUGE and the UNION ALL (I read it's faster than UNION DISTINCT) takes forever to execute, even with just two tables, let alone 6. I will illustrate for two tables below.

MySQL statement in PHP: (illustrated for just sensor 1, sensor 2 and sensor 3)

(SELECT sensor1, sensor2, sensor3 FROM table_0 WHERE robotID=".$robotID." AND timestamp BETWEEN ".$timeStampStart." AND ".$timeStampStop)

UNION ALL

(SELECT sensor1, sensor2, sensor3 FROM table_1 WHERE robotID=".$robotID." AND timestamp BETWEEN ".$timeStampStart." AND ".$timeStampStop)

N.B. It is the exact same query except for the table name. Sensor data for a robot within a time range may span none, one, or more of the tables at once.

I cannot use LIMIT because the number of reports from robots within each time range cannot be known ahead of time. I can't use the MERGE storage engine because I only have read-only access to the company's database.

I have an idea to use COUNT(robotID) or similar on each table to check before running the queries, but I'm not sure how to go about this because I'm quite a novice.

How do you think I can make this work faster for 6 tables, given that there are many more columns than illustrated? Thanks in advance!

Are the fields RobotID and Timestamp indexed?

I would add a multi-field index of ( RobotId, timestamp ) at the very least.

You say you have read only access to the tables, so can you request this index to be added? I'm sure it will help in both your original and updated queries posted.
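
As a rough sketch of what such a request might look like: the index name idx_robot_time and the literal robot ID and timestamps below are made up for illustration, and only the column and table names come from the question. Someone with write access would have to repeat the statement for table_1 through table_5.

```sql
-- Hypothetical index name; equality column (robotID) first, range column (timestamp) second.
CREATE INDEX idx_robot_time ON table_0 (robotID, `timestamp`);

-- Check whether the optimizer actually uses the index for one branch of the UNION ALL.
-- The values 42, 1672531200 and 1672617600 are placeholders, not real data.
EXPLAIN
SELECT sensor1, sensor2, sensor3
FROM table_0
WHERE robotID = 42
  AND `timestamp` BETWEEN 1672531200 AND 1672617600;
```

If the index is picked up, the key column of the EXPLAIN output should show idx_robot_time instead of NULL.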

I must confess I'm still a novice PHP/MySQL coder, but with many ideas, so my code is probably "dirty".

I solved the problem this way in order to move forward, but better solutions are welcome. As for any strange syntax, I am using a database class built upon PHP PDO because I am using many different RDBMS types on this project.

For the $myQuery_start variable, I added the names of the other columns as well as sensors 1 to 3.

$myQuery_start = "(SELECT sensor1, sensor2, sensor3 FROM ";
$myQueryCount_start = "(SELECT COUNT(*) FROM ";
$myQuery_stop = " WHERE robotID=".$robotID." AND timestamp BETWEEN ".$timeStampStart." AND ".$timeStampStop.")";
$count_0 = DB::getDB("mysql", $myDB)->query($myQueryCount_start."table_0".$myQuery_stop)->fetchColumn();
$count_1 = DB::getDB("mysql", $myDB)->query($myQueryCount_start."table_1".$myQuery_stop)->fetchColumn();
$count_2 = DB::getDB("mysql", $myDB)->query($myQueryCount_start."table_2".$myQuery_stop)->fetchColumn();
$count_3 = DB::getDB("mysql", $myDB)->query($myQueryCount_start."table_3".$myQuery_stop)->fetchColumn();
$count_4 = DB::getDB("mysql", $myDB)->query($myQueryCount_start."table_4".$myQuery_stop)->fetchColumn();
$count_5 = DB::getDB("mysql", $myDB)->query($myQueryCount_start."table_5".$myQuery_stop)->fetchColumn();
And now I check to see if UNION ALL needs to be appended to each table's query or not. No need to have a UNION ALL if there is no data record to attach in the next table.
$union_0 = (($count_1 + $count_2 + $count_3 + $count_4 + $count_5) > 0) ? " UNION ALL " : "";
$union_1 = (($count_2 + $count_3 + $count_4 + $count_5) > 0) ? " UNION ALL " : "";
$union_2 = (($count_3 + $count_4 + $count_5) > 0) ? " UNION ALL " : "";
$union_3 = (($count_4 + $count_5) > 0) ? " UNION ALL " : "";
$union_4 = (($count_5) > 0) ? " UNION ALL " : "";
Now we build up the per-table queries, skipping any table with no matching rows:

$query_0 = ($count_0 > 0) ? $myQuery_start."table_0".$myQuery_stop.$union_0 : "";
$query_1 = ($count_1 > 0) ? $myQuery_start."table_1".$myQuery_stop.$union_1 : "";
$query_2 = ($count_2 > 0) ? $myQuery_start."table_2".$myQuery_stop.$union_2 : "";
$query_3 = ($count_3 > 0) ? $myQuery_start."table_3".$myQuery_stop.$union_3 : "";
$query_4 = ($count_4 > 0) ? $myQuery_start."table_4".$myQuery_stop.$union_4 : "";
$query_5 = ($count_5 > 0) ? $myQuery_start."table_5".$myQuery_stop : "";
Then concatenated:
$myQuery = $query_0.$query_1.$query_2.$query_3.$query_4.$query_5;
And finally $myQuery is executed to produce all the data as required.

At least this is roughly 8 times faster than the previous way I used UNION ALL, so I think this is valid. Any suggestions for further optimization?
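
One possible refinement of the counting step, offered only as an untested sketch: the check only needs to know whether a table has any matching row, so an EXISTS probe could stand in for COUNT(*), since it can stop at the first match instead of counting them all. The alias has_rows and the literal values are placeholders, and whether this is noticeably faster on these tables is an assumption that would need measuring.

```sql
-- Returns 1 if table_0 holds at least one matching row, otherwise 0.
-- 42, 1672531200 and 1672617600 are placeholder values for illustration.
SELECT EXISTS (
    SELECT 1
    FROM table_0
    WHERE robotID = 42
      AND `timestamp` BETWEEN 1672531200 AND 1672617600
) AS has_rows;
```

The same probe would be run once per table (table_0 through table_5), and only the tables returning 1 would go into the final UNION ALL.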

If you can convince them to let you change the database structure, you can GREATLY optimize the layout of your database with the help of MySQL Partitioning. You'll want to research "Range Partitioning", and set up partitioning rules that will tell MySQL to automatically sort your data into invisible subtables for way quicker SELECT results. You won't even need multiple tables.

See http://dev.mysql.com/doc/refman/5.1/en/partitioning-overview.html
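
A minimal sketch of what a range-partitioned layout might look like, under several assumptions not stated in the question: that timestamp is stored as an integer Unix epoch, that the sensor columns are integers, and that monthly partitions suit the data volume. The table name robot_readings, the index name, and the partition names are all hypothetical.

```sql
-- Hypothetical single table replacing table_0 ... table_5.
-- Column types are assumed; only the column names come from the question.
CREATE TABLE robot_readings (
    `timestamp`  INT UNSIGNED NOT NULL,   -- assumed: Unix epoch seconds
    robotGroupID INT UNSIGNED NOT NULL,
    robotID      INT UNSIGNED NOT NULL,
    sensor1      INT,
    sensor2      INT,
    sensor3      INT,
    KEY idx_robot_time (robotID, `timestamp`)
)
PARTITION BY RANGE (`timestamp`) (
    PARTITION p2023_01 VALUES LESS THAN (1675209600),  -- rows before 2023-02-01 00:00:00 UTC
    PARTITION p2023_02 VALUES LESS THAN (1677628800),  -- rows before 2023-03-01 00:00:00 UTC
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Expiring old data then means dropping a partition rather than overwriting rows.
ALTER TABLE robot_readings DROP PARTITION p2023_01;
```

With a layout like this, a query with a timestamp range in its WHERE clause only has to touch the partitions that overlap that range, which is roughly what the six-table scheme tries to do by hand.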
