150M records order by name
I have a dataset of around 开发者_开发知识库150 million records that's generated daily it contains:
member_id
, member_name
, member_name_first_letter
I need to get the list ordered by member_name asc
I have try with MySQL if i break down by first letter of member_name and run the query for everyone once (around 40 different: A-Z, digits, special chars) I'm able to fill a table in around total 40 min, without breakdown to frist later it tkaes hours. for the test i'm using only 13 million records. Anyway I need to get them much faster (target: 1-2 min).
How I can do that? Can MongoDB help?
If you use InnoDB, and make member_name the primary key of the table, it will always be ordered by that column implicitly, so you can SELECT ... FROM members
with no order-by clause. InnoDB tables are actually a clustered index.
Re your comment: You could also define a compound primary key and the table would be in member_name order if member_name is the first column in the primary key.
Mongo is not an option, try to use index on member_name,
other solutions is to order and data and split to several tables or create index for member_name field using external tools
Extending Bill Karwin's idea, if you can add a new column while acquiring the data, you can generate a unique primary key by appending a timestamp to the name, separated by a space to avoid any weird concatenation problems (well its unlikely anyone's name would be numeric, but still).
精彩评论