开发者

Choosing a primary key,surrogate key, index in the mysql stock trading database

Database type: mysql

Columns: Date,time,price1,qty1,price2,qty2 time will be in milliseconds number of records approx 5.5 million for a month.

I cant choose date as primary key as it is not unique, but can choose date and time as combined but that is also not a good idea.

i will be running queries like select price and qty between 'this date and time' and 'that date and time' and result might be in millions range.

what could be the best choice in terms of primary key, index an开发者_开发技巧d surrogate key and what is the best way to implement this. how should i optimize the database.


Not sure why you say choosing both date and time would be a bad idea (are you against composite keys?)

A bigger problem for you is that time does not store milliseconds. See this bug for more data on that: http://bugs.mysql.com/bug.php?id=8523

Also, there seems to be something missing from the key that identifies the Stock such as Ticker. Since the ticker can change over time, it might be a good idea to introduce a surrogate for it such as StockID. You would do this in a table called Stock or similar.

Then for your Trade table, I would suggest using StockID, Date and Time (but store the time in something other than the TIME datatype so you can store milliseconds. Ask another question if you need help with that).

The order of the keys in the PK is important for both storage and retreival. For retrieval, you want to put the most selective keys for your query first. So if you tend to access all the data for a stock at once (or for a set of stocks), put StockID first so the index can be used to find them quickly. If you tend to access all data for a given interval, put Date then Time first.

For storage, its better to be appending so having Date and Time first is a good idea here too.

In case you want to access mostly in date ranges, but sometimes by Stock, put a secondary index on StockID.


As you don't have a natural key (so nothing unique within each row), you'd need to add a surrogate key (for the sake of argument "transactionid"). You can still have your index based on date time (that really, really should be a single column) for efficient period scanning.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜