
Efficient database design for frequent logging of small things

I have a database which stores temperature-logging data from various instruments. Data may be logged as often as once per minute. One approach to designing a log table would be to put each log entry in its own row along with the device ID, a time stamp, and a sequence number (even if the clock on a device is changed, it should be possible to sort entries in the order the measurements were actually taken). That seems grossly inefficient, however, since every 16-bit measurement would carry probably 16 bytes of other data, in addition to whatever the system adds for indexing. I recognize that it is often senseless to try to optimize every last byte out of a database, but expanding the data by a factor of 9:1 or worse seems silly.
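For concreteness, the row-per-reading layout I have in mind would look something like this (a minimal sketch using Python's sqlite3; the table and column names are just illustrative):

```python
import sqlite3

# Illustrative schema: one row per reading, keyed by device and sequence number.
conn = sqlite3.connect("temperature_log.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS reading (
        device_id INTEGER NOT NULL,
        seq_no    INTEGER NOT NULL,  -- monotonic per device; survives clock changes
        taken_at  INTEGER NOT NULL,  -- Unix timestamp reported by the device
        temp_raw  INTEGER NOT NULL,  -- the 16-bit sensor value
        PRIMARY KEY (device_id, seq_no)
    )
""")
conn.commit()
```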

At present, I aggregate the records into groups of equally-spaced readings, and store one group per record in a variable-length opaque binary format along with the device ID, time stamp and sequence number for the first reading, and the interval between readings. This works nicely, and for all I know may be the best approach, but it doesn't allow for much in the way of queries.
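Roughly, the packing amounts to something like this (a simplified sketch; my real format differs, and the little-endian signed-16-bit layout here is just for illustration):

```python
import struct

def pack_readings(readings):
    """Pack a run of equally spaced 16-bit readings into one opaque blob.
    The device ID, first timestamp, first sequence number and the interval
    live in ordinary columns next to the blob."""
    return struct.pack(f"<{len(readings)}h", *readings)

def unpack_readings(blob):
    """Recover the list of 16-bit readings from the blob."""
    return list(struct.unpack(f"<{len(blob) // 2}h", blob))
```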

Is there any nice approach for handling such data sets without excessive redundancy?


Your data doesn't expand by a factor of 9. Your data stays roughly the same size, because you don't have a bare 16-bit measurement to start with. Your measurement is the Nth measurement from a specific device at a specific moment, so your data already has a sequence number, a device ID and a timestamp before you add it to the database, whether you're willing to account for that or not.

If you store data in a relational table (SQL), store it in a relational format: normalized, one record per row, in a queryable form. 'Aggregating' records into an opaque binary format makes your entire database useless, because the data can no longer be queried, aggregated or filtered. Your definition of 'this works nicely' is basically 'I can write the data and nobody can make any use of it', which is hardly 'nice'. You might just as well dump the data to /dev/null.

Store the data as proper records, using proper database types; don't use opaque blobs. And 'data may be logged as often as once per minute' is not 'frequent' by any database standard. If you said '100 times per second', then we'd have something to talk about.
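For instance, with one reading per row (reusing the illustrative reading table sketched in the question), per-device hourly averages are a single query; nothing comparable is possible against an opaque blob without unpacking everything on the client:

```python
import sqlite3

conn = sqlite3.connect("temperature_log.db")

# Hourly average per device over the last 24 hours, computed by the database.
rows = conn.execute("""
    SELECT device_id,
           strftime('%Y-%m-%d %H:00', taken_at, 'unixepoch') AS hour,
           AVG(temp_raw) AS avg_raw
    FROM reading
    WHERE taken_at >= CAST(strftime('%s', 'now', '-1 day') AS INTEGER)
    GROUP BY device_id, hour
    ORDER BY device_id, hour
""").fetchall()

for device_id, hour, avg_raw in rows:
    print(device_id, hour, avg_raw)
```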


Is this really a problem? Suppose we overestimate a bit and say there are 50 bytes of data plus metadata per measurement. A quick search suggests that you won't have many problems unless you are in a really tight environment.
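Back-of-the-envelope, at one reading per minute that overestimate comes to well under 30 MB per device per year:

```python
# Rough upper bound: ~50 bytes of row data plus index overhead per reading,
# one reading per minute, for a year.
BYTES_PER_READING = 50
READINGS_PER_YEAR = 60 * 24 * 365  # 525,600

total_bytes = BYTES_PER_READING * READINGS_PER_YEAR
print(f"{total_bytes / 1_000_000:.1f} MB per device per year")  # ~26.3 MB
```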


I believe you should use RRDtool to store such data; see the Wikipedia article.
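For a fixed-interval temperature feed, the setup is roughly this (a sketch that shells out to the rrdtool command-line tool; the data-source and archive parameters are only illustrative):

```python
import subprocess
import time

# Create a round-robin database: one GAUGE data source sampled every 60 s,
# keeping a year's worth of per-minute averages (illustrative parameters).
subprocess.run([
    "rrdtool", "create", "temperature.rrd",
    "--step", "60",
    "DS:temp:GAUGE:120:U:U",
    "RRA:AVERAGE:0.5:1:525600",
], check=True)

# Feed in one reading, stamped with the current time.
subprocess.run(
    ["rrdtool", "update", "temperature.rrd", f"{int(time.time())}:21.5"],
    check=True,
)
```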

