SQLite inconsistent performance
I have a hand written ORM built on top of the Qt object system. I'm testing it with SQLite back end, and I'm seeing weird performance issues. There are about 10k objects store开发者_开发百科d in the database. The objects are loaded one by one with separate queries.
One of the queries exhibits changes in execution time: from 1 msec to 10 depending on the primary key ID. This time also includes some operations done by the Qt Sql module.
The query is very simple and looks like this (the id=100 differs between the queries):
SELECT * FROM t1, t2 WHERE t1.id = 100 AND t2.id = 100
What might cause the same query to perform 10 times worse depending on the row id?
Considering that you are timing operations in milliseconds the behaviour that you are observing makes quite a bit of sense. Benchmarking single query runs with this kind of time granularity normally makes no sense, unless you are only interested in latency, rather than throughput.
With your particular query, for example, you would see a significant difference depending on whether there are mathing rows in t1
, since that would determine whether SQLite should bother looking at t2
at all.
Even running exactly the same query will produce different results depending on the OS filesystem cache, the process scheduler, the SQLite cache, the position of the hard disk plates and heads and various other factors.
Two be more specific, there are two possibilities:
A. t1.id
and t2.id
are indexed
This is the most probable case - I'd expect a table column aptly named id
to be indexed.
Most SQL engines, including SQLite, use some variation of a B-tree for each index. On SQLite each tree node is a single page in the DB file. With your particular query, SQLite will have to go through:
- Some pages of the
t1.id
index - Some pages of the
t2.id
index - The DB pages where the matching rows from both tables are contained.
Depending on your hardware and how the pages are located on the physical medium (e.g. your hard drive), loading a page can easily add a latency of several milliseconds. This is especially visible on large or freshly-loaded databases where the pages are neither in the OS filesystem cache nor in the SQLite3 cache.
In addition, unless your DB is really small, it usually won't fit in the SQLite3 cache and cache hits and misses alone can account for rather severe variations in the time a single query needs to complete: an SQLite cache miss forces a read from the filesystem, which can easily cause the DB process to be rescheduled by the OS in favour of another process.
B. t1.id
and t2.id
are not indexed
This is probably easier to visualise: with no indexes, SQLite has to scan the whole table. Assuming that you have a limit in your SELECT
statement (you do not have one in your example), whether a matching entry will be found immediately or after going through the whole table is up to luck, hence the severe variation in query completion times.
精彩评论