sqlite: Fastest way to get all rows (consecutive disk access)

2023-03-31 00:49 问答作者：

I want to read all rows in a table using system.data.sqlite. As I have a very bi开发者_开发技巧g Table (>450GB, with > 6 billion rows) I want to be sure that sqlite will use consequtive disk access. As you may know a random access to hard disk is slow. Due to memory restictions I can not load all the data at once. So the optimal way would be if sqlite reads some hundred MB (consecutive), then I work with this data and sqlite reads the next.

How can I be sure that sqlite will do disk access this way and not jumping from one position on hard disk to another?

Things I know (I think these suggestions will show up):

It could be better to use an other DBMS. But I want/need to solve it wiht this one.
I know that the disk head will be positioned by the OS on other data while I am workign with thhe data. This doesn't matter. It is just that some hundret MB will be read consecutively.
I don't want to / can split the database file up into smaller pieces

I found this post, but it is not addressing my problem correctly:

Which is the fastest way to retrieve all items in SQLite?

That's what clustered index are for. sqlite doesn't support them though.

The following is copied from: http://www.sqlite.org/cvstrac/wiki?p=PerformanceTuningWindows

Four: Clustered Indexes

SQLite doesn't support clustered indexes (simply, indexes that force the data in the database to be physically laid down in the SAME order as the index needs it to be in.)

This means that if your index is sequential INTEGER, the records are physically laid out in the database in that INTEGERs order, 1 then 2 then 3.

You can't make a Clustered index, but you CAN sort your data in order so that any historical data is ordered nicely. Of course, as the database matures, you lose that, but it helps

Someone else posted this, and it is a nice example to use, so I will. If you have a table WIBBLE whose field KEY you want to access a lot, it would be nice if everything was in order. Using the command line tool, you can create a fake cluster by doing the following:
create table wibble2 as select * from wibble;
delete from wibble;
insert into wibble select * from wibble2 order by key;
drop table wibble2;

Bottom line is you can re-order your records manually, but I imagine this would be practical for you only if you don't intend to write to the table to often.

In SQLite, table rows are stored sorted by the rowid, so the most efficient way to read the rows in this order is to sort by this column (or the alias you declared with INTEGER PRIMARY KEY):

SELECT * FROM wibble ORDER BY rowid

For a WITHOUT ROWID table, you sort by the primary key column(s):

SELECT * FROM wibble ORDER BY MyPrimary, KeyColumns

From @CL. answer in this post:

In SQLite, indexes created with CREATE INDEX are non-clustered indexes.

Since version 3.8.2, SQLite supports WITHOUT ROWID tables, which are clustered indexes.

继续阅读：system.data.sqlite

sqlite: Fastest way to get all rows (consecutive disk access)

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？