How to load existing db file to memory in Python sqlite3?
I have an existing sqlite3 db file on which I need to make some extensive calculations. Doing the calculations from the file is painfully slow, and since the file is not large (~10 MB) there should be no problem loading it into memory.
Is there a Pythonic way to load the existing file into memory in order to speed up the calculations?
Here is the snippet that I wrote for my flask application:
import sqlite3
from io import StringIO

def init_sqlite_db(app):
    # Read database to tempfile
    con = sqlite3.connect(app.config['SQLITE_DATABASE'])
    tempfile = StringIO()
    for line in con.iterdump():
        tempfile.write('%s\n' % line)
    con.close()
    tempfile.seek(0)

    # Create a database in memory and import from tempfile
    app.sqlite = sqlite3.connect(":memory:")
    app.sqlite.cursor().executescript(tempfile.read())
    app.sqlite.commit()
    app.sqlite.row_factory = sqlite3.Row
What about sqlite3.Connection.backup(...)? "This method makes a backup of a SQLite database even while it's being accessed by other clients, or concurrently by the same connection." Availability: SQLite 3.6.11 or higher. New in version 3.7.
import sqlite3
source = sqlite3.connect('existing_db.db')
dest = sqlite3.connect(':memory:')
source.backup(dest)
sqlite3.Connection.iterdump "[r]eturns an iterator to dump the database in an SQL text format. Useful when saving an in-memory database for later restoration. This function provides the same capabilities as the .dump command in the sqlite3 shell."
Get such an iterator and dump the disk-based database into a memory-based one, and you're ready to compute. When the computation is done, just dump the other way around back to disk.
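A minimal sketch of that round trip, assuming the file is called existing_db.db and the results should end up in results.db (both names are placeholders):

import sqlite3

# Load the on-disk database into memory via its SQL dump.
disk_con = sqlite3.connect('existing_db.db')   # placeholder path
mem_con = sqlite3.connect(':memory:')
mem_con.executescript('\n'.join(disk_con.iterdump()))
disk_con.close()

# ... run the extensive calculations against mem_con ...

# Dump the in-memory database back to a fresh file on disk.
out_con = sqlite3.connect('results.db')        # placeholder path
out_con.executescript('\n'.join(mem_con.iterdump()))
out_con.close()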
First you should try and find out what is causing the slowness you are observing. Are you writing to tables? Are your writes within large enough transactions so that you don't save needless temporary results to disk? Can you change writes to go to temporary tables (with pragma temp_store=memory)? Can you live with pragma synchronous=off?
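If those pragmas are an option, here is a sketch of setting them on a connection (existing_db.db is a placeholder path and the results table is made up for illustration):

import sqlite3

con = sqlite3.connect('existing_db.db')        # placeholder path
con.execute('PRAGMA temp_store = MEMORY')      # keep temp tables/indices in RAM
con.execute('PRAGMA synchronous = OFF')        # skip fsync(); faster, less durable

# Group writes into one transaction so intermediate state is not
# flushed to disk after every statement.
with con:
    con.executemany('INSERT INTO results VALUES (?, ?)',
                    [(1, 'a'), (2, 'b')])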
I don't think this functionality is exposed in the Python module, but sqlite has a backup API that sounds like exactly what you are asking for: a way to copy from one database to another (either one of which may be an in-memory database) that works pretty much automatically without any user-visible enumeration of tables. (Maybe APSW exposes this?)
Another option is to create a ram disk (if you have sufficient control of the environment) and copy the file there.
If we must use a Python wrapper, then there is no better solution than the two-pass, read-and-write approach. But beginning with version 3.7.17, SQLite has the option of accessing disk content directly using memory-mapped I/O: sqlite mmap
If you want to use mmap, you have to use the C interface, since no wrapper provides it.
And there is another, hardware solution: the memory disk. Then you have the convenient file I/O and the speed of memory.
This has already been answered before, including code examples, at In python, how can I load a sqlite db completely to memory before connecting to it?
You do not mention the operating system, but one gotcha of Windows XP is that it defaults to a 10 MB file cache, no matter how much memory you have. (This made sense in the days when systems came with 64 MB, etc.) This message has several links:
http://marc.info/?l=sqlite-users&m=116743785223905&w=2
Here is a relatively simple way to read a SQLite db into memory. Depending on how you prefer to manipulate the data, you can either work with a Pandas DataFrame or write your table to an in-memory sqlite3 database. Similarly, after manipulating your data, you use the same df.to_sql approach to store your results back into a db table.
import sqlite3 as lite
from pandas.io.sql import read_sql
from sqlalchemy import create_engine

# In-memory SQLite database, reachable both through SQLAlchemy and
# through the underlying DBAPI connection.
engine = create_engine('sqlite://')
c = engine.connect()
conmem = c.connection

# Connection to <ait.sqlite> residing on disk
con = lite.connect('ait.sqlite', isolation_level=None)
cur = con.cursor()

# Read the SQLite table into a pandas DataFrame
sqlx = 'SELECT * FROM Table'
df = read_sql(sqlx, con, coerce_float=True, params=None)

# Write the DataFrame into the in-memory database
# (the 'flavor' argument has been removed from newer pandas versions)
df.to_sql(con=conmem, name='Table', if_exists='replace')
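For instance, once the DataFrame has been processed, the same call can push the result back into the on-disk database (Table_processed is an arbitrary target name used for illustration):

# Store the processed DataFrame back into the disk-based database.
df.to_sql(con=con, name='Table_processed', if_exists='replace', index=False)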
With the solution of Cenk Alti I always got a MemoryError with Python 3.7 when the process reached 500 MB. Only with the backup functionality of sqlite3 (mentioned by thinwybk) was I able to load and save bigger SQLite databases. You can also do the same with just 3 lines of code, both ways.
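A sketch of those two directions, assuming the on-disk file is called existing_db.db:

import sqlite3

disk = sqlite3.connect('existing_db.db')   # placeholder path
mem = sqlite3.connect(':memory:')
disk.backup(mem)                           # disk -> memory

# ... heavy calculations on mem ...

mem.backup(disk)                           # memory -> disk (overwrites the file's contents)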
The answers of @thinwybk and Crooner are both excellent.
When you have multiple connections to :memory: sqlite databases, for instance when using SQLAlchemy together with the source.backup(dest) function, you may end up not placing the backup into the "right" in-memory DB.
This can be fixed using a proper connection string: https://stackoverflow.com/a/65429612/1617295. It does not involve any hack nor use of undocumented features.
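For reference, a sketch of such a connection string following the linked answer (the name memdb1 is arbitrary and existing_db.db is a placeholder path):

import sqlite3

# Both connections address the same named, shared-cache in-memory database;
# the equivalent SQLAlchemy URL would be
# 'sqlite:///file:memdb1?mode=memory&cache=shared&uri=true'.
uri = 'file:memdb1?mode=memory&cache=shared'
con_a = sqlite3.connect(uri, uri=True)
con_b = sqlite3.connect(uri, uri=True)     # keeps the DB alive alongside con_a

disk = sqlite3.connect('existing_db.db')   # placeholder path
disk.backup(con_a)                         # now visible through con_b as well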
SQLite supports in-memory databases.
In Python, you would use a :memory: database name for that.
Perhaps you could open two databases (one from the file, an empty one in memory), migrate everything from the file database into memory, and then use the in-memory database for the calculations.
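One way to sketch that migration without going through a textual dump is to ATTACH the file database to the in-memory connection and copy each table across (existing_db.db is a placeholder; note that CREATE TABLE ... AS SELECT does not copy indexes or constraints):

import sqlite3

mem = sqlite3.connect(':memory:')
mem.execute("ATTACH DATABASE 'existing_db.db' AS src")   # placeholder path

# Recreate every ordinary table in memory and copy its rows.
tables = [row[0] for row in mem.execute(
    "SELECT name FROM src.sqlite_master "
    "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
for name in tables:
    mem.execute('CREATE TABLE "{0}" AS SELECT * FROM src."{0}"'.format(name))

mem.commit()
mem.execute("DETACH DATABASE src")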