Examining then updating rows in sqlite with Python

2023-01-20 11:14 问答作者：

I've decided that I'm not fond of the automatic filing and renaming provided by Mendeley, but have found that it keeps track of everything in an sqlite database that I can easily read and modify from Python.

My question is, if I'm going to iterate over the rows of a table containing file paths and hashes that are serving as identifiers that are used elsewhere in the database, what is an appropriate pattern to use to update the paths as I move along, check if the file needs renaming, and the开发者_StackOverflow中文版n want to update the row in the db as I move the file.

I started putting together some Python to do this, but it seems to me that I probably need to do a query upfront to get all the rows I'm going to later iterate through, and then subsequently do REPLACE or UPDATE statements for files I've decided to move.

I'm not familiar with the internals of the sqlite3 module, but I presume it would be a bad idea to do REPLACE/UPDATEs while iterating on the cursor used for the original select.

Is there another way to do this that doesn't require going back with REPLACE/UPDATEs? I do need to do queries from other tables, using the file hash to get some other metadata in order to construct the rename.

I don't think there's a way to modify the rows in the table other than via SQL -- UPDATE, or INSERT OR REPLACE. (It's tempting to think that the sqlite3.Row object might allow assignment, and write back to the row it represents, but no.)

As for doing UPDATEs (via a 2nd cursor) while there's an outstanding cursor used for the original SELECT -- you ask if this is a bad idea -- I don't know if you're worried about correctness or performance (correctness: the cursor gets confused and doesn't iterate all the rows it should exactly once; performance: the cursor does iterate all the rows it should exactly once but there are a bunch of expensive extra queries)?

WRT correctness, this seems to work fine -- I just did a quick test where I created two cursors, c1 and c2, then on c1 executed a SELECT command, then on c2 executed an UPDATE command affecting the same rows, then looked at the result of c1.fetchall(). It still contained all the right rows, though the data in the first row was stale (probably fetched with the initial query) and the data in the later rows was updated (probably fetched only with the fetchall after the update).

(Also, if the UPDATE on c2 affects which records would be returned by the SELECT on c1, such records will no longer be fetched, but obviously anything already fetched was already fetched, and again the cursor.execute() seems to fetch the first record immediately, so using the same command order as above and with an UPDATE command that would make the original SELECT find nothing, the later c1.fetchall() still returns one record.)

As for performance, I don't know -- from the preceding paragraph it looks to me like it has to repeat the query, which may or may not be expensive depending on the query. (It may not be literally repeating it; perhaps this work is deferred till the actual fetch anyway.)

Short story: I don't think there's another way to do this without UPDATE or REPLACE, and I think it will work ok to do those UPDATEs even while iterating on the original cursor.

继续阅读：python

Examining then updating rows in sqlite with Python

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？