开发者

Python + MongoDB document versioning

I have an app in the works for use internally as a project/task tracker in the company that I'm working for. Playing around with MongoDB atm. I have the following pseudo-schema in mind:

task
    _id
    name
    project
    initial_notes
    versions
        number
        versions
            version_1
                worker
                status
                date(if submitted)
                review_notes(if rejected)
                reply_on(if accepted/rejected)
            (version_n)(if any)

The problem that I'm having is with versioning the task. I've read a number of possible ways but I'm falling short of understanding t开发者_开发技巧hem all the way through. I read something that I liked here and really like the way mongoid does it's versioning

Thinking of it better I'd rather have it something like this

task
    _id
    versions
        number_of_versions: 3
        current_version
            version_no: 3
            worker: bob
            status: accepted
        old_versions
            version
                version_no: 2
                worker: bob

I would like to show the most recent version only when displaying a collection of tasks and I would like to show all versions of a particular task when entering the detailed information page for that particular task. Would this structure work? If yes, what would be some queries needed to run in order to achieve what I need?

Thank you in advance for your time reading this and maybe answering it. status: rejected version version_no: 1 worker: smith status: rejected


Yes, why not. That scheme would work. Also, have you considered something like this:

task 
    ...
    versions = [       # MongoDB array 
        {   version_id 
            worker
            status
            date(if submitted)
            review_notes(if rejected)
            reply_on(if accepted/rejected)
        },
        { version_id : ... }, 
        ... 

Possible version insertion query:

tasks.update( { # a query to locate a particular task}, 
              { '$push' : { 'versions', { # new version } } } )

Note, that retrieving the last version from the versions array in this case is done by the program, not by Mongo.


You might also consider a model like this:

task
    _id
    ...
    old_versions [
        {
            retired_on
            retired_by
            ...
        }
    ]

This has the advantage that your current data is always at the top level (you don't need to track the current version explicitly, the document is its own current version), and you can easily track history by taking the current version, removing the old_versions field, and $pushing it to the old_versions field in the db.

Since you seemed to want to minimize network IO, this also lets you easily avoid loading the old_versions when you don't need them:

> db.tasks.find({...}, {old_versions: 0})

You could also get fancier and store a list of old versions only of fields that have changed. This requires more delicate code in your application layer, and may not be necessary if you don't expect many revisions or for these documents to be very large.


I was dealing with the same issue, which is why I created HistoricalCollection:

https://pypi.org/project/historical-collection/

Works just like a normal collection, with the addition of a few additional methods:

  • patch_one()
  • patch_many()
  • find_revisions()
  • latest()

An example of the usage:

from historical_collection.historical import HistoricalCollection
from pymongo import MongoClient
class Users(HistoricalCollection):
    PK_FIELDS = ['username', ]  # <<= This is the only requirement

# ...

users = Users(database=db)

users.patch_one({"username": "darth_later", "email": "darthlater@example.com"})
users.patch_one({"username": "darth_later", "email": "darthlater@example.com", "laser_sword_color": "red"})

list(users.revisions({"username": "darth_later"}))

# [{'_id': ObjectId('5d98c3385d8edadaf0bb845b'),
#   'username': 'darth_later',
#   'email': 'darthlater@example.com',
#   '_revision_metadata': None},
#  {'_id': ObjectId('5d98c3385d8edadaf0bb845b'),
#   'username': 'darth_later',
#   'email': 'darthlater@example.com',
#   '_revision_metadata': None,
#   'laser_sword_color': 'red'}]
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜