Python In-memory table

2022-12-25 04:47 问答作者：

What is the right way to forming in-memory table in python with direct lookups for rows and columns.

I thought of using dict of dicts this way,

class Table(dict):
    def __getitem__(self, key):
        if key not in self:
             self[key]={}
        return dict.__getitem__(self, key)
table = Table()
table['row1']['column1'] = 'value11'
table['row1']['column2'] = 'value12'
table['row2']['column1'] = 'value21'
table['row2']['column2'] = 'value22'
>>>table
{'row1':{'column1':'value11','column2':'value12'},'row2':{'column1':'value21','column2'开发者_运维知识库:'value22'}}

I had difficulty in looking up for values in columns.

>>>'row1' in table
True
>>>'value11' in table['row1'].values()
True

Now how do I do lookup if 'column1' has 'value11'

Is this method of forming tables wrong?

Is there a better way to implement such tables with easier lookups?.

Now how do I do lookup if 'column1' has 'value11'

any(arow['column1'] == 'value11' for arow in table.iteritems())

Is this method of forming tables wrong?

No, it's just very "exposed", perhaps too much -- it could usefully be encapsulated in a class which exposes the methods you need, then the issue of how best to implement them does not affect all the rest of your application.

Is there a better way to implement such tables with easier lookups?

Once you have designed a class whose interface you'd like to use, you can experiment with very different implementation approaches and benchmark them on a workload that's representative of your usage pattern, so you can find out what's best for you (assuming table manipulation and lookup are a big part of your application's runtime, of course -- to find out, profile your app).

I had similar but not identical needs in a large internal app I maintain at work, except that the row indices are integer (only the column names are strings), the column order is important, and the workload is more about "editing" the table (adding, removing, reordering rows or columns, renaming columns, etc). I started with a table exposing the functionality I needed, with the simplest rough-and-ready implementation internally (a list of dicts, plus a list of column names for the column ordering); and by now I have evolved it (independently of the actual "application-level" parts, but based on profiling and benchmarking thereof) to completely different implementations (currently based on numpy).

I think you should proceed along similar lines: "clothe" your current implementation into a nice "interface" with all the methods you need, profile your app -- unless this table object is a performance bottleneck, you're done; if it is a bottleneck, you can optimize the implementation (experiment, measure, repeat;-) without disturbing any of the rest of your application.

Inheriting from dict is not a good idea because you probably don't want to expose all of dict's rich functionality; plus, what you're doing is, roughly, an inefficient implementation of collections.defaultdict(dict). So, encapsulate the latter:

import collections

class Table(object):
    def __init__(self):
        self.d = collections.defaultdict(dict)
    def add(self, row, col, val):
        self.d[row][col] = val
    def get(self, row, col, default=None):
        return self.d[row].get(col, default)
    def inrow(self, row, col):
        return col in self.d[row]
    def incol(self, col, val):
        return any(x[col]==val for x in self.d.iteritems())

etc, etc -- write all the methods your app needs, with useful, short names, then maybe see if you can alias some of them as special methods if they're often used that way, e.g maybe (assuming Python 2.* -- requires slightly different syntax in 3.*):

    def __setitem__(self, (row, col), val):
        self.add(row, col, val)

and so forth. Once you have the code working, then comes the right time for profiling, benchmarking, and -- just perhaps -- internal optimization of the implementation.

I'd use an in-memory database with SQLite for this. The sqlite module is even in the standard library since Python 2.5, which means this doesn't even add much to your requirements.

A nested list should be able to do the job here. I would only use nested dictionaries if elements are spread thin across the grid.

grid = []
for row in height:
  grid.append([])
    for cell in width:
      grid[-1].append(value)

Checking rows is easy:

def valueInRow(value, row):
  return value in grid[row]

Checking collumns takes a little more work, because the grid is a list of rows, not a list of collumns:

def collumnIterator(collumn):
  height = len(grid)
  for row in xrange(height):
    yield grid[row][collumn]

def valueInCollumn(value, collumn):
  return value in collumnIterator(collumn)

Now how do I do lookup if 'column1' has 'value11'

Are you asking about this?

found= False
for r in table:
    if table[r]['column1'] == 'value11'
        found= True
        break

Is this what you're trying to do?

继续阅读：python row

Python In-memory table

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？