Using lists and dictionaries to store temporary information

2023-01-18 12:50

I will have a lot of similar objects with similar parameters. An object's parameters would be something like:

name, boolean, number and list.

The name must be unique among all the objects, while the boolean, number and list values need not be.

I could store the data as a list of dictionaries, I guess. Like this:

list = [
  {'name':'a', 'bool':True, 'number':123, 'list':[1, 2, 3]},
  {'name':'b', 'bool':False, 'number':143, 'list':[1, 3, 5]},
  {'name':'c', 'bool':False, 'number':123, 'list':[1, 4, 5, 18]},
]

What would be the fastest way to check whether a name already exists in the list of dictionaries before I add another dictionary to that list? Do I have to loop through the list and check the value of list[i]['name']? What would be the fastest and most memory-efficient way to hold and process that information, assuming that several similar lists might be processed simultaneously in different threads/tasks and that each list could hold anywhere between 100 and 100 000 dictionaries? Should I store those lists in a database instead of in memory?

I understand that perhaps I should not be thinking about optimizing (storing the info and threads) before the project is working, so please answer the unique name lookup question first :)

Thanks, Alan

If the name is the actual (unique) identifier of each inner record, you could just use a dictionary for the outer structure as well:

data = {
  'a' : { 'bool':True, 'number':123, 'list':[1, 2, 3] },
  'b' : { 'bool':False, 'number':143, 'list':[1, 3, 5] },
  'c' : { 'bool':False, 'number':123, 'list':[1, 4, 5, 18] },
}

Then you could easily check if the key exists or not.

By the way, don't name your variables list or dict, as that shadows the built-in types of the same name.
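To see why the naming matters, here is a small sketch (the function name shadowing_demo is made up for illustration). Inside the function, the name list is rebound to a list instance, so calling it the way you would call the built-in fails:

```python
def shadowing_demo():
    """Return the name of the error raised when `list` is shadowed locally."""
    list = [1, 2, 3]  # rebinds the name `list` inside this function only
    try:
        list("abc")  # the name now refers to a list instance, which is not callable
    except TypeError as exc:
        return type(exc).__name__
    return "no error"

result = shadowing_demo()

# Outside the function the built-in is untouched.
letters = list("ab")
```

At module level the same assignment would hide the built-in for the rest of the module, which is usually harder to track down.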

Once you switch to a dict instead of a list, the fastest way to perform the check you want is:

if 'newkey' not in items:
    # create a new record
    items['newkey'] = create_new_item()

Since you want to be able to access these records from multiple threads, I would keep a collection of locks. By the way, this is the sort of thing you design at the beginning, as it is part of the application design and not an optimization.

import threading

class DictLock(dict):
    def __init__(self):
        super().__init__()
        self._lock = threading.Lock()

    def __getitem__(self, key):
        # Hold the lock so two threads cannot create the same entry at the
        # same time. Otherwise they would get different locks and both
        # think they could access the key guarded by that lock.
        with self._lock:
            if key not in self:
                self[key] = threading.Lock()
            return super().__getitem__(key)

Now, if you want to modify your items, you can use the locks to keep it safe:

locks = DictLock()

with locks['a']:
    pass  # modify items['a'] here

Or, to insert a new element:

with locks['z']:
    # we are now the only ones (playing by the rules) accessing the 'z' key
    items['z'] = create_new_item()
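If one lock per key is more than you need, a single lock around the check-then-insert step is a simpler alternative to the per-key scheme above. The following is only a sketch under that assumption; items_lock, add_if_absent and worker are hypothetical names, and the record contents are made up. Eight threads race to create the same name and exactly one of them should succeed:

```python
import threading

items = {}
items_lock = threading.Lock()  # one coarse lock guarding the whole dict

def add_if_absent(name, record):
    """Insert record under name unless the name is taken; return True on insert."""
    with items_lock:
        # The check and the insert happen under the same lock, so no other
        # thread can slip in between them.
        if name in items:
            return False
        items[name] = record
        return True

def worker(results, index):
    record = {'bool': True, 'number': index, 'list': []}
    results[index] = add_if_absent('z', record)

results = [None] * 8
threads = [threading.Thread(target=worker, args=(results, i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = results.count(True)  # number of threads that created 'z'
```

The trade-off is that every insert waits on the same lock, whereas per-key locks let threads working on different names proceed independently.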

What you want is an "intrusive" dictionary - something that looks for keys inside values. Unfortunately, I don't know of any implementation in Python. Boost's multi_index comes close.
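A rough approximation in plain Python, which is not a true intrusive container, is to keep the list and maintain a name-to-record index beside it. This is a sketch only; records, by_name and add_record are hypothetical names, and it assumes every insert goes through add_record so the two structures stay in sync:

```python
records = [
    {'name': 'a', 'bool': True, 'number': 123, 'list': [1, 2, 3]},
    {'name': 'b', 'bool': False, 'number': 143, 'list': [1, 3, 5]},
]

# Map each name to the same dict object held in the list (no copying).
by_name = {record['name']: record for record in records}

def add_record(record):
    """Append record unless its name is already indexed; return True on insert."""
    name = record['name']
    if name in by_name:
        return False
    records.append(record)
    by_name[name] = record
    return True

added_new = add_record({'name': 'c', 'bool': False, 'number': 123, 'list': [1, 4, 5, 18]})
added_dup = add_record({'name': 'a', 'bool': True, 'number': 0, 'list': []})
```

The index costs one extra dict entry per record, but the membership check no longer has to scan the list.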

If you don't want to change the data structure you have, then you can use the following. Otherwise, poke's answer is the way to go.

>>> my_list = [
...   {'name':'a', 'bool':True, 'number':123, 'list':[1, 2, 3]},
...   {'name':'b', 'bool':False, 'number':143, 'list':[1, 3, 5]},
...   {'name':'c', 'bool':False, 'number':123, 'list':[1, 4, 5, 18]},
... ]
>>> def is_present(data, name):
...     return any(name == d["name"] for d in data)
... 
>>> is_present(my_list, "a")
True
>>> is_present(my_list, "b")
True
>>> is_present(my_list, "c")
True
>>> is_present(my_list, "d")
False

any takes an iterable and returns True if at least one of its elements is true.

(name == d["name"] for d in data) creates a generator. Each time somebody (in this case, any) requests the next element, the generator takes the next element, d, from data and transforms it with the expression name == d["name"]. Since generators are lazy, i.e. the transformation is done only when the next element is requested, this uses very little memory, and the amount does not depend on the size of the list. any also stops at the first match, so the whole list is scanned only when the name is absent or sits at the very end.
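The laziness can be made visible by counting how many records are actually examined. This is an illustrative sketch; is_present_counting is a made-up variant of is_present that replaces the generator expression with an equivalent generator function so it can keep a counter:

```python
my_list = [
    {'name': 'a', 'bool': True, 'number': 123, 'list': [1, 2, 3]},
    {'name': 'b', 'bool': False, 'number': 143, 'list': [1, 3, 5]},
    {'name': 'c', 'bool': False, 'number': 123, 'list': [1, 4, 5, 18]},
]

def is_present_counting(data, name):
    """Return (found, examined): the any() result and how many records were read."""
    examined = 0

    def matches():
        nonlocal examined
        for d in data:
            examined += 1  # runs only when any() asks for the next element
            yield name == d["name"]

    found = any(matches())
    return found, examined

hit_first = is_present_counting(my_list, "a")   # match on the first record
hit_last = is_present_counting(my_list, "c")    # match on the last record
miss = is_present_counting(my_list, "d")        # no match, full scan
```

A hit on the first record reads one element, while a miss reads all three, which is why the list-based check is linear in the worst case.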

Store the objects in a dictionary with the name as the key:

objects = {'a' : {'bool':True, 'number':123, 'list':[1, 2, 3]},
           'b' : {'bool':False, 'number':143, 'list':[1, 3, 5]},
           'c' : {'bool':False, 'number':123, 'list':[1, 4, 5, 18]}}

This way you ensure that the names are unique, since all the keys in a dictionary are unique. Checking whether a name is in the dictionary is also easy:

name in objects
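One thing to keep in mind is that plain assignment to an existing key replaces the old record without any warning, so uniqueness of keys alone does not protect existing data. A small sketch with made-up values, using the standard dict.setdefault method as one way to insert only when the name is free:

```python
objects = {'a': {'bool': True, 'number': 123, 'list': [1, 2, 3]}}

# Plain assignment silently replaces the record stored under 'a'.
objects['a'] = {'bool': False, 'number': 999, 'list': []}

# setdefault leaves an existing key alone and returns the stored record.
kept = objects.setdefault('a', {'bool': True, 'number': 1, 'list': []})

# For a name that is not present, setdefault inserts the new record.
created = objects.setdefault('d', {'bool': True, 'number': 7, 'list': [2]})
```

setdefault does not tell you whether the insert happened, so when that matters, test with name in objects first.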
