开发者

data structure to speed in-memory tagged object search on boolean function of tags?

If I have a set of tags (<100), and a set of objects (~25000), where each object has some sub-set of the tags, do you know of an existing data-structure that would allow for fast retrieval of those objects that satisfy some boolean function of the tags?

Addition/deletion of tags and objects need not be particularly fast, but selection of those objects with tags that satisfy the boolean function should be.

Now that I have written my question down, it looks as if I'm describing an in-memory database, but originally I was thinking of some binary tree like structure for the objects where, for each branch, taki开发者_如何学Gong the left/right branch would be equivalent to deciding on have/have-not some tag. But that would not allow don't-care tags? i am asking as I wondered if this has been done before and find it hard to google for data structures.

  • Thanks in advance - Paddy.


Here's a suggestion: Use a bit-array for each tag, with as many elements as there are objects; each index of which represents one object. The value at each index is 1 if that object has that tag.

Boolean functions on tags are then fast set operations on this bit-array. And the resulting bit-array gives you the documents that satisfy the criteria.

This not very efficient if the tags or objects keep changing frequently but is perhaps applicable for you.


How fast you would need? How complex you boolean function are i.e. how many tags are used in single typical function?

How about using some in memory SQL database? You could then express the boolean function with simple SELECT query.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜