Python: intersection of lists/sets

2023-01-16 09:26

def boolean_search_and(self, text):
    results = []
    and_tokens = self.tokenize(text)
    tokencount = len(and_tokens)

    term1 = and_tokens[0]
    print ' term 1:', term1

    term2 = and_tokens[1]
    print ' term 2:', term2

    # Look up each term; a term missing from the index matches no documents.
    resultlist1 = self._inverted_index.get(term1, [])
    print resultlist1
    resultlist2 = self._inverted_index.get(term2, [])
    print resultlist2
    # intersection of the two sets, cast back into a list
    results = list(set(resultlist1) & set(resultlist2))
    print 'results:', results

    return str(results)

This code works great for two tokens. For example, text = "Hello World" gives tokens = ['hello', 'world']. I want to generalize it for multiple tokens, so that the text can be a sentence or an entire text file.

self._inverted_index is a dictionary that maps each token (the key) to the list of DocIDs in which that token occurs.

hello -> [1,2,5,6]

world -> [1,3,5,7,8]

result:

hello AND world -> [1,5]

I want to get the result for, say, (((hello AND computer) AND science) AND world).

I am trying to make this work for multiple words, not just two. I only started working in Python this morning, so I'm unaware of a lot of the features it has to offer.

Any ideas?

I want to generalize it for multiple tokens

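def boolean_search_and_multi(self, text):
    and_tokens = self.tokenize(text)
    if not and_tokens:
        return []
    # Start from the first term's DocIDs, then narrow with each further term.
    # A term missing from the index matches no documents.
    results = set(self._inverted_index.get(and_tokens[0], []))
    for tok in and_tokens[1:]:
        results.intersection_update(self._inverted_index.get(tok, []))
    return list(results)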
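
To try the same idea outside the class, here is a minimal standalone sketch in Python 3 syntax. The function name and_search and the sample index are hypothetical, not from the question: the DocIDs for hello and world are the ones given in the question, while the ones for computer are made up for illustration. It intersects the smallest posting list first, which is an optional optimization rather than something the approach requires, and it treats a term missing from the index as matching no documents.

```python
def and_search(index, tokens):
    """Return the sorted DocIDs that contain every token (boolean AND)."""
    if not tokens:
        return []
    # A token missing from the index gets an empty set of DocIDs.
    postings = [set(index.get(tok, ())) for tok in tokens]
    # Starting from the smallest set keeps the intermediate results small.
    postings.sort(key=len)
    result = postings[0]
    for other in postings[1:]:
        result = result & other
        if not result:
            break  # no document can match any more
    return sorted(result)


# Hypothetical sample index: token -> list of DocIDs.
index = {
    "hello": [1, 2, 5, 6],
    "world": [1, 3, 5, 7, 8],
    "computer": [2, 5, 9],  # made up for illustration
}

print(and_search(index, ["hello", "world"]))              # [1, 5]
print(and_search(index, ["hello", "computer", "world"]))  # [5]
print(and_search(index, ["hello", "missing"]))            # []
```

Returning a sorted list instead of list(results) is a choice made here only to keep the output order predictable; sets themselves are unordered.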

Would the built-in set type work for you?

$ python
Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
[GCC 4.3.4 20090804 (release) 1] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> hello = set([1,2,5,6])
>>> world = set([1,3,5,7,8])
>>> hello & world
set([1, 5])
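
The same operator extends to more than two terms. As a rough sketch in Python 3 syntax (the sets for computer and science are made up for illustration), functools.reduce folds & over the sets from left to right, which matches the (((hello AND computer) AND science) AND world) grouping from the question. set.intersection(*terms) should give the same result in one call.

```python
from functools import reduce
from operator import and_

hello = {1, 2, 5, 6}
world = {1, 3, 5, 7, 8}
computer = {1, 2, 5, 9}  # hypothetical DocIDs
science = {1, 4, 5}      # hypothetical DocIDs

terms = [hello, computer, science, world]

# ((hello & computer) & science) & world
folded = reduce(and_, terms)

# Equivalent single call on the set type.
same = set.intersection(*terms)

print(sorted(folded))  # [1, 5]
print(folded == same)  # True
```

Note that reduce raises a TypeError on an empty list when no initial value is given, so an empty query needs to be handled separately.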
