Finding similar word lists using StringListProperty on App Engine
I have a list of tags defined in a StringListProperty().
The DB contains around 1 million entries and each entry has around 20 different values in the list.
e.g.
a = [ 'ab', 'bc', 'ca', 'x', ....]
b = ['x', 'm', 'a', .... ]
I am using Google App Engine, so I have constraints on running batch jobs (only 30 seconds allowed).
Here is my question:
Given a list a, I want to find all lists that have the most elements in common with a, in descending order of the number of common elements.
How can I do this with App Engine?
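To make the ordering I am after concrete, here is a minimal in-memory sketch in plain Python. It is not datastore code and says nothing about how to fit this into App Engine's limits; the names `build_inverted_index` and `rank_by_common_tags` and the sample entries are hypothetical, made up for illustration only.

```python
from collections import Counter, defaultdict

def build_inverted_index(entries):
    """Map each tag to the set of entry keys whose tag list contains it."""
    index = defaultdict(set)
    for key, tags in entries.items():
        for tag in set(tags):  # ignore duplicate tags within one entry
            index[tag].add(key)
    return index

def rank_by_common_tags(query_tags, index, exclude=None):
    """Return (key, common_count) pairs, most common tags first."""
    counts = Counter()
    for tag in set(query_tags):
        for key in index.get(tag, ()):
            if key != exclude:
                counts[key] += 1
    # Sort by descending count; break ties by key for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sample data, loosely based on the lists above.
entries = {
    "a": ["ab", "bc", "ca", "x"],
    "b": ["x", "m", "a"],
    "c": ["ab", "x", "ca"],
    "d": ["zz"],
}
index = build_inverted_index(entries)
ranked = rank_by_common_tags(entries["a"], index, exclude="a")
print(ranked)
```

Entries that share no tag with the query never show up in the result, so only lists with at least one common element are ranked.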
***update
I am storing tags for URLs - [shopping, shop, social-shopping, ....]
Basically, I want to find URLs with similar content by
(1) matching the tags, and (2) looking at the frequency of tags per URL to decide which URLs are "more" related.
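One possible reading of step (2), sketched in plain Python: assume each URL carries a count per tag, and score two URLs by the sum of the smaller count for every tag they share. Both the per-tag counts and this scoring rule are assumptions for illustration, not something the datastore provides; `weighted_overlap`, `rank_related`, and the sample URLs are hypothetical.

```python
from collections import Counter

def weighted_overlap(tags_a, tags_b):
    """Sum, over shared tags, of the smaller of the two counts."""
    # Counter & Counter keeps each common key with the minimum count.
    return sum((Counter(tags_a) & Counter(tags_b)).values())

def rank_related(query_url, tag_counts):
    """Rank the other URLs by weighted overlap with query_url, best first."""
    scores = []
    for url, counts in tag_counts.items():
        if url == query_url:
            continue
        score = weighted_overlap(tag_counts[query_url], counts)
        if score > 0:
            scores.append((url, score))
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical tag frequencies per URL.
tag_counts = {
    "shop-a.example": {"shopping": 5, "shop": 2, "social-shopping": 1},
    "shop-b.example": {"shopping": 3, "shop": 4},
    "news.example": {"news": 7, "shopping": 1},
}
related = rank_related("shop-a.example", tag_counts)
print(related)
```

With this rule, a URL that shares several heavily used tags ranks above one that shares a single rarely used tag, which is roughly what "more" related means above.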
I don't think there's any neat way to do this in App Engine - or for that matter, in any DBMS with only standard one-dimensional indexes available.
Perhaps if you expand on what you're trying to achieve, someone can suggest an alternative?