Making sense of Python

I am reading the book Programming Collective Intelligence. What exactly does the following piece of Python code do?

  # Add up the squares of all the differences 
  sum_of_squares=sum([pow(prefs[person1][item]-prefs[person2][item],2) 
                      for item in prefs[person1] if item in prefs[person2]]) 

I am trying to play with the examples in Java.

prefs is a map from each person to that person's movie ratings; the movie ratings are another map from movie names to ratings.


First it constructs a list containing the results from:

for each item in prefs for person1:
    if that is also an item in the prefs for person2:
        find the difference between the two people's ratings for that item
        and square it (Math.pow(x,2) is "x squared")

Then it adds those up.
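
The steps above can be written out as an explicit loop. This is only a sketch: the names and ratings in prefs are made-up sample data, not taken from the book.

```python
# Hypothetical sample data: person -> {movie name: rating}
prefs = {
    "Alice": {"Ratatouille": 5.0, "Up": 3.0, "Cars": 4.0},
    "Bob": {"Ratatouille": 2.0, "Up": 3.5},
}
person1, person2 = "Alice", "Bob"

squares = []
for item in prefs[person1]:          # every movie person1 rated
    if item in prefs[person2]:       # keep only movies person2 rated too
        diff = prefs[person1][item] - prefs[person2][item]
        squares.append(diff ** 2)    # squared difference for this movie

sum_of_squares = sum(squares)        # add the squared differences up
```

"Cars" is skipped because only one of the two people rated it.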


This might be a little more readable if the call to pow were replaced with the '**' exponentiation operator:

sum_of_squares=sum([(prefs[person1][item]-prefs[person2][item])**2
                   for item in prefs[person1] if item in prefs[person2]])

Lifting out some invariants also helps readability:

p1_prefs = prefs[person1]
p2_prefs = prefs[person2]

sum_of_squares=sum([(p1_prefs[item]-p2_prefs[item])**2
                      for item in p1_prefs if item in p2_prefs])

Finally, since Python 2.4 there is no need for the list comprehension notation: sum will accept a generator expression, so the []'s can also be removed:

sum_of_squares=sum((p1_prefs[item]-p2_prefs[item])**2
                      for item in p1_prefs if item in p2_prefs)

Seems a bit more straightforward now.

Ironically, in pursuit of readability, we have also done some performance optimization (two goals that are often at odds):

  • lifted invariants out of the loop
  • replaced the function call pow with inline evaluation of the '**' operator
  • removed unnecessary construction of a list

Is this a great language or what?!
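
As a quick sanity check that the rewrite did not change behavior, the original and final forms can be wrapped in functions and compared. The function names and the sample ratings below are illustrative assumptions, not from the book.

```python
def sum_sq_original(prefs, person1, person2):
    # Original form: pow() inside a list comprehension
    return sum([pow(prefs[person1][item] - prefs[person2][item], 2)
                for item in prefs[person1] if item in prefs[person2]])


def sum_sq_rewritten(prefs, person1, person2):
    # Rewritten form: lifted lookups, '**', generator expression
    p1_prefs = prefs[person1]
    p2_prefs = prefs[person2]
    return sum((p1_prefs[item] - p2_prefs[item]) ** 2
               for item in p1_prefs if item in p2_prefs)


# Hypothetical sample data: person -> {movie name: rating}
sample_prefs = {
    "Lisa": {"Ratatouille": 5, "Up": 3, "Cars": 4},
    "Gene": {"Ratatouille": 2, "Up": 4},
}

original = sum_sq_original(sample_prefs, "Lisa", "Gene")
rewritten = sum_sq_rewritten(sample_prefs, "Lisa", "Gene")
```

Both calls look only at the two shared movies, so they should agree on any input of this shape.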


01 sum_of_squares =
02 sum(
03  [
04      pow(
05         prefs[person1][item]-prefs[person2][item],
06         2
07      ) 
08    for
09       item
10    in
11       prefs[person1]
12    if
13       item in prefs[person2]
14  ]
15 )

Sum (line 2) a list that consists of the values computed in lines 4-7 for each 'item' in the dictionary specified on line 11 for which the condition on line 13 holds true.


It computes the sum of the squares of the differences between prefs[person1][item] and prefs[person2][item], for every item in the prefs dictionary for person1 that is also in the prefs dictionary for person2.

In other words, say both person1 and person2 have a rating for the film Ratatouille, with person1 rating it 5 stars, and person2 rating it 2 stars.

prefs[person1]['Ratatouille'] = 5
prefs[person2]['Ratatouille'] = 2

The square of the difference between person1's rating and person2's rating is 3^2 = 9.

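Summed over all shared films, this is the squared Euclidean distance between the two people's ratings, so it measures how far apart their tastes are.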
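
One way to interpret the result: the square root of this sum is the Euclidean distance between the two people's ratings over the films they share. A minimal sketch, keeping the Ratatouille ratings from above and adding a second film ("Up") whose ratings are invented for illustration:

```python
import math

# Ratatouille ratings come from the example above; "Up" is a made-up addition
prefs = {
    "person1": {"Ratatouille": 5, "Up": 5},
    "person2": {"Ratatouille": 2, "Up": 1},
}
p1_prefs = prefs["person1"]
p2_prefs = prefs["person2"]

# (5 - 2)**2 + (5 - 1)**2 = 9 + 16
sum_of_squares = sum((p1_prefs[item] - p2_prefs[item]) ** 2
                     for item in p1_prefs if item in p2_prefs)

# Euclidean distance over the shared films
distance = math.sqrt(sum_of_squares)
```

A smaller distance means the two people rated their shared films more alike.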
