Making sense of Python
I am reading the book Programming Collective Intelligence. What exactly does the following piece of Python code do?
# Add up the squares of all the differences
sum_of_squares=sum([pow(prefs[person1][item]-prefs[person2][item],2)
                    for item in prefs[person1] if item in prefs[person2]])
I am trying to play with the examples in Java.
prefs is a map of person to movie ratings; movie ratings is another map of names to ratings.
First it constructs a list containing the results from:
for each item in prefs for person1:
    if that is also an item in the prefs for person2:
        find the difference between the two people's ratings for that item
        and square it (Math.pow(x,2) is "x squared")
Then it adds those up.
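Spelled out as an ordinary loop, the steps above might look like the sketch below. The names Alice and Bob and their ratings are made-up sample data, not anything from the book.

```python
# Hypothetical sample data: each person maps movie titles to ratings.
prefs = {
    "Alice": {"Ratatouille": 5.0, "Up": 3.0, "Cars": 4.0},
    "Bob": {"Ratatouille": 2.0, "Up": 4.0},
}
person1, person2 = "Alice", "Bob"

squared_differences = []
for item in prefs[person1]:        # every movie person1 rated
    if item in prefs[person2]:     # keep only movies person2 also rated
        difference = prefs[person1][item] - prefs[person2][item]
        squared_differences.append(pow(difference, 2))

sum_of_squares = sum(squared_differences)
print(sum_of_squares)  # 10.0 (9.0 for Ratatouille + 1.0 for Up; Cars is skipped)
```

Iterating over a dict yields its keys, which is why item is a movie title here.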
This might be a little more readable if the call to pow were replaced with an explicit use of the '**' exponentiation operator:
sum_of_squares=sum([(prefs[person1][item]-prefs[person2][item])**2
                    for item in prefs[person1] if item in prefs[person2]])
Lifting out some invariants also helps readability:
p1_prefs = prefs[person1]
p2_prefs = prefs[person2]
sum_of_squares=sum([(p1_prefs[item]-p2_prefs[item])**2
                    for item in p1_prefs if item in p2_prefs])
Finally, in recent versions of Python (2.4 and later), the list comprehension notation is unnecessary. sum will accept a generator expression, so the []'s can also be removed:
sum_of_squares=sum((p1_prefs[item]-p2_prefs[item])**2
                   for item in p1_prefs if item in p2_prefs)
Seems a bit more straightforward now.
Ironically, in pursuit of readability, we have also done some performance optimization (two endeavors that are usually mutually exclusive):
- lifted invariants out of the loop
- replaced the function call pow with inline evaluation of '**' operator
- removed unnecessary construction of a list
Is this a great language or what?!
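As a rough check that the rewrite did not change the result, something like the following could be run. The sample data and the function names original and rewritten are invented for illustration, and timings depend on the machine and Python version, so none are quoted.

```python
import timeit

# Hypothetical sample data, not taken from the book.
prefs = {
    "Alice": {"Ratatouille": 5.0, "Up": 3.0, "Cars": 4.0},
    "Bob": {"Ratatouille": 2.0, "Up": 4.0},
}
person1, person2 = "Alice", "Bob"

def original():
    # The expression from the question: pow() inside a list comprehension.
    return sum([pow(prefs[person1][item] - prefs[person2][item], 2)
                for item in prefs[person1] if item in prefs[person2]])

def rewritten():
    # Lifted lookups, '**' operator, generator expression.
    p1_prefs = prefs[person1]
    p2_prefs = prefs[person2]
    return sum((p1_prefs[item] - p2_prefs[item]) ** 2
               for item in p1_prefs if item in p2_prefs)

# Both versions agree on the sample data.
assert original() == rewritten() == 10.0

# Print timings for comparison; the numbers will differ between runs.
for func in (original, rewritten):
    print(func.__name__, timeit.timeit(func, number=100_000))
```

With only two shared movies the difference is likely to be small; larger dictionaries would give a fairer comparison.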
01 sum_of_squares =
02 sum(
03 [
04 pow(
05 prefs[person1][item]-prefs[person2][item],
06 2
07 )
08 for
09 item
10 in
11 prefs[person1]
12 if
13 item in prefs[person2]
14 ]
15 )
Sum (line 2) a list consisting of the values computed on lines 4-7, one for each 'item' in the dictionary named on line 11 for which the condition on line 13 holds true.
It computes the sum of the squares of the differences between prefs[person1][item] and prefs[person2][item], for every item in the prefs dictionary for person1 that is also in the prefs dictionary for person2.
In other words, say both person1 and person2 have a rating for the film Ratatouille, with person1 rating it 5 stars and person2 rating it 2 stars.
prefs[person1]['Ratatouille'] = 5
prefs[person2]['Ratatouille'] = 2
The square of the difference between person1's rating and person2's rating is 3^2 = 9.
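The arithmetic can be checked directly, as in the small sketch below, which uses the two ratings from the example and the strings "person1" and "person2" as placeholder keys. One thing to watch: 3^2 above is mathematical notation, while in Python the ^ operator is bitwise XOR, so the code has to use ** or pow.

```python
# Ratings taken from the example above; the person keys are placeholders.
prefs = {
    "person1": {"Ratatouille": 5},
    "person2": {"Ratatouille": 2},
}

difference = prefs["person1"]["Ratatouille"] - prefs["person2"]["Ratatouille"]
print(difference ** 2)  # 9
print(difference ^ 2)   # 1, because ^ is bitwise XOR in Python, not a power
```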
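It's probably a step toward some kind of variance or distance measure: a sum of squared differences is also the squared Euclidean distance between the two people's ratings.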