Building a pairwise matrix in scipy/numpy in Python from dictionaries
I have a dictionary whose keys are strings and values are numpy arrays, e.g.:
data = {'a': array([1,2,3]), 'b': array([4,5,6]), 'c': array([7,8,9])}
I want to compute a statistic between all pairs of values in 'data' and build an n by n matrix that stores the result. Assume that I know the order of the keys, i.e. I have a list of "labels":
labels = ['a', 'b', 'c']
What's the most efficient way to compute this matrix?
I can compute the statistic for all pairs like this:
import itertools

result = []
for elt1, elt2 in itertools.product(labels, labels):
    result.append(compute_statistic(data[elt1], data[elt2]))
But I want result to be an n by n matrix, with rows and columns both ordered by "labels". How can I record the results in such a matrix? Thanks.
You could use a nested loop, or a list comprehension like:
result = [[compute_statistic(data[row], data[col]) for col in labels]
          for row in labels]
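Convert the result list into an array. The nested comprehension already yields n rows of n values, so the array comes out n by n; setting the shape explicitly only matters if result is a flat list, as in the loop from the question.
from numpy import array
myMatrix = array(result) # or use matrix(result)
myMatrix.shape = (len(labels), len(labels))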
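The steps above can be put together in a minimal runnable sketch. The question does not say what the statistic is, so compute_statistic below is a hypothetical placeholder (a dot product, chosen only so the example runs); the names my_matrix and my_matrix_from_flat are likewise just for illustration.

```python
import itertools
import numpy as np

data = {'a': np.array([1, 2, 3]), 'b': np.array([4, 5, 6]), 'c': np.array([7, 8, 9])}
labels = ['a', 'b', 'c']

def compute_statistic(x, y):
    # Hypothetical placeholder: the real statistic is not given in the question.
    return float(np.dot(x, y))

# Nested comprehension: one inner list per row label, one entry per column label.
nested = [[compute_statistic(data[row], data[col]) for col in labels]
          for row in labels]
my_matrix = np.array(nested)  # already has shape (n, n)

# Flat list in the order produced by itertools.product (row-major),
# reshaped afterwards into the same n by n layout.
flat = [compute_statistic(data[r], data[c])
        for r, c in itertools.product(labels, labels)]
n = len(labels)
my_matrix_from_flat = np.array(flat).reshape(n, n)
```

Both routes should give the same array, because itertools.product(labels, labels) walks the pairs row by row, which matches the order reshape expects.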
If you want to index the matrix with the labels you could do
myMatrix[labels.index('a'), labels.index('b')]
This gets the value for the pair ('a', 'b'). If you intend to do this often, it would be better to store the indexes in a dictionary, since labels.index searches the list on every call:
labelsIndex = {'a' : 0, 'b' : 1, 'c' : 2 }
myMatrix[labelsIndex['a'], labelsIndex['b']]
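Rather than typing the dictionary by hand, it could be derived from the labels list. The following is only a sketch: labels_index, lookup and the placeholder matrix are illustrative names and values, not part of the original code.

```python
import numpy as np

labels = ['a', 'b', 'c']

# Map each label to its row/column position, following the order of `labels`.
labels_index = {label: i for i, label in enumerate(labels)}

# Placeholder matrix standing in for the computed statistics.
my_matrix = np.arange(9).reshape(3, 3)

def lookup(matrix, row_label, col_label):
    # Hypothetical helper: translate labels to positions, then index the array.
    return matrix[labels_index[row_label], labels_index[col_label]]

value_ab = lookup(my_matrix, 'a', 'b')
```

Building the mapping with enumerate keeps it in step with labels if the list ever changes.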
Hope this helps.