How to calculate the columnwise minimum of a dask pivot table?

I would like to create a pivot table in dask and then calculate the column-wise minimum.

import dask.dataframe as dd
from dask.distributed import Client

client = Client()

df = dd.read_csv("data.csv")

# To use pivot_table, the columns used as index and columns need to be categorical:
df = df.categorize(columns=['A', 'B'])

#df['A'] = df['A'].cat.as_ordered()
#df['B'] = df['B'].cat.as_ordered()

pt = df.pivot_table(index='A', columns='B', values='C', aggfunc='mean')

pt.min().compute()

TypeError: Categorical is not ordered for operation min you can use .as_ordered() to change the Categorical to an ordered one

...

# Trying to uncategorize the index instead takes forever
pt.index = list(pt.index)
pt.min().compute()

Is there a better way to achieve this?
