What is the efficient way to convert a Pandas DataFrame to a PyTorch TensorDataset?

Asked 2022-12-07 20:02

I want to convert this Pandas DataFrame to a TensorDataset:

import pandas as pd
import torch
df = pd.DataFrame({'A': [[1, 2, 3], [1, 2, 3], [1, 2, 3]], 'B': [0, 1, 0]})

I figured out I can do it this way without getting an error.

A = torch.tensor(df['A'].values.tolist())
B = torch.tensor(df['B'].values)
dataset = torch.utils.data.TensorDataset(A, B)

However, I get this warning:

UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor.

When I try it this way:

data_numpy = df.to_numpy()
data_tensor = torch.from_numpy(data_numpy)
dataset = torch.utils.data.TensorDataset(data_tensor)

I get the error:

can't convert np.ndarray of type numpy.object_

So my question is: what is an efficient way to convert a Pandas DataFrame with this structure to a TensorDataset?

Code:

import pandas as pd
import torch


def get_device() -> torch.device:
    # Use the first GPU when one is available, otherwise fall back to the CPU.
    return torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")


def df_to_tensor(x: pd.DataFrame) -> torch.Tensor:
    # Works only when every column has a numeric dtype (not object).
    return torch.from_numpy(x.values).to(get_device())


df = pd.DataFrame({"spam": [1, 2, 3, 4], "eggs": [5, 6, 7, 8], "ham": [9, 10, 11, 12]})
tensor = df_to_tensor(df)
print(tensor)

Output:

tensor([[ 1,  5,  9],
        [ 2,  6, 10],
        [ 3,  7, 11],
        [ 4,  8, 12]])
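
The function above only handles purely numeric columns. For the DataFrame in the question, column A holds lists, so df.to_numpy() produces an object array that torch.from_numpy rejects. One possible approach, sketched here under the assumption that every list in A has the same length, is to build a single 2-D NumPy array from that column first and convert each column separately. The names features and labels are illustrative only.

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import TensorDataset

df = pd.DataFrame({'A': [[1, 2, 3], [1, 2, 3], [1, 2, 3]], 'B': [0, 1, 0]})

# Collapse the list-valued column into one contiguous 2-D array.
# Assumption: all rows of 'A' have the same length; ragged rows would fail here.
features = np.array(df['A'].tolist(), dtype=np.int64)

# The scalar column converts directly; copy=True guarantees a writable array.
labels = df['B'].to_numpy(dtype=np.int64, copy=True)

# from_numpy shares memory with the NumPy arrays instead of copying them.
A = torch.from_numpy(features)
B = torch.from_numpy(labels)

dataset = TensorDataset(A, B)
```

Each item of dataset is then a pair (row of A, matching entry of B), and the dataset can be passed to torch.utils.data.DataLoader as usual. Converting through one np.array call should also avoid the "list of numpy.ndarrays" warning, since the tensor is created from a single ndarray.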
