How to rank the original features after feature selection on one-hot encoded data?

Consider a dataset with categorical (nominal) features and one numerical output variable. Feature selection methods such as InfoGain, Pearson correlation, or wrapper methods only accept numerical inputs, so I have to one-hot encode the non-ordinal data, which produces a lot of dummy features.

If I apply feature selection in Python and get a ranking of the dummy features, how do I recover a ranking of the original features (as they were before one-hot encoding)?

For example, suppose there are 3 features with categories A(1, 2, 3), B(1, 2, 3), and C(1, 2, 3), and the result of Pearson (with SelectKBest) on the dummy features is:

  1. B2
  2. A1
  3. B1
  4. C3
  5. A3

Is it then correct to say that the ranking of the original features is:

  1. B
  2. A
  3. C
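To make the collapse I have in mind concrete, here is a minimal sketch. It assumes each original feature takes the rank of its best-placed dummy, and that the original feature name is the first character of the dummy name (as in `B2` -> `B`). The helper `collapse_ranking` and the `parent_of` callback are hypothetical names for illustration, not part of scikit-learn, and whether this heuristic is statistically sound is exactly what I am unsure about.

```python
def collapse_ranking(dummy_ranking, parent_of):
    """Order original features by the earliest rank of any of their dummies.

    dummy_ranking: dummy feature names, best first.
    parent_of: function mapping a dummy name to its original feature name.
    """
    parents = []
    for name in dummy_ranking:
        parent = parent_of(name)
        # Keep only the first (best-ranked) appearance of each original feature.
        if parent not in parents:
            parents.append(parent)
    return parents


# Dummy-level ranking from the example above.
dummy_ranking = ["B2", "A1", "B1", "C3", "A3"]

# Assumption: the original feature is the first character of the dummy name.
parent_ranking = collapse_ranking(dummy_ranking, parent_of=lambda name: name[0])
print(parent_ranking)  # ['B', 'A', 'C']
```

Other aggregations (for example the mean rank or the summed score of a feature's dummies) would slot into the same place and may order the features differently.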

Since I have to rank features with at least 6 feature selection methods, I would really appreciate any guidance on nominal feature selection and ranking.

I did some research, but most resources either use ordinal encoding or do not show an implementation on purely nominal input. Other encoding methods such as Base N encoding, hash encoding, or dummy encoding also seem to create several features out of one nominal variable.
