Reshape error - invalid factor

2023-01-16 11:59 Q&A author:

I am somewhat new to R and I have run into a point where I need some help. I figure the reshape package can accomplish what I need to do.

Here is the structure of the original data frame:

> str(bruins)
'data.frame':   10 obs. of  6 variables:
 $ gameid  : Factor w/ 1 level "20090049": 1 1 1 1 1 1 1 1 1 1
 $ team    : chr  "NYI" "BOS" "NYI" "BOS" ...
 $ home_ind: chr  "V" "H" "V" "H" ...
 $ period  : Factor w/ 5 levels "1","2","3","4",..: 1 1 2 2 3 3 4 4 5 5
 $ goals   : int  0 0 3 0 0 3 0 0 3 3
 $ shots   : int  16 7 9 7 8 12 5 4 38 30

Here are the first few rows:

> head(bruins)
      gameid team home_ind period goals shots
409 20090049  NYI        V      1     0    16
410 20090049  BOS        H      1     0     7
411 20090049  NYI        V      2     3     9
412 20090049  BOS        H      2     0     7
413 20090049  NYI        V      3     0     8
414 20090049  BOS        H      3     3    12

I am looking to create a new data frame that pivots on gameid and period, with the rest of the columns summarizing the data for each home_ind row (10 columns in all).

When I run the following code:

b.melt <- melt(bruins, id=c("gameid", "period"), na.rm=TRUE)

I get the following warnings:

Warning messages:
1: In `[<-.factor`(`*tmp*`, ri, value = c(0L, 0L, 3L, 0L, 0L, 3L, 0L,  :
  invalid factor level, NAs generated
2: In `[<-.factor`(`*tmp*`, ri, value = c(16L, 7L, 9L, 7L, 8L, 12L,  :
  invalid factor level, NAs generated

Any help will be very much appreciated!

Edit: This is what I am hoping to get the restructured data to look like

    gameid period vis_team vis_goals vis_shots home_team home_goals home_shots
1 20090049      1      NYI         0        16       BOS          0          7
2 20090049      2      NYI         3         9       BOS          0          7
3 20090049      3      NYI         0         8       BOS          3         12

After melting, all measure variables end up stacked in a single value column, so they all have to be of the same type. In your case, "team" is character and "goals" is numeric, which is why you get that warning.
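To see where the message comes from, here is a minimal base R sketch that does not use reshape at all. It assumes the character column ends up being treated as a factor inside the value column (which is what the `[<-.factor` in the warning suggests); the variable names are made up for illustration.

```r
# Toy reproduction in base R (no reshape needed).
# A factor only accepts values that are already among its levels.
value <- factor(c("NYI", "BOS"))    # stands in for the `team` values

msg <- NULL
value <- withCallingHandlers(
  {
    value[3:4] <- c(0L, 3L)         # stands in for `goals`: not valid levels
    value
  },
  warning = function(w) {
    msg <<- conditionMessage(w)     # remember the warning text
    invokeRestart("muffleWarning")  # keep going without printing it
  }
)

print(msg)     # the "invalid factor level" message
print(value)   # the integer entries have become NA
```

So one way around it is to melt only measure variables of one type (for example just goals and shots) and keep team and home_ind as id variables.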

Now I see what you're trying to do, here's an approach using summarise from plyr:

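library(plyr)

# `per` is the data frame being reshaped; it has the columns shown in the question.
# Visiting-team rows (home_ind == "V"), with columns renamed to vis_*
vis <- summarise(subset(per, home_ind == "V"),
  gameid = gameid, period = period,
  vis_team = team, vis_goals = goals, vis_shots = shots)

# Home-team rows (home_ind == "H"), with columns renamed to home_*
home <- summarise(subset(per, home_ind == "H"),
  gameid = gameid, period = period,
  home_team = team, home_goals = goals, home_shots = shots)

# join() matches on the shared columns, here gameid and period
join(vis, home)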

There are also a number of ways to do it using just base functions (e.g. by subsetting and then modifying the names).
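For the base-functions route, something along these lines should work. It is only a sketch: the sample data is rebuilt from the six rows shown by head(bruins) in the question, and the helper names (keys, measures, wide) are made up here.

```r
# Sample rows copied from head(bruins) in the question.
bruins <- data.frame(
  gameid   = factor(rep("20090049", 6)),
  team     = rep(c("NYI", "BOS"), 3),
  home_ind = rep(c("V", "H"), 3),
  period   = factor(rep(1:3, each = 2)),
  goals    = c(0L, 0L, 3L, 0L, 0L, 3L),
  shots    = c(16L, 7L, 9L, 7L, 8L, 12L),
  stringsAsFactors = FALSE
)

keys     <- c("gameid", "period")        # columns to pivot on
measures <- c("team", "goals", "shots")  # columns to spread per side

# Visiting-team rows, measure columns prefixed with vis_
vis <- bruins[bruins$home_ind == "V", c(keys, measures)]
names(vis)[match(measures, names(vis))] <- paste0("vis_", measures)

# Home-team rows, measure columns prefixed with home_
home <- bruins[bruins$home_ind == "H", c(keys, measures)]
names(home)[match(measures, names(home))] <- paste0("home_", measures)

# One row per gameid/period with both sides next to each other
wide <- merge(vis, home, by = keys)
print(wide)
```

Because nothing gets stacked into a single column, the mixed character/integer types never collide.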

I think you'd be better off using ddply from the plyr package for this problem. You didn't say how you want to summarise the data, but check out the summarise function if you want to use a different summary for each variable, or the colwise function if you want to summarise all variables the same way.
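As a rough sketch of the ddply plus summarise idea, assuming every gameid/period group has exactly one "V" row and one "H" row (true for the sample rows in the question, not checked here), and again rebuilding the data from head(bruins):

```r
library(plyr)

# Sample rows copied from head(bruins) in the question.
bruins <- data.frame(
  gameid   = factor(rep("20090049", 6)),
  team     = rep(c("NYI", "BOS"), 3),
  home_ind = rep(c("V", "H"), 3),
  period   = factor(rep(1:3, each = 2)),
  goals    = c(0L, 0L, 3L, 0L, 0L, 3L),
  shots    = c(16L, 7L, 9L, 7L, 8L, 12L),
  stringsAsFactors = FALSE
)

# Split by gameid and period, then pick the visitor and home values
# out of each two-row piece.
wide <- ddply(bruins, c("gameid", "period"), summarise,
  vis_team   = team[home_ind == "V"],
  vis_goals  = goals[home_ind == "V"],
  vis_shots  = shots[home_ind == "V"],
  home_team  = team[home_ind == "H"],
  home_goals = goals[home_ind == "H"],
  home_shots = shots[home_ind == "H"]
)
print(wide)
```

If a group could contain more than one row per side, you would replace the plain subsetting with a real summary such as sum() for goals and shots.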

Thanks for the help. I ended up going a different route and broke the problem into little pieces. I am sure there is a quicker, more elegant way, but I got to where I needed to be and wanted to share the code in case it helps someone else.

## load libraries
library(sqldf)

## assume that the dataset is loaded as `per`
## restructure the data and merge together
## (sep = " " keeps the SELECT list and the FROM clause apart)
sql.1 <- "SELECT gameid, period, team `vis_team`, goals `vis_goals`, shots `vis_shots`"
sql.2 <- "FROM per WHERE home_ind='V' GROUP BY gameid, period"
sql.cmd <- paste(sql.1, sql.2, sep = " ")
vis <- sqldf(sql.cmd)

sql.1 <- "SELECT gameid, period, team `home_team`, goals `home_goals`, shots `home_shots`"
sql.2 <- "FROM per WHERE home_ind='H' GROUP BY gameid, period"
sql.cmd <- paste(sql.1, sql.2, sep = " ")
home <- sqldf(sql.cmd)

## merge() joins on the shared columns, gameid and period
my.dataset <- merge(vis, home)
