
Grouping categorical variables in R

Suppose I have a variable named 'Fever' with four categories: mild, moderate, severe and very severe. I want to combine mild and moderate into one group, and severe and very severe into another. How can I do this in R?

Any suggestions would be appreciated.


This can also be done using base R:

## If starting from a character vector
fever_vec <- c("mild", "moderate", "severe", "very severe")
fever_fact <- factor(fever_vec,
                     levels = c("mild", "moderate", "severe", "very severe"),
                     labels = c("mild/moderate", "mild/moderate",
                                "severe/very severe", "severe/very severe"))

## If starting from an existing factor
fever_already_fact <- factor(c("mild", "moderate", "severe", "very severe"))
levels(fever_already_fact) <- list("mild/moderate" = c("mild", "moderate"),
                                   "severe/very severe" = c("severe", "very severe"))

Note that the first variant passes duplicated labels to factor(), which only works in R >= 3.5.0.
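
To confirm that the levels were collapsed as intended, a cross-tabulation of the original values against the regrouped ones is a quick check. The following is a minimal sketch using only base R; the sample data and the names fever_raw, fever_grouped and check are illustrative and not part of the answer above.

```r
# Hypothetical sample data, stored as a factor with all four levels
fever_raw <- factor(c("mild", "severe", "moderate", "very severe", "mild", "severe"),
                    levels = c("mild", "moderate", "severe", "very severe"))

# Work on a copy so the original factor stays available for comparison
fever_grouped <- fever_raw
levels(fever_grouped) <- list("mild/moderate" = c("mild", "moderate"),
                              "severe/very severe" = c("severe", "very severe"))

# Each original level should land in exactly one collapsed level
check <- table(original = fever_raw, grouped = fever_grouped)
print(check)
```

Every row of the table should have a non-zero count in exactly one column; a level that was left out of the list would show up as NA in fever_grouped and be dropped from the table.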


I think you're looking for something like this:

library(tidyverse)
df <- tibble(fever = c("mild", "moderate", "severe", "very severe"))
# highfever is 0 for mild/moderate and 1 for severe/very severe
newdf <- mutate(df, highfever = case_when(fever %in% c("mild", "moderate") ~ 0,
                                          fever %in% c("severe", "very severe") ~ 1))
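
The same indicator can also be built without the tidyverse. Below is a minimal base R sketch, assuming a plain data frame with a fever column like the one above; the names high and fever_group are illustrative. One difference to keep in mind: here any value outside the two "high" categories (including NA) becomes 0, whereas case_when() returns NA for values that match no condition.

```r
df <- data.frame(fever = c("mild", "moderate", "severe", "very severe"),
                 stringsAsFactors = FALSE)

# Categories that count as "high" fever (illustrative helper vector)
high <- c("severe", "very severe")

# 0/1 indicator: 1 if the value is in `high`, 0 otherwise
df$highfever <- as.integer(df$fever %in% high)

# Labelled version of the same two-group split
df$fever_group <- factor(df$highfever, levels = c(0, 1),
                         labels = c("mild/moderate", "severe/very severe"))
print(df)
```

The numeric indicator is convenient for modelling, while the labelled factor tends to read better in tables and plots.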


Vectors of this type are normally stored as factors.

library(forcats)

First, create a vector of fevers:


fever_lvl <- c("mild", "moderate", "severe", "very severe")
set.seed(1)
fevers <- factor(sample(fever_lvl, 10, replace = TRUE), levels = fever_lvl)

> fevers
 [1] mild        very severe severe      mild        moderate    mild        severe     
 [8] severe      moderate    moderate   
Levels: mild moderate severe very severe

Then regroup as desired:


fevers_regrouped <- fct_recode(fevers,
                               mild_or_moderate = "mild",
                               mild_or_moderate = "moderate",
                               severe_or_higher = "severe",
                               severe_or_higher = "very severe")

> fevers_regrouped
 [1] mild_or_moderate severe_or_higher severe_or_higher mild_or_moderate mild_or_moderate
 [6] mild_or_moderate severe_or_higher severe_or_higher mild_or_moderate mild_or_moderate
Levels: mild_or_moderate severe_or_higher

Alternatively, use fct_collapse, which takes a vector of old levels for each new level:

fevers_regrouped2 <- fct_collapse(fevers, mild_or_mod = c("mild", "moderate"),
                                  severe_or_up = c("severe", "very severe"))
> fevers_regrouped2
 [1] mild_or_mod  severe_or_up severe_or_up mild_or_mod  mild_or_mod  mild_or_mod  severe_or_up severe_or_up
 [9] mild_or_mod  mild_or_mod 
Levels: mild_or_mod severe_or_up
