Changing text in a data frame

I am working with a data frame in which I need to edit the entries in one particular column to allow for easy sorting. The data frame looks like this when imported:

     Assay    Genotype  Description     Sample    Operator
1    CCT6-18  C         A.Conservative  1_062911  Automatic
2    CCT6-24  C         E.User Call     1_062911  charles
3    CCT6-25  A         A.Conservative  1_062911  Automatic

I need to change entries in the Assay column from CCT6-18 to CCT6-018. This assay appears multiple times within the data frame, and I'd like to change all of the entries at once. I've tried the gsub function, but it returns data in a format that I am unfamiliar with. I'd like to get the data back in a data frame.

Help!


df$Assay <- replace(df$Assay, df$Assay=="CCT6-18", "CCT6-018")

Should see you right.

Also, try str(df) or class(df$Assay) to see what class your Assay column is. If it is a factor, this could be the reason you're getting tripped up: replace() cannot assign a value that is not already one of the factor's levels. If so, run df$Assay <- as.character(df$Assay) first.
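A minimal sketch of that check and conversion, using a made-up three-row data frame. The values are assumptions loosely based on the sample in the question, and `stringsAsFactors = TRUE` is set only to mimic an import where text columns arrive as factors:

```r
# Hypothetical data standing in for the imported file
df <- data.frame(Assay = c("CCT6-18", "CCT6-24", "CCT6-18"),
                 Genotype = c("C", "C", "A"),
                 stringsAsFactors = TRUE)

class(df$Assay)   # "factor"

# Convert to character first, then replace every matching entry
df$Assay <- as.character(df$Assay)
df$Assay <- replace(df$Assay, df$Assay == "CCT6-18", "CCT6-018")

df$Assay          # both CCT6-18 rows are now CCT6-018
class(df)         # "data.frame"
```

Because the result is assigned back into df$Assay, the data stays in the data frame rather than coming back as a loose vector.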


It depends on whether you want to change the other entries in Assay as well. An easy way is just to add a 0 after the dash:

df$Assay <- gsub('-', '-0', df$Assay)

A regexp solution would be something along the lines of:

df$Assay <- gsub('(\\d\\d)','0\\1', df$Assay)

This replaces every pair of consecutive digits with a 0 followed by those same two digits. You have to be careful with regexps because you need to know your data well to be sure that you don't change anything incorrectly. For example, if you have CCT62-18 as an entry in Assay, then you would not want to use this regexp because it would also change the 62 to 062, giving CCT062-018.
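One way to reduce that risk is to anchor the pattern to the dash and the end of the string, so only the trailing number is padded. This is a sketch under the assumption that every assay name ends in a dash followed by digits; the vector `assay` and its values are made up for illustration:

```r
# Hypothetical assay names, including one with two digits before the dash
assay <- c("CCT6-18", "CCT6-2", "CCT62-18", "CCT6-125")

# Pad only a trailing one- or two-digit number to three digits.
# "-(\\d)$"    matches a dash followed by exactly one digit at the end
# "-(\\d\\d)$" matches a dash followed by exactly two digits at the end
padded <- sub("-(\\d)$", "-00\\1", assay)
padded <- sub("-(\\d\\d)$", "-0\\1", padded)

padded
# "CCT6-018" "CCT6-002" "CCT62-018" "CCT6-125"
```

Digits before the dash are left alone, and numbers that already have three digits are not touched.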


I would go about it by replacing the factor level.

sam <- data.frame(assay = c("CCT6-18", "CCT6-23", "CCT6-25"),
    genetype = sample(letters, 3), operator = runif(3), sample = runif(3),
    stringsAsFactors = TRUE)  # needed in R >= 4.0, where strings are not factors by default
str(sam)
  'data.frame': 3 obs. of  4 variables:
   $ assay   : Factor w/ 3 levels "CCT6-18","CCT6-23",..: 1 2 3
   $ genetype: Factor w/ 3 levels "f","u","w": 1 2 3
   $ operator: num  0.595 0.912 0.76
   $ sample  : num  0.525 0.626 0.377
levels(sam$assay)[1] <- "CCT6-018"
sam
       assay genetype  operator    sample
   1 CCT6-018        f 0.5950434 0.5249502
   2  CCT6-23        u 0.9123185 0.6257186
   3  CCT6-25        w 0.7595744 0.3769029
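Indexing the level by position works when you know where it sits in levels(). A variation that looks the level up by name may be safer when the order is not known in advance. The sketch below uses a made-up data frame in which the assay appears twice:

```r
# Hypothetical data; the assay to rename appears in rows 1 and 4
sam <- data.frame(assay = c("CCT6-18", "CCT6-23", "CCT6-25", "CCT6-18"),
                  stringsAsFactors = TRUE)

# Rename the level by matching its name rather than its position
levels(sam$assay)[levels(sam$assay) == "CCT6-18"] <- "CCT6-018"

as.character(sam$assay)
# "CCT6-018" "CCT6-23" "CCT6-25" "CCT6-018"
```

Renaming a level changes every row that uses it in one step, and the column remains a factor.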