Changing text in a data frame
I am working with a data frame in which I need to edit the entries in one particular column to allow for easy sorting. The data frame looks like this when imported:
    Assay Genotype    Description   Sample  Operator
1 CCT6-18        C A.Conservative 1_062911 Automatic
2 CCT6-24        C    E.User Call 1_062911   charles
3 CCT6-25        A A.Conservative 1_062911 Automatic
I need to change the Assay column from CCT6-18 to CCT6-018. This assay appears multiple times within the data frame, and I'd like to change all of the entries at once. I've tried the gsub function, but it returns data in a format that I am unfamiliar with. I'd like to get the data back in a data frame.
Help!
df$Assay <- replace(df$Assay, df$Assay=="CCT6-18", "CCT6-018")
Should see you right.
Also, try str(df) or class(df$Assay) to see what class your Assay column is. If it is a factor, this could be the reason you're getting tripped up: assigning a value that is not already one of the factor's levels gives NA with a warning. If so, run df$Assay <- as.character(df$Assay) first.
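To make that concrete, here is a minimal sketch of the check-then-replace sequence. The small df below is a made-up stand-in for the imported data, and stringsAsFactors = TRUE is set only to reproduce the factor case.

```r
# Hypothetical stand-in for the imported data frame
df <- data.frame(
  Assay    = c("CCT6-18", "CCT6-24", "CCT6-18"),
  Genotype = c("C", "C", "A"),
  stringsAsFactors = TRUE
)

class(df$Assay)  # check whether the column is a factor

# Convert to character so that a new value can be assigned freely
df$Assay <- as.character(df$Assay)

# Replace every exact match of the old label
df$Assay <- replace(df$Assay, df$Assay == "CCT6-18", "CCT6-018")
df
```

Only rows whose Assay is exactly "CCT6-18" are touched; everything else is left alone.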
It depends on whether you want to change the other entries in Assay as well. An easy way is just to add a 0 after the dash:
df$Assay <- gsub('-', '-0', df$Assay)
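On the "unfamiliar format" from the question: gsub returns a plain character vector, not a data frame, so the result has to be assigned back into the column. A small sketch, using a made-up df in place of the real data:

```r
# Hypothetical stand-in for the imported data frame
df <- data.frame(
  Assay    = c("CCT6-18", "CCT6-24", "CCT6-25"),
  Genotype = c("C", "C", "A"),
  stringsAsFactors = TRUE
)

# gsub() coerces its input to character and returns a character vector
padded <- gsub("-", "-0", df$Assay)
class(padded)

# Assigning the vector back into the column keeps df a data frame
df$Assay <- padded
is.data.frame(df)
df
```

Note that the column comes back as character rather than factor; wrap the result in factor() if a factor is needed.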
A regexp solution would be something along the lines of:
df$Assay <- gsub('(\\d\\d)','0\\1', df$Assay)
This would replace every pair of consecutive digits by a 0 followed by those same two digits. You have to be careful with regexps because you have to know your data well in order to be sure that you don't change anything incorrectly. For example, if you have CCT62-18 as an entry in Assay, then you would not want to use this regexp because it would also change the 62 to 062.
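One way to reduce that risk, offered as a sketch rather than a definitive fix, is to anchor the pattern so it only matches a dash followed by exactly two digits at the end of the string. The assay vector below is made up to cover the awkward cases.

```r
# Made-up values, including the problematic CCT62-18 case
assay <- c("CCT6-18", "CCT62-18", "CCT6-018", "CCT6-5")

# "-(\\d{2})$" matches a dash plus exactly two digits at the end of the string,
# so digits before the dash and already-padded values are left alone
fixed <- sub("-(\\d{2})$", "-0\\1", assay)
fixed
# [1] "CCT6-018"  "CCT62-018" "CCT6-018"  "CCT6-5"
```

This still assumes every assay name ends in a dash and a number; single-digit suffixes such as CCT6-5 are left unchanged here.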
I would go about this by replacing the factor level.
sam <- data.frame(assay = c("CCT6-18", "CCT6-23", "CCT6-25"),
                  genetype = sample(letters, 3), operator = runif(3), sample = runif(3),
                  stringsAsFactors = TRUE)  # needed in R >= 4.0 to get factor columns
str(sam)
'data.frame': 3 obs. of 4 variables:
$ assay : Factor w/ 3 levels "CCT6-18","CCT6-23",..: 1 2 3
$ genetype: Factor w/ 3 levels "f","u","w": 1 2 3
$ operator: num 0.595 0.912 0.76
$ sample : num 0.525 0.626 0.377
levels(sam$assay)[1] <- "CCT6-018"
sam
assay genetype operator sample
1 CCT6-018 f 0.5950434 0.5249502
2 CCT6-23 u 0.9123185 0.6257186
3 CCT6-25 w 0.7595744 0.3769029
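Indexing the level by position works when you know where it sits in levels(). A slightly more defensive variant, sketched here with a made-up sam2 data frame, looks the level up by name instead. Because a level is shared by every row that uses it, all occurrences change at once.

```r
# Hypothetical data with a repeated assay, stored as a factor
sam2 <- data.frame(
  assay = c("CCT6-18", "CCT6-23", "CCT6-25", "CCT6-18"),
  stringsAsFactors = TRUE
)

# Rename the level by matching its label rather than its position
levels(sam2$assay)[levels(sam2$assay) == "CCT6-18"] <- "CCT6-018"

levels(sam2$assay)
sam2
```

The column stays a factor, so nothing needs to be converted back afterwards.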