Difficult Problem in R. Split Character String of Dataset, but Maintain Information in other Columns
I am using R. Take the dataset I created below as an example. I want to split ip by "." while keeping the original row information in color and status. I recognize that this will create a longer dataset, where the entries for color and status repeat themselves.
a <- data.frame(color=c("yellow","red","blue","red"),
status=c("no","yes","yes","no"),
ip=c("162.131.58.26","2.131.58.16","2.2.58.10","162.131.58.17"))
Unclear whether the OP wanted new rows or columns, so here's both:
columns:
library(reshape)
a <- data.frame(a, colsplit(a$ip, split = "\\.", names = c("foo", "bar", "baz", "phi")))
or rows (after adding the columns above)
a.m <- melt(a, id.vars = c("color", "status", "ip"))
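# base R alternative, starting from the original a: split each ip into four numeric columns, one row per ip
# (byrow = TRUE with ncol = 4 works for any number of rows, not just a 4-row data frame)
a <- cbind(a[,1:2], matrix(as.numeric(unlist(strsplit(as.character(a[,3]), "\\."))), ncol = 4, byrow = TRUE))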
Not sure if this is what you want, and I'm sure that there is a nicer looking way to do it even if it is what you want.
# give a an id to match cases
a$id <- 1:nrow(a)
# split the ip address and store in datab
datab <- unlist(strsplit(as.character(a$ip),"\\."))
# put the parts of the ip address against the correct ids in a new dataframe
datac <- data.frame(id=rep(a$id, each=4), ip=datab)  # each ip has 4 parts, so repeat every id 4 times
# merge the data together, remove unwanted variables, correct column name
final <- merge(datac,a,by="id")
final <- final[c("ip.x","color","status")]
colnames(final)[1] <- "ip"
This will give you each part of the ip address on a new line with the color and status variables repeating. I hope this is what you were after. Otherwise, the previous answer looks like a good one to have the ip data go into columns instead of rows.
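For what it's worth, the same long layout can probably be built without the id and merge round trip. Here is a minimal base R sketch, assuming the example data from the question; the names parts, n and long are mine and not from the original post, and I set stringsAsFactors = FALSE so the columns stay character regardless of R version.

```r
# Same example data as in the question, rebuilt here so the sketch is self-contained
a <- data.frame(color  = c("yellow", "red", "blue", "red"),
                status = c("no", "yes", "yes", "no"),
                ip     = c("162.131.58.26", "2.131.58.16", "2.2.58.10", "162.131.58.17"),
                stringsAsFactors = FALSE)

# Split each ip on a literal dot; the result is a list with one character vector per row
parts <- strsplit(a$ip, ".", fixed = TRUE)

# Number of pieces per row (4 for these addresses, but nothing below relies on that)
n <- lengths(parts)

# Repeat color and status once per piece and flatten the pieces into a single column
long <- data.frame(color  = rep(a$color, times = n),
                   status = rep(a$status, times = n),
                   ip     = unlist(parts),
                   stringsAsFactors = FALSE)

head(long, 4)
```

Because the repeat counts come from lengths(parts), this should also cope with rows whose strings split into a different number of pieces, which the merge version above does not.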