
Difficult Problem in R. Split Character String of Dataset, but Maintain Information in other Columns

I am using R. Take the dataset I created below as an example. I want to split ip on "." while keeping the original row information in color and status. I recognize that this will create a longer dataset, where the entries for color and status repeat.

a <- data.frame(color = c("yellow", "red", "blue", "red"),
                status = c("no", "yes", "yes", "no"),
                ip = c("162.131.58.26", "2.131.58.16", "2.2.58.10", "162.131.58.17"))


It is unclear whether the OP wanted new rows or new columns, so here are both:

columns:

library(reshape)
a <- data.frame(a, colsplit(a$ip, split = "\\.", names = c("foo", "bar", "baz", "phi")))

or rows (after adding the columns above)

a.m <- melt(a, id.vars = c("color", "status", "ip"))


# split each ip on "." and fill a 4-column matrix row by row, one row per ip
a <- cbind(a[, 1:2],
           matrix(as.numeric(unlist(strsplit(as.character(a[, 3]), "\\."))),
                  ncol = 4, byrow = TRUE))

Not sure if this is what you want, and I'm sure that there is a nicer looking way to do it even if it is what you want.
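For comparison, here is a minimal base R sketch of the same wide layout with named columns. The column names octet1 to octet4 and the object name wide are made up for illustration, and the sketch assumes every ip has exactly four parts.

```r
# example data from the question
a <- data.frame(color = c("yellow", "red", "blue", "red"),
                status = c("no", "yes", "yes", "no"),
                ip = c("162.131.58.26", "2.131.58.16", "2.2.58.10", "162.131.58.17"),
                stringsAsFactors = FALSE)

# one character vector of parts per ip, stacked into a matrix (one row per ip)
parts <- do.call(rbind, strsplit(a$ip, ".", fixed = TRUE))
storage.mode(parts) <- "integer"          # convert the character matrix to integers
colnames(parts) <- paste0("octet", 1:4)   # hypothetical column names

# bind the new columns next to color and status
wide <- cbind(a[, c("color", "status")], parts)
```

Stacking with rbind keeps one row per ip regardless of how many rows a has, so the result does not depend on the data happening to have four rows.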


# give a an id to match cases
a$id <- 1:nrow(a)

# split the ip address and store in datab
datab <- unlist(strsplit(as.character(a$ip),"\\."))

# put the parts of the ip address against the correct ids in a new dataframe
# (each ip has 4 parts, so each id is repeated 4 times)
datac <- data.frame(id = rep(a$id, each = 4), ip = datab)

# merge the data together, remove unwanted variables, correct column name
final <- merge(datac,a,by="id")
final <- final[c("ip.x","color","status")]
colnames(final)[1] <- "ip"

This will give you each part of the ip address on its own row, with the color and status values repeated. I hope this is what you were after. Otherwise, the previous answer looks like a good way to put the ip parts into columns instead of rows.
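The same long layout can also be sketched without the id and merge steps, by repeating each row's color and status once per ip part. This is only one possible variant: the names parts, n, and long are made up here, and lengths() needs R 3.2.0 or later.

```r
# example data from the question
a <- data.frame(color = c("yellow", "red", "blue", "red"),
                status = c("no", "yes", "yes", "no"),
                ip = c("162.131.58.26", "2.131.58.16", "2.2.58.10", "162.131.58.17"),
                stringsAsFactors = FALSE)

# split every ip on a literal "."; the result is a list with one vector per row
parts <- strsplit(a$ip, ".", fixed = TRUE)
n <- lengths(parts)   # number of parts per ip (4 for each row here)

# repeat color and status once per part, and flatten the parts into one column
long <- data.frame(color = rep(a$color, times = n),
                   status = rep(a$status, times = n),
                   ip = unlist(parts),
                   stringsAsFactors = FALSE)
```

Because the repeat counts come from the split itself, this version would also cope with strings that break into a different number of parts.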
