Combine two data frames and remove duplicate columns
I want to cbind two data frames and remove duplicated columns. For example:
df1 <- data.frame(var1=c('a','b','c'), var2=c(1,2,3))
df2 <- data.frame(var1=c('a','b','c'), var3=c(2,4,6))
cbind(df1,df2) #this creates a data frame in which column var1 is duplicated
I want to create a data frame with columns var1, var2 and var3, in which column var1 is not repeated.
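merge will do that job. By default it joins the two data frames on the columns they have in common (here var1), so that column appears only once in the result. Note that merge matches rows by the value of var1 rather than by position, and sorts the result by that key.
Try:
merge(df1, df2)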
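As a minimal sketch of the two approaches, the block below first calls merge with the key named explicitly, then shows a cbind variant that drops any column of df2 whose name already exists in df1. The names merged, new_cols and combined are illustrative only, and the cbind variant assumes the rows of both data frames are already in the same order.

```r
df1 <- data.frame(var1 = c("a", "b", "c"), var2 = c(1, 2, 3))
df2 <- data.frame(var1 = c("a", "b", "c"), var3 = c(2, 4, 6))

# Option 1: join on the shared key column, named explicitly.
merged <- merge(df1, df2, by = "var1")

# Option 2: keep cbind, but only bind the df2 columns
# whose names do not already appear in df1.
new_cols <- setdiff(names(df2), names(df1))
combined <- cbind(df1, df2[new_cols])

print(merged)
print(combined)
```

Option 1 is the safer choice when the rows may be in a different order or when one data frame has keys the other lacks. Option 2 compares column names only and never looks at the values.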
If you inherit someone else's dataset that already has duplicated column names and want to deal with them, this is a nice way to find them (testframe stands for your data frame):
for (name in unique(names(testframe))) {
  if (length(which(names(testframe)==name)) > 1) {
    ## Deal with duplicates here. In this example
    ## just print name and column #s of duplicates:
    print(name)
    print(which(names(testframe)==name))
  }
}
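The loop above only reports the duplicates. If the goal is simply to drop every repeated column name and keep the first occurrence, a shorter sketch using base R's duplicated could look like this. Here testframe is built from the question's df1 and df2 purely for illustration, and dup and deduped are hypothetical names.

```r
df1 <- data.frame(var1 = c("a", "b", "c"), var2 = c(1, 2, 3))
df2 <- data.frame(var1 = c("a", "b", "c"), var3 = c(2, 4, 6))

# cbind keeps both var1 columns, giving duplicated column names.
testframe <- cbind(df1, df2)

# TRUE for every column name that has already appeared further left.
dup <- duplicated(names(testframe))

# Keep only the first occurrence of each column name.
deduped <- testframe[!dup]

print(names(deduped))
```

This compares column names only, so it assumes that columns sharing a name also hold the same data. If that is not guaranteed, inspect the duplicates with the loop first.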
The function mutate in dplyr can take two data frames as arguments: columns in the second data frame overwrite columns of the same name in the first, and columns that don't exist in the first data frame are added to the result. Rows are combined by position, so the two data frames should have their rows in the same order.
> mutate(df1,df2)
  var1 var2 var3
1    a    1    2
2    b    2    4
3    c    3    6