Find columns with different values in duplicate rows

2022-12-07 22:32

I have a data set that has some duplicate records. For those records, most of the column values are the same, but a few are different.

I need to identify the columns where the values are different, and then subset those columns.

This would be a sample of my dataset:

library(data.table)

dat <- "ID location date status observationID observationRep observationVal latitude longitude setSource
FJX8KL loc1 2018-11-17 open 445 1 17.6 -52.7 -48.2 XF47
FJX8KL loc2 2018-11-17 open 445 2 1.9  -52.7 -48.2 LT12"

dat <- setDT(read.table(textConnection(dat), header=T))

And this is the output I would expect:

   observationRep observationVal setSource
1:              1           17.6      XF47
2:              2            1.9      LT12

One detail is: my original dataset has 189 columns, so I need to check all of them.

How can I achieve this?

Two issues. First, use the text= argument of read.table rather than textConnection. Second, use as.data.table rather than setDT: setDT modifies an object in place, but here that object does not exist yet.

dat1 <- data.table::as.data.table(read.table(text=dat, header=TRUE))
dat1[, c('observationRep', 'observationVal', 'setSource')]
#    observationRep observationVal setSource
# 1:              1           17.6      XF47
# 2:              2            1.9      LT12
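The selection above names the three columns by hand, which does not scale to 189 columns. A minimal sketch of one way to find them programmatically, assuming that "duplicate" means rows sharing the same ID (that grouping key is an assumption, as are the names txt, counts, value_cols, varies and varying): count the distinct values of every column within each ID and keep the columns that have more than one.

```r
library(data.table)

# Sample data from the question, read with text= as suggested above
txt <- "ID location date status observationID observationRep observationVal latitude longitude setSource
FJX8KL loc1 2018-11-17 open 445 1 17.6 -52.7 -48.2 XF47
FJX8KL loc2 2018-11-17 open 445 2 1.9  -52.7 -48.2 LT12"
dat1 <- as.data.table(read.table(text = txt, header = TRUE))

# Number of distinct values per column within each ID
# (assumption: rows with the same ID are the duplicates to compare)
counts <- dat1[, lapply(.SD, uniqueN), by = ID]

# Keep the columns that take more than one value in at least one ID group
value_cols <- setdiff(names(counts), "ID")
varies <- vapply(value_cols, function(col) any(counts[[col]] > 1L), logical(1))
varying <- value_cols[varies]

# Subset the differing columns
dat1[, ..varying]
```

On the sample data this also returns location, because loc1 and loc2 differ, even though the expected output in the question leaves that column out. If it should be excluded, drop it explicitly, for example with setdiff(varying, "location").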
