Calculate scores across columns
First the sample data:
bbbv <- numeric(50)
bbbv[1:25] <- 1
bbbv[26:50] <- 2
bbbw <- 1:25
bbbx <- sample(1:5, 50, replace=TRUE)
bbby <- sample(1:5, 50, replace=TRUE)
bbb <- data.frame(pnum=bbbv, trialnum=bbbw, guess=bbbx, target=bbby)
If the target is the same number as the guess then we score 1, else 0.
bbb$hit <- ifelse(bbb$guess==bbb$target, 1, 0)
This is the problem. I want to calculate four more columns:
bbb$hitpone: trial(n) guess == trial(n+1) target
bbb$hitptwo: trial(n) guess == trial(n+2) target
bbb$hitmone: trial(n) guess == trial(n-1) target
bbb$hitmtwo: trial(n) guess == trial(n-2) target
To be clear: for hitmone we take the guess on the current trial and compare it to the target of the trial before (-1 from the current trial). For hitmtwo we compare the guess to the target two trials back (-2 from the current trial). hitpone and hitptwo work the same way in the positive direction (+1 and +2 from the current trial).
As before, if the (shifted) target is the same number as the guess then we score 1, else 0.
There is one minor difficulty with this task. Each pnum has 25 trials. For hitpone we cannot calculate a +1 for trial 25. For hitptwo we cannot calculate a +2 for trials 24 and 25. The same follows for hitmone and hitmtwo: we cannot calculate -1 for trial 1, nor -2 for trials 1 and 2.
This is how I want the table to look. I have mocked it up by hand, showing trials 1-3 and trials 23-25.
dput(bbb)
structure(list(pnum = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), trialnum = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L,
18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L), guess = c(5L, 1L, 1L,
3L, 1L, 3L, 1L, 5L, 2L, 3L, 1L, 1L, 5L, 3L, 5L, 1L, 2L, 2L, 3L,
1L, 4L, 1L, 4L, 4L, 3L, 4L, 5L, 2L, 4L, 5L, 5L, 5L, 4L, 5L, 2L,
3L, 1L, 1L, 5L, 1L, 1L, 3L, 1L, 2L, 4L, 1L, 2L, 3L, 1L, 1L),
target = c(4L, 3L, 4L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 3L,
1L, 2L, 5L, 1L, 3L, 2L, 1L, 4L, 4L, 1L, 1L, 3L, 4L, 4L, 2L,
3L, 2L, 1L, 1L, 5L, 4L, 3L, 5L, 1L, 1L, 1L, 2L, 5L, 2L, 4L,
3L, 1L, 1L, 2L, 5L, 3L, 3L, 3L), hit = c(0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0,
1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0)), .Names = c("pnum", "trialnum", "guess",
"target", "hit"), row.names = c(NA, -50L), class = "data.frame")
Here are the basics. The increment can be negative as well as positive, and you can use by() to wrap a call to hitp() to avoid subsetting.
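hitp <- function(dtf,inc) {
  # Line each guess up with the target inc trials away (NA past the ends)
  target.shift <- shift(dtf$target,inc,wrap=FALSE,pad=TRUE)
  # TRUE/FALSE, or NA where no such trial exists
  return(dtf$guess==target.shift)
}
# One participant at a time, so shifts never cross participants
bbb1 <- subset(bbb,pnum==1)
bbb1$hitpone <- hitp(bbb1,1)
bbb1$hitptwo <- hitp(bbb1,2)
bbb1$hitmone <- hitp(bbb1,-1)
bbb1$hitmtwo <- hitp(bbb1,-2)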
A call to by() would look something like this:
unlist(by(bbb,bbb$pnum,hitp,inc=1))
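To attach all four columns to the full data frame and get the 1/0 scores asked for, something along these lines might work. It is only a sketch: it swaps by() for base R's ave(), which keeps results in row order, and shift_pad(), demo and offsets are hypothetical names made up for this example (shift_pad() is a compact stand-in for shift() with wrap=FALSE, pad=TRUE). The data is made up and tiny so the result can be checked by eye.

```r
# Hypothetical stand-in for shift(vec, n, wrap = FALSE, pad = TRUE)
shift_pad <- function(vec, n) {
  len <- length(vec)
  idx <- seq_len(len) + n          # position of the trial n steps away
  idx[idx < 1 | idx > len] <- NA   # trials that do not exist become NA
  vec[idx]
}

# Made-up data: 2 participants, 4 trials each, already sorted
demo <- data.frame(
  pnum     = rep(1:2, each = 4),
  trialnum = rep(1:4, times = 2),
  guess    = c(1, 2, 3, 4, 2, 2, 1, 3),
  target   = c(2, 1, 4, 3, 2, 1, 3, 2)
)

# Column name -> trial offset
offsets <- c(hitpone = 1, hitptwo = 2, hitmone = -1, hitmtwo = -2)

for (col in names(offsets)) {
  # Shift the targets within each participant, never across participants
  shifted <- ave(demo$target, demo$pnum,
                 FUN = function(x) shift_pad(x, offsets[[col]]))
  # 1 for a hit, 0 for a miss, NA where the shifted trial does not exist
  demo[[col]] <- as.integer(demo$guess == shifted)
}

print(demo)
```

The boundary trials come out as NA rather than 0, which keeps "no hit" distinct from "could not be calculated".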
The shift() function used by hitp() is one I wrote for another purpose:
shift <- function(vec,n=1,wrap=TRUE,pad=FALSE) {
  # Positive n moves later elements toward the front;
  # negative n moves earlier elements toward the back.
  if(length(vec)<abs(n)) {
    stop("Length of vector must be at least the magnitude of n \n")
  }
  if(n==0) {
    return(vec)
  } else if(length(vec)==abs(n)) {
    # return empty
    length(vec) <- 0
    return(vec)
  } else if(n>0) {
    # drop the first n elements
    returnvec <- vec[seq(n+1,length(vec) )]
    if(wrap) {
      returnvec <- c(returnvec,vec[seq(n)])
    } else if(pad) {
      returnvec <- c(returnvec,rep(NA,n))
    }
  } else if(n<0) {
    # drop the last abs(n) elements
    returnvec <- vec[seq(1,length(vec)-abs(n))]
    if(wrap) {
      returnvec <- c( vec[seq(length(vec)-abs(n)+1,length(vec))], returnvec )
    } else if(pad) {
      returnvec <- c( rep(NA,abs(n)), returnvec )
    }
  }
  return(returnvec)
}
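To see which direction each sign moves things, here is a small sketch that reproduces the same results with base R's head() and tail() on a made-up vector. The names x, lead2, lag2 and wrap2 are just for illustration and are not part of the function above.

```r
x <- c(10, 20, 30, 40, 50)

# Like n = 2 with pad: position i now holds the value from position i + 2
lead2 <- c(tail(x, -2), rep(NA, 2))

# Like n = -2 with pad: position i now holds the value from position i - 2
lag2 <- c(rep(NA, 2), head(x, -2))

# Like n = 2 with wrap: the dropped front elements reappear at the end
wrap2 <- c(tail(x, -2), head(x, 2))

print(lead2)
print(lag2)
print(wrap2)
```

So a positive increment lines each guess up with a later target, and a negative increment lines it up with an earlier one.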
This all relies heavily on proper sorting, so make sure the data is sorted by pnum and then trialnum before you run it.
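A minimal sketch of that sorting step with base R's order(), using a made-up, deliberately shuffled data frame (demo is a hypothetical name):

```r
# Made-up rows, deliberately out of order
demo <- data.frame(
  pnum     = c(2, 1, 2, 1),
  trialnum = c(1, 2, 2, 1),
  guess    = c(3, 1, 4, 2),
  target   = c(1, 1, 3, 5)
)

# Sort by participant first, then by trial within participant
demo <- demo[order(demo$pnum, demo$trialnum), ]
rownames(demo) <- NULL   # renumber rows after reordering

print(demo)
```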