
Count number of entries in a row based on external criteria

I have the following data frame:

       Date1            Date2            Date3          Date4             Date5
    1  25 April 2005    10 May 2006      28 March 2007  14 November 2007  1 April 2008
    2  25 April 2005    10 May 2006      28 March 2007  14 November 2007  1 April 2008
    3  29 January 2008  4 December 2008  6 April 2009   1 March 2010      NA
    4  29 January 2008  4 December 2008  6 April 2009   1 March 2010      1 February 2010
    5  29 January 2008  4 December 2008  6 April 2009   1 March 2010      1 February 2010
    6  29 January 2008  4 December 2008  6 April 2009   NA                NA

And the following vector:

   1 01/09/2004 
   2 20/03/2007 
   3 16/09/2009 
   4 16/09/2009 
   5 15/07/2008 
   6 16/09/2009

For each row of the data frame, I would like to count the dates that are on or before the corresponding date in the vector. For instance, for the first row the count should be zero, as all the dates are after the corresponding date in the vector.

Does anyone know how this can be done?

Here is the output from dput(), so you can read the data into R more easily for testing if you want to:

Dataframe:

structure(c(" 25 April 2005 ", " 25 April 2005 ", " 29 January 2008 ", 
" 29 January 2008 ", " 29 January 2008 ", " 29 January 2008 ", 
" 10 May 2006 ", " 10 May 2006 ", " 4 December 2008 ", " 4 December 2008 ", 
" 4 December 2008 ", " 4 December 2008 ", " 28 March 2007 ", 
" 28 March 2007 ", " 6 April 2009 ", " 6 April 2009 ", " 6 April 2009 ", 
" 6 April 2009 ", " 14 November 2007 ", " 14 November 2007 ", 
" 1 March 2010 ", " 1 March 2010 ", " 1 March 2010 ", " 1 March 2010 ", 
" 1 April 2008 ", " 1 April 2008 ", " 1 February 2010 ", " 1 February 2010 ", 
" 1 February 2010 ", " 1 February 2010 "), .Dim = c(6L, 5L), .Dimnames = list(
    c("1", "2", "3", "4", "5", "6"), c("Rep1", "Rep2", "Rep3", 
    "Rep4", "Rep5")))

Vector:

c("01/09/2004", "20/03/2007", "16/09/2009", "16/09/2009", "15/07/2008", 
"16/09/2009")


If the data frame is called m and the vector v, a simple

rowSums(m<=v)

should do. This works because R treats m as one long vector made of its columns placed end to end, so v is recycled down each column. First, though, make sure that all the dates are POSIXct or Date objects; see this question for info about the conversion itself.
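As a minimal sketch of that idea, the dates can be converted to day counts in a plain numeric matrix, which makes the column-wise recycling of the vector explicit. The names m_chr, v_chr, to_date and m_days are hypothetical and not part of the original post, and the %B format assumes an English locale for month names. The na.rm = TRUE argument is an addition for rows that contain NA, as in the table shown in the question.

```r
# Character matrix equivalent to the question's data (filled column by column).
m_chr <- matrix(
  c(rep("25 April 2005", 2),    rep("29 January 2008", 4),
    rep("10 May 2006", 2),      rep("4 December 2008", 4),
    rep("28 March 2007", 2),    rep("6 April 2009", 4),
    rep("14 November 2007", 2), rep("1 March 2010", 4),
    rep("1 April 2008", 2),     rep("1 February 2010", 4)),
  nrow = 6,
  dimnames = list(as.character(1:6), paste0("Rep", 1:5)))
v_chr <- c("01/09/2004", "20/03/2007", "16/09/2009",
           "16/09/2009", "15/07/2008", "16/09/2009")

# Hypothetical helper: strip stray spaces, then parse "25 April 2005".
# %B matches full month names in the current locale (assumed English here).
to_date <- function(x) as.Date(trimws(x), format = "%d %B %Y")

# Days since 1970-01-01, one column per original column (6 x 5 numeric matrix).
m_days <- apply(m_chr, 2, function(x) as.numeric(to_date(x)))
v_days <- as.numeric(as.Date(v_chr, format = "%d/%m/%Y"))

# v_days has one entry per row, so it is recycled down every column.
# na.rm = TRUE makes missing dates count as "not on or before".
counts <- rowSums(m_days <= v_days, na.rm = TRUE)
counts  # 0 2 3 3 1 3
```

Working on day counts sidesteps any question of how comparison operators dispatch on Date columns, at the cost of one extra conversion step.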


First things first: you really have to convert everything to Dates, and that can be a bit tricky. I read in the matrix as Data and the vector as vect. Then:

vect <- as.Date(vect,format="%d/%m/%Y")

# Because of the peculiar nature of the Date class, the usual apply-based
# solutions don't give the result you're looking for, so convert the
# columns one at a time in a loop instead.
Data <- as.data.frame(Data)
for (i in seq_len(ncol(Data))){
    Data[,i] <- as.Date(Data[,i],format="%d %B %Y")
}
> apply(Data,2,"<=",vect)
      Rep1  Rep2  Rep3  Rep4
[1,] FALSE FALSE FALSE FALSE
[2,]  TRUE  TRUE FALSE FALSE
[3,]  TRUE  TRUE  TRUE FALSE
[4,]  TRUE  TRUE  TRUE FALSE
[5,]  TRUE FALSE FALSE FALSE
[6,]  TRUE  TRUE  TRUE FALSE

> rowSums(apply(Data,2,"<=",vect))
[1] 0 2 3 3 1 3