Count number of entries in a row based on external criteria
I have the following data frame:
Date1 Date2 Date3 Date4 Date5
1 25 April 2005 10 May 2006 28 March 2007 14 November 2007 1 April 2008
2 25 April 2005 10 May 2006 28 March 2007 14 November 2007 1 April 2008
3 29 January 2008 4 December 2008 6 April 2009 1 March 2010 NA
4 29 January 2008 4 December 2008 6 April 2009 1 March 2010 1 February 2010
5 29 January 2008 4 December 2008 6 April 2009 1 March 2010 1 February 2010
6 29 January 2008 4 December 2008 6 April 2009 NA NA
And the following vector:
1 01/09/2004
2 20/03/2007
3 16/09/2009
4 16/09/2009
5 15/07/2008
6 16/09/2009
I would like to make a count of the dates in each row of the data frame that are the same or before the dates in the vector. For instance for the first row the count should be zero as all the dates are after the corresponding date in the vector.
Anyone know how this can be done?
Here is output from the dput() command so you guys can read the data into R more easily for testing (if you want to):
Dataframe:
structure(c(" 25 April 2005 ", " 25 April 2005 ", " 29 January 2008 ",
" 29 January 2008 ", " 29 January 2008 ", " 29 January 2008 ",
" 10 May 2006 ", " 10 May 2006 ", " 4 December 2008 ", " 4 December 2008 ",
" 4 December 2008 ", " 4 December 2008 ", " 28 March 2007 ",
" 28 March 2007 ", " 6 April 2009 ", " 6 April 2009 ", " 6 April 2009 ",
" 6 April 2009 ", " 14 November 2007 ", " 14 November 2007 ",
" 1 March 2010 ", " 1 March 2010 ", " 1 March 2010 ", " 1 March 2010 ",
" 1 April 2008 ", " 1 April 2008 ", " 1 February 2010 ", " 1 February 2010 ",
" 1 February 2010 ", " 1 February 2010 "), .Dim = c(6L, 5L), .Dimnames = list(
c("1", "2", "3", "4", "5", "6"), c("Rep1", "Rep2", "Rep3",
"Rep4", "Rep5")))
Vector:
c("01/09/2004", "20/03/2007", "16/09/2009", "16/09/2009", "15/07/2008",
"16/09/2009")
If the data.frame (here actually a matrix) is called m and the vector v, a simple rowSums(m<=v) should do. This works because R stores m as one long vector made of its columns laid end to end, so v is recycled down each column and lines up with the rows. Still, first ensure that all dates are POSIXct or Date objects; see this question for info about the conversion itself.
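As a rough sketch of that idea on the data from the question: the strings are converted to Date first, then compared. Note that as.Date() returns a plain vector (in column order), so the comparison result is reshaped back into a matrix before counting. The names m_dates, v_dates, on_or_before and counts are only illustrative, and the "%B" format assumes a locale with English month names.

```r
# Character matrix and vector exactly as given by dput() in the question
m <- structure(c(" 25 April 2005 ", " 25 April 2005 ", " 29 January 2008 ",
" 29 January 2008 ", " 29 January 2008 ", " 29 January 2008 ",
" 10 May 2006 ", " 10 May 2006 ", " 4 December 2008 ", " 4 December 2008 ",
" 4 December 2008 ", " 4 December 2008 ", " 28 March 2007 ",
" 28 March 2007 ", " 6 April 2009 ", " 6 April 2009 ", " 6 April 2009 ",
" 6 April 2009 ", " 14 November 2007 ", " 14 November 2007 ",
" 1 March 2010 ", " 1 March 2010 ", " 1 March 2010 ", " 1 March 2010 ",
" 1 April 2008 ", " 1 April 2008 ", " 1 February 2010 ", " 1 February 2010 ",
" 1 February 2010 ", " 1 February 2010 "), .Dim = c(6L, 5L), .Dimnames = list(
c("1", "2", "3", "4", "5", "6"), c("Rep1", "Rep2", "Rep3",
"Rep4", "Rep5")))
v <- c("01/09/2004", "20/03/2007", "16/09/2009", "16/09/2009", "15/07/2008",
"16/09/2009")

# Strip the padding spaces, then parse; the result is a Date vector in column order
m_dates <- as.Date(trimws(m), format = "%d %B %Y")
v_dates <- as.Date(v, format = "%d/%m/%Y")

# v_dates (length 6) is recycled over the 30 dates, so it lines up with the rows
on_or_before <- matrix(m_dates <= v_dates, nrow = nrow(m), dimnames = dimnames(m))

# na.rm = TRUE so that missing dates are simply not counted
counts <- rowSums(on_or_before, na.rm = TRUE)
print(counts)
```

The recycling only lines up because v has exactly one entry per row of m; with a different length the comparison would silently pair the wrong dates.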
First things first: you really have to convert everything to Dates, and that can be a bit tricky. I read in the matrix as Data and the vector as vect. Then:
vect <- as.Date(vect,format="%d/%m/%Y")
# Because of the peculiar nature of the Date class, the usual apply() solutions
# don't give the result you're looking for, so convert column by column.
Data <- as.data.frame(Data)
for (i in 1:ncol(Data)){
Data[,i] <- as.Date(Data[,i],format="%d %B %Y")
}
> apply(Data,2,"<=",vect)
Rep1 Rep2 Rep3 Rep4
[1,] FALSE FALSE FALSE FALSE
[2,] TRUE TRUE FALSE FALSE
[3,] TRUE TRUE TRUE FALSE
[4,] TRUE TRUE TRUE FALSE
[5,] TRUE FALSE FALSE FALSE
[6,] TRUE TRUE TRUE FALSE
> rowSums(apply(Data,2,"<=",vect))
[1] 0 2 3 3 1 3
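The table shown in the question also contains NA entries, which the dput() data does not. A possible variant of the steps above that copes with them is sketched here: the columns are converted with lapply() instead of a loop, compared with sapply(), and the NAs are dropped at the counting step. The data frame is typed in from the question's table, and the "%B" format again assumes English month names.

```r
# Data frame typed in from the table in the question, including the NA cells
Data <- data.frame(
  Date1 = c("25 April 2005", "25 April 2005", "29 January 2008",
            "29 January 2008", "29 January 2008", "29 January 2008"),
  Date2 = c("10 May 2006", "10 May 2006", "4 December 2008",
            "4 December 2008", "4 December 2008", "4 December 2008"),
  Date3 = c("28 March 2007", "28 March 2007", "6 April 2009",
            "6 April 2009", "6 April 2009", "6 April 2009"),
  Date4 = c("14 November 2007", "14 November 2007", "1 March 2010",
            "1 March 2010", "1 March 2010", NA),
  Date5 = c("1 April 2008", "1 April 2008", NA,
            "1 February 2010", "1 February 2010", NA),
  stringsAsFactors = FALSE
)
vect <- as.Date(c("01/09/2004", "20/03/2007", "16/09/2009", "16/09/2009",
                  "15/07/2008", "16/09/2009"), format = "%d/%m/%Y")

# Convert every column to Date; Data[] keeps the data frame structure
Data[] <- lapply(Data, as.Date, format = "%d %B %Y")

# Compare each column with vect; sapply() binds the results into a logical matrix
on_or_before <- sapply(Data, function(col) col <= vect)

# Comparisons against NA give NA, so drop them when counting
counts <- rowSums(on_or_before, na.rm = TRUE)
print(counts)
```

Without na.rm = TRUE, any row containing an NA date would get a count of NA rather than the number of dates that do qualify.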