开发者

r -- finding difference between business days

I have several years of data (only for business days (no weekends or holidays)) in an [r] data frame and would like to find the difference between the data on the 2nd and 5th business day of each month. So the solution needs to go thru the list, determine the 2nd and 5th business day, get the data and the full date for the corresponding dates and then find the difference.

the data looks like:

1/19/1990  1.22

1/20/1990  1.25

1/23/1990  1.26   ## (Gap in date is weekend)

...

2/1/1990   1.34

2/2/1990   1.36

2/5/1990   1.22   ## (Gap in date is weekend)
开发者_如何学Python

I have tried using dateTime() but it doesn't handicap for weekends and holidays. Any suggestions would be appreciated, thanks.


The basic Date type works for calendar days, but not for business days. You need extra logic to take care of business days. I am aware of two efforts:

  1. the timeDate package which is part of rMetrics has a number of calendars

  2. my RQuantLib package can do so too by relying in the logic from QuantLib

Here is just two examples from RQuantLib, there are a number of related other functions:

R>        from <- as.Date("2009-04-07")
R>        to <-as.Date("2009-04-14")
R>        getHolidayList("UnitedStates", from, to)
NULL
R>        to <- as.Date("2009-10-7")
R>        getHolidayList("UnitedStates", from, to)
[1] "2009-05-25" "2009-07-03" "2009-09-07"
R>     

and

R>        from <- as.Date("2009-04-07")
R>        to<-as.Date("2009-04-14")
R>        businessDaysBetween("UnitedStates", from, to)
[1] 5
R> 


I assume that by the 2nd and 5th business day you mean that 2nd and 5th day of data that is actually present in the data for each month. If that is the question then its as follows. We read in the data and convert the first column to "Date" class. Then we aggregate the data by month taking the required difference.

Lines <- "1/19/1990 1.22
1/20/1990 1.25
1/23/1990 1.26 
1/24/1990 1.26 
1/25/1990 1.26 
1/26/1990 1.26 
2/1/1990 1.34
2/2/1990 1.36
2/5/1990 1.22 
2/6/1990 1.22 
2/7/1990 1.22 
2/8/1990 1.22"

DF <- read.table(text = Lines, col.names = c("Date", "Value"))
DF$Date <- as.Date(DF$Date, "%m/%d/%Y")
aggregate(DF$Value, list(ym = format(DF$Date, "%Y-%m")), 
   function(x) if (length(x) >= 5) x[5] - x[2] else NA)

Using zoo and chron it can be done entirely via read.zoo:

library(zoo)
library(chron)
read.zoo(text = Lines, FUN = chron, FUN2 = as.yearmon, 
  aggregate =  function(x) if (length(x) >= 5) x[5] - x[2] else NA)

Update Since this was first written the text= argument to read.table and read.zoo was added in R and the answer has been updated to use this.


Here is a little function that lets you input a start date, end date and a vector of dates that correspond to holidays (useful if you are using a non standard holiday calendar) and returns the number of working days between them, counting both the start and end date

workdays = function(iniDate, endDate, holidays) {
  theDates = seq(from=iniDate,to=endDate,by="day")
  isHoliday = theDates %in% holidays
  isWeekend = (as.POSIXlt(theDates)$wday) %in% (c(0,6))
  return (sum(!isHoliday & !isWeekend))
}


You can find the difference between business days using the package bizdays, but you must have a list of holidays (nonworking days) and that's not your case. Anyway I think it can help others.

With bizdays the following code would compute the amount of business days between 2 dates.

library(bizdays)
cal <- Calendar(holidaysANBIMA, weekdays=c('sunday', 'saturday'), dib=252)
from_dates <- c('2013-07-12', '2012-06-13')
to_dates <- seq(as.Date('2014-02-17'), as.Date('2016-07-21'), by='months')
bizdays(from_dates, to_dates, cal = cal)

##  [1]  153  442  194  483  234  526  276  570  321  613  364  655  404  695
## [15]  446  735  486  779  529  822  571  863  614  904  654  946  695  987
## [29]  738 1029

EDIT:

Since version 1.0.0 bizdays comes with a few built in calendars

library(bizdays)
from_dates <- c('2013-07-12', '2012-06-13')
to_dates <- seq(as.Date('2014-02-17'), as.Date('2016-07-21'), by='months')
bizdays(from_dates, to_dates, cal = "Brazil/ANBIMA")

The Calendar function will be deprecated.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜