Step 1: Insert data, Step2: collect money… If it were only that simple…

Often, I’m pulling data in to R from datasets that are updated monthly; however, there are many cases where I am interested in aggregating data by quarters. In these cases I need to make sure that in each aggregation bin, the data represent full quarters (and full months as well–but I tacked that in an earlier post).

Most of the time I use RODBC to bring my data from our warehouse into R, and I suppose you could implement the technique below as part of the data import step, but in this particular case, I implemented the code below after the data had been imported into r.

## simulate some data
library(zoo)
n <- 100
date <- seq(as.Date('2011-01-01'),as.Date('2012-04-30'),by = 1)
values <- rnorm(length(date), n, 25)
df1 <- data.frame(date, values)
df1$yrqtr <- as.yearqtr(df1$date)
df1$yrmon <- as.yearmon(df1$date)
tapply(df1$values, df1$yrqtr, sum) # beware the last quarter is incomplete
range(df1$yrqtr) #you can't tell by looking here
range(df1$yrmon) #but you can tell by looking here
# this code fixes your problem
test1 <- as.yearmon(as.Date(max(df1$yrqtr), frac=1)) ## returns the last month of the last QTR that we have data for
result1 <- as.yearmon(as.Date(max(df1$yrqtr), frac=0)) ## returns the FIRST month ofthe last QTR what we have data for
#if max yearmon in data is not equal to what should be the last yearmon for the last qtr in data
# cut data to last full quarter
if(max(df1$yrmon) != test1)
df1 <- df1[ df1$yrmon < result1 , ]
tapply(df1$values, df1$yrqtr, sum) # this is more like it!
range(df1$yrqtr)
range(df1$yrmon)

### Like this:

Like Loading...

*Related*