Curating data to eliminate partial quarters, months (of even years if that’s your game)…

Step 1: Insert data, Step2: collect money... If it were only that simple...

Step 1: Insert data, Step2: collect money… If it were only that simple…

Often, I’m pulling data in to R from datasets that are updated monthly; however, there are many cases where I am interested in aggregating data by quarters. In these cases I need to make sure that in each aggregation bin, the data represent full quarters (and full months as well–but I tacked that in an earlier post).

Most of the time I use RODBC to bring my data from our warehouse into R, and I suppose you could implement the technique below as part of the data import step, but in this particular case, I implemented the code below after the data had been imported into r.

## simulate some data

n <- 100
date <- seq(as.Date('2011-01-01'),as.Date('2012-04-30'),by = 1)
values <- rnorm(length(date), n, 25)
df1 <- data.frame(date, values)
df1$yrqtr <- as.yearqtr(df1$date)
df1$yrmon <- as.yearmon(df1$date)

tapply(df1$values, df1$yrqtr, sum) # beware the last quarter is incomplete
range(df1$yrqtr) #you can't tell by looking here
range(df1$yrmon) #but you can tell by looking here

# this code fixes your problem
test1 <- as.yearmon(as.Date(max(df1$yrqtr), frac=1)) ## returns the last month of the last QTR that we have data for
result1 <- as.yearmon(as.Date(max(df1$yrqtr), frac=0)) ##  returns the FIRST month ofthe last QTR what we have data for
#if max yearmon in data is not equal to what should be the last yearmon for the last qtr in data
# cut  data to last full quarter 
if(max(df1$yrmon) != test1)
  df1 <- df1[ df1$yrmon < result1 , ]

tapply(df1$values, df1$yrqtr, sum) # this is more like it!


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s