With good data analysis must come great coffee…

I think many will agree with me when I say that good data analysis is often fueled by great coffee.  Where I live, in the Pacific Northwest region of the US, we have a bit of a coffee (and beer) culture, and my palate has become spoiled by the many great local coffee shops serving your choice of locally roasted single-varietal beans (Ethiopian Duromina or Indonesian Gajah Aceh, anyone?).  Unfortunately, due to the way most of my days unfold, I usually don’t have the time to make the trip to a local shop to buy an espresso (which happens to be my drink: 3-4 shots of espresso, a tiny dollop of milk and half a packet of sweetener).

The good news is that a few years ago I discovered the moka pot (or stovetop espresso maker), and it is now a part of my everyday routine.  And that glosses over the main point: it makes some really… really good coffee.  So, for those of you who have yet to give it a shot (did you see what I did there…), here are my tips and tricks for making a great pot of coffee with one of these devices.

Bialetti Moka Express

In terms of tools and ingredients all you need are a Moka pot (I purchased a Bialetti from Amazon, but I’m sure you can find one locally if you prefer), some finely ground dark roast coffee beans and some clean water.  I use a 9-cup maker and I find that it provides the 3 “triple shot”-sized servings I require to make it through a day.

When making a coffee with one of these the first thing you need to do is fill the water reservoir.  There should be a raised mark inside of the reservoir cylinder to use as a guide. DO NOT fill over this mark, as you will disable the proper functioning of the pressure release valve.  I tend to under-fill by just a bit.

This is about where I fill to

Next, fill the coffee chamber with ground beans.  Experiment with different roasts/blends until you find what you like.  While I do sometimes splurge on locally micro-roasted single varietals, I also find that Starbucks Verona or Espresso roast does well as a daily drinker.  Here is where my method of brewing differs from what I’ve read from various sources.  Most instructions will tell you not to pack or tamp the grounds down in the chamber. While I do not vigorously pack the grounds in, I do take the back of a tablespoon and gently apply some pressure to the grounds in the chamber to smooth out the top layer.  If you are incapable of applying only a small amount of force, I’d just skip this step.

DO NOT press with much force... a gentle tamp will do.

Lastly, I set the gas burner on our stove to medium-high and wait for the magic to happen (the cook time will vary with your burner's BTUs, the size and type of moka pot, etc.).  This is where there will have to be some trial and error on your part.  The cooking procedure is also where I’ve read differing advice.  Some will say to set your burner on high, for example.  What I find is that if I set the burner on medium-high, I have more control over the very end of the cook, which is an important part of the process for me.  At medium-high, I can come back to the pot when the cook is almost finished and turn the burner off for the very last stages of the brew.  Turning off the heat stalls pressurization within the water chamber and allows the water vapor to push its way through the grounds with slightly less force, forming a pseudo-crema at the end of the cook.  This is in contrast to the forceful popping and gurgling you will experience if you keep the burner on high until the end.

My pseudo-crema...

Next, pour and enjoy!  Now, go out and make some great data analysis!

Slope graphs and exporting multi-paneled plots (with data) in R

I’ve been trying to carve out the time to write a post about the googleVis package and, more specifically, how AWESOME motion charts are within googleVis, but in the meantime, I’ve been spending a lot of time thinking about data visualizations. I think that putting together an entry for Naomi Robbins’s Forbes Graph Challenge had this kind of impact on my consciousness. Anyhow, a few early responders to Naomi’s post were discussing slope graphs (http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0003nk), and it inspired me to seek out ways to incorporate these into some of my data presentations. I found a most excellent post about producing them in R here (http://www.jameskeirstead.ca/r/slopegraphs-in-r/).  I will look for ways to use them wherever they make sense in future reporting.

Also, my issue of the week involved putting together a report that analyzed 300+ therapeutic groupings, where for each grouping I had to work out a way to get both an informative plot AND multiple pieces of data (2 data frames) exported to a single image. I resolved my issue with some help from the textplot() function in the gplots package.

testMat <- matrix(1:20, ncol = 5)##  create data
testMatDF <- as.data.frame(testMat)
names(testMatDF) <- c("Hey there", "Column 2",
         "Some * Symbols", "And ^ More",
         "Final Column")
rownames(testMatDF) <- paste("Group", 1:4)

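library(gplots) ##  gplots needed for textplot()
layout(matrix(c(1, 1, 2, 3, 3, 3),  2, 3, byrow = TRUE))
curve(dnorm, -3, 4)  ##  panel 1: the plot
textplot(testMat)    ##  panel 2: first piece of data
textplot(testMatDF)  ##  panel 3: second piece of data
##  produces what I want within R

png("plot1.png") ##  open a file device first (the file name is only a placeholder)
layout(matrix(c(1, 1, 2, 3, 3, 3),  2, 3, byrow = TRUE))
curve(dnorm, -3, 4)
textplot(testMat)
textplot(testMatDF)
dev.off() ##  how to export the same: closing the device writes the file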
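
To scale the same idea up to hundreds of groupings, the export can be wrapped in a function and called once per group. The following is only a sketch under some assumptions: export_group() and draw_table_text() are hypothetical names, the files go to tempdir(), and draw_table_text() is a base-R stand-in for gplots' textplot() so the example runs without extra packages (in a real report, textplot() could be dropped in at those two calls).

```r
## Hypothetical helper: print a data frame as monospaced text in the current panel.
## This stands in for gplots::textplot() so the sketch needs base R only.
draw_table_text <- function(df) {
  old_par <- par(mar = c(0, 0, 0, 0))
  on.exit(par(old_par))
  plot.new()
  txt <- paste(capture.output(print(df)), collapse = "\n")
  text(0.5, 0.5, txt, family = "mono", cex = 0.9)
}

## Hypothetical helper: write one plot plus two tables to a single PNG for one group.
export_group <- function(group_name, summary_df, detail_df, out_dir = tempdir()) {
  safe_name <- gsub("[^A-Za-z0-9]+", "_", group_name)  # file-system friendly name
  out_file <- file.path(out_dir, paste0(safe_name, ".png"))
  png(out_file, width = 900, height = 600)
  on.exit(dev.off())  # always close the device, even if plotting fails
  layout(matrix(c(1, 1, 2, 3, 3, 3), 2, 3, byrow = TRUE))
  curve(dnorm, -3, 4, main = group_name)  # panel 1: the plot
  draw_table_text(summary_df)             # panel 2: first table
  draw_table_text(detail_df)              # panel 3: second table
  invisible(out_file)
}

## Toy data standing in for the real groupings
testMatDF <- as.data.frame(matrix(1:20, ncol = 5))
groups <- paste("Group", 1:3)
files <- vapply(groups, export_group, character(1),
                summary_df = testMatDF[1:2, ], detail_df = testMatDF)
```

Each call opens its own device, rebuilds the layout and closes the device again, so one failed group does not leave a half-written image open for the next one.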

Here’s a New Year’s tip: this year, resolve to get your resolution right when exporting plots in R

Winston Chang’s Cookbook for R (http://wiki.stdout.org/rcookbook/) has become one of my favorite R reference sites as of late. In large part, this is because he does a most excellent job of providing easily digestible ggplot examples, and I’ve been trying to move away from base graphics to ggplot for all of my plotting needs.

While there are many very useful nuggets to be found in perusing his site, one tip I use every time I export a plot in R is this hint on adjusting the resolution of plots regardless of the device I’m printing to.

His code is found on this page (http://wiki.stdout.org/rcookbook/Graphs/Output%20to%20a%20file/), but the gist of it goes like this:

# First try it without adjusting the ppi
plot(mtcars$wt, mtcars$mpg, main="Scatterplot w/o adjustment", xlab="Car Weight ", ylab="Miles Per Gallon ", pch=19)
abline(lm(mpg~wt, data=mtcars), col="red") # regression line (y~x)
lines(lowess(mtcars$wt, mtcars$mpg), col="blue") # lowess line (x,y)
# Now try it with adjustment -- as you can see, you have to feed the device function the width and height in pixels (inches * ppi)
ppi <- 300
png("plot2.png", width=6*ppi, height=6*ppi, res=ppi)
plot(mtcars$wt, mtcars$mpg, main="Scatterplot w/adjustment", xlab="Car Weight ", ylab="Miles Per Gallon ", pch=19)
abline(lm(mpg~wt, data=mtcars), col="red") # regression line (y~x)
lines(lowess(mtcars$wt, mtcars$mpg), col="blue") # lowess line (x,y)
dev.off() # close the device so plot2.png is written to disk
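
The arithmetic behind the tip is simply pixels = inches * ppi. As a small check of that, the sketch below writes the same plot twice, once with the pixel size computed by hand and once using png()'s own units = "in" argument, and then reads the pixel dimensions back from each file's PNG header. The helper names (png_dims, draw_scatter) and the file names are made up for this illustration, and the files go to tempdir().

```r
## Hypothetical helper: read width and height (in pixels) from a PNG file's IHDR header
png_dims <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  invisible(readBin(con, "raw", n = 16))  # skip 8-byte signature + IHDR length and type
  readBin(con, "integer", n = 2, size = 4, endian = "big")  # width, height
}

## Hypothetical helper: the scatterplot with its regression line
draw_scatter <- function() {
  plot(mtcars$wt, mtcars$mpg, xlab = "Car Weight", ylab = "Miles Per Gallon", pch = 19)
  abline(lm(mpg ~ wt, data = mtcars), col = "red")
}

ppi <- 300
size_in <- 6  # desired size in inches
f_px <- file.path(tempdir(), "scatter_px.png")
f_in <- file.path(tempdir(), "scatter_in.png")

## Option 1: compute the pixel size yourself (6 in * 300 ppi = 1800 px per side)
png(f_px, width = size_in * ppi, height = size_in * ppi, res = ppi)
draw_scatter()
dev.off()

## Option 2: give png() inches directly and let it do the multiplication
png(f_in, width = size_in, height = size_in, units = "in", res = ppi)
draw_scatter()
dev.off()

dims_px <- png_dims(f_px)
dims_in <- png_dims(f_in)
```

Both files should report the same pixel dimensions, so which form to use is mostly a matter of taste; the explicit multiplication has the advantage of working the same way for other bitmap devices that take a res argument.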

I’ve been having quite a bit of fun with bubble plots and the googleVis package recently. The googleVis package, for one, is absolutely MIND-BLOWING! I’m hoping to carve out the time to do a post about that sometime in the very near future!

Forbes Graph Makeover Challenge ggplot2 style

I recently came across the Forbes Graph Makeover Challenge (http://www.forbes.com/sites/naomirobbins/2012/11/27/graph-makeover-contest/) and decided to give it a shot.  My goal was to create an entry using nothing but R code (that is, no image post-processing at all).  To be totally honest, given my lack of any experience with image post-processing software, that is the only way I could ever go about something like this.  The great thing about this challenge is that it really pushed me to get out of my ggplot2 shell and explore facets of ggplot2 that I never had a handle on.

I haven’t yet had any time to annotate and clean up the code; however, to make things easier to digest, if you are so inclined to work your way through it, I cut the code into layers that build successively on p so you can see what happens as you go along.  Anyhow, here is the code (quick note: something happens when WordPress converts my R code to HTML, and I need to go back in and escape all of the "<-" in the code; for now just realize that until I fix them, the "<" is getting converted to &lt; in "HTML-ese"):

forbes1 <- structure(list(TrafficSource = structure(c(3L, 1L, 2L, 3L, 1L, 2L), .Label = c("Twitter", "LinkedIn", "Facebook"), class = "factor"), Type = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("B2B", "B2C"), class = "factor"), Value = c(72L, 12L,16L, -84L, -15L, -1L)), .Names = c("TrafficSource", "Type", "Value"), row.names = c(NA, -6L), class = "data.frame")

library(ggplot2) ##  needed for ggplot() and the layers below

cbPalette <- c("dodgerblue4", "gray35")

p <- ggplot(forbes1, aes(x= TrafficSource, y=Value, fill= Type, width = 0.7)) + geom_bar(color = "black", stat="identity", data=subset(forbes1, Type == "B2B")) + geom_bar(color = "black", stat="identity", data=subset(forbes1, Type == "B2C")) + scale_fill_manual(values = cbPalette)+ guides(fill=FALSE)

p <- p + ggtitle("Social Traffic Sources for\n B2B and B2C Companies") + theme(plot.title = element_text(lineheight=.8, face="bold", size=18), axis.text.x = element_text(vjust=0.2, size=18), axis.text.y = element_text(vjust=0.1, size=13), axis.title.y = element_text(face="bold", size=18))+ scale_x_discrete(name="")

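##  abs_format() returns a labelling function so the mirrored axis shows absolute values
##  (the function body is missing from the listing above; this is the minimal body that fits the call below)
abs_format <- function(){
  function(x) abs(x)
}
p <- p + scale_y_continuous(breaks=seq(-100, 100, 25), labels = abs_format(), limits=c(-100, 100), name= "% of Referral Traffic") + geom_hline(aes(yintercept= 0))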

p <- p + geom_text(aes(TrafficSource,Value,Type,label = texthere), data.frame(TrafficSource="Twitter", Type="B2B" ,Value=95, texthere="B2B"), size = 12, fontface=2, color="dodgerblue4") + geom_text(aes(TrafficSource,Value,Type,label = texthere), data.frame(TrafficSource="Twitter", Type="B2B" ,Value=-93, texthere="B2C"), size = 12, fontface=2, color= "gray35")

p<- p + geom_text(aes(TrafficSource,Value,Type,label = texthere), data.frame(TrafficSource="Twitter", Type="B2B" ,Value=30, texthere="12%") , size = 8, fontface=2) + geom_text(aes(TrafficSource,Value,Type,label = texthere), data.frame(TrafficSource="Twitter", Type="B2B" ,Value=-30, texthere="15%"), size = 8, fontface=2)

p <- p + geom_text(aes(TrafficSource,Value,Type,label = texthere), data.frame(TrafficSource="LinkedIn", Type="B2B" ,Value=30, texthere="16%"), size = 8, fontface=2) + geom_text(aes(TrafficSource,Value,Type,label = texthere), data.frame(TrafficSource="LinkedIn", Type="B2B" ,Value=-30, texthere="1%"), size = 8, fontface=2)

p <- p + geom_text(aes(TrafficSource,Value,Type,label = texthere), data.frame(TrafficSource="Facebook", Type="B2B" ,Value=30, texthere="72%"), size = 8, fontface=2) + geom_text(aes(TrafficSource,Value,Type,label = texthere), data.frame(TrafficSource="Facebook", Type="B2B" ,Value=-30, texthere="84%"), size = 8, fontface=2)
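
The six percentage labels above are added one geom_text() layer at a time, which is handy for seeing each step. As an alternative sketch (not the code used for the contest entry), the labels can also be derived from the data and added in a single layer. The names bar_labels and p_sketch are made up for this illustration, the data frame is retyped from the forbes1 structure above, and the ggplot2 part only runs if the package is installed.

```r
## Same data as forbes1 above, retyped as a plain data.frame() call
forbes1 <- data.frame(
  TrafficSource = factor(rep(c("Facebook", "Twitter", "LinkedIn"), 2),
                         levels = c("Twitter", "LinkedIn", "Facebook")),
  Type = factor(rep(c("B2B", "B2C"), each = 3)),
  Value = c(72, 12, 16, -84, -15, -1)
)

## One row per label: the text is the absolute percentage, and the label sits
## at +30 (B2B side) or -30 (B2C side), as in the hand-written layers
bar_labels <- forbes1
bar_labels$texthere <- paste0(abs(forbes1$Value), "%")
bar_labels$Value <- ifelse(forbes1$Value > 0, 30, -30)

## Build the bars and all six labels in one go (skipped if ggplot2 is missing)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p_sketch <- ggplot(forbes1, aes(x = TrafficSource, y = Value)) +
    geom_bar(aes(fill = Type), colour = "black", stat = "identity", width = 0.7) +
    geom_text(aes(label = texthere), data = bar_labels, size = 8, fontface = 2)
}
```

The trade-off is that the layer-by-layer version makes every label position explicit, while the data-driven version keeps the labels in sync if the values ever change.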


Here is the example image:

ggplot2 output for Forbes Graph Makeover Challenge