July | 2012 | failuretoconverge

I’ve been working hard to take my data visualization skills to the next level. Two chart types I’ve been experimenting with are heatmaps and bubble plots.

With heatmaps, I’ve been visualizing things like monthly utilization, where it is important for the consumer to absorb both trends and relative values. I am still in search of good step-by-step tutorials and texts to help me along that path, but I’ve found Chapter 8 of Hrishi V. Mittal’s R Graph Cookbook a decent source in this regard (http://www.amazon.com/R-Graph-Cookbook-Hrishi-Mittal/dp/1849513066). In addition, the FlowingData blog has a most excellent step-by-step; the conciseness and clarity of that blog nudged me to reward the author’s work with a purchase of his text at Amazon (http://www.amazon.com/Visualize-This-FlowingData-Visualization-ebook/dp/B005CCT19M/ref=kinw_dp_ke).
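
For a sense of what this looks like in practice, here is a minimal sketch using base R’s heatmap(). The products and utilization numbers are made up purely for illustration; they are not from my real data, and this is not the FlowingData tutorial’s code.

```r
# Hypothetical data: monthly utilization (units) for five made-up products.
set.seed(42)
products <- paste("Product", LETTERS[1:5])
utilization <- matrix(
  round(runif(5 * 12, min = 50, max = 500)),
  nrow = 5,
  dimnames = list(products, month.abb)
)

# Rowv = NA and Colv = NA turn off clustering, so months stay in calendar order.
# scale = "row" standardizes each product separately, so the colors show
# each product's own trend over the year rather than its absolute volume.
heatmap(
  utilization,
  Rowv = NA, Colv = NA,
  scale = "row",
  col = heat.colors(256),
  margins = c(5, 8),
  main = "Monthly utilization by product"
)
```

Switching to scale = "none" would instead compare absolute values across products, which is the better choice when relative size matters more than trend.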

With bubble charts, it’s all about going beyond a table of numbers and truly visualizing the relationships among three separate quantities. For my needs that would be something like total sales last year and this year (in dollars) plus total utilization this year (in units), or, over the last year, total sales ($), how many unique members were utilizing, and utilization in total claims. For bubble charts, I’ve found an excellent step-by-step, also from the FlowingData blog (http://flowingdata.com/2010/11/23/how-to-make-bubble-charts/), which I intend to give a whack today!
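
As a rough sketch of the three-quantity idea, here is one way to do it with base R’s symbols(). The sales and utilization figures are invented for illustration, and the column names are my own placeholders.

```r
# Hypothetical figures for five made-up products.
sales <- data.frame(
  product = paste("Product", LETTERS[1:5]),
  sales_last_year = c(120, 340, 210, 90, 400),   # thousands of dollars
  sales_this_year = c(150, 310, 260, 130, 420),  # thousands of dollars
  units_this_year = c(800, 2500, 1200, 400, 3100)
)

# Size bubbles by area rather than radius: area = pi * r^2,
# so the radius for a given value is sqrt(value / pi).
radius <- sqrt(sales$units_this_year / pi)

symbols(
  sales$sales_last_year, sales$sales_this_year,
  circles = radius,
  inches = 0.35,  # rescales the circles so the largest is about this size
  fg = "white", bg = "steelblue",
  xlab = "Sales last year ($ thousands)",
  ylab = "Sales this year ($ thousands)",
  main = "Sales vs. utilization"
)

# Label each bubble with its product name.
text(sales$sales_last_year, sales$sales_this_year, sales$product, cex = 0.7)
```

Scaling by area matters because readers judge a bubble by how much space it covers; mapping the value straight to the radius would exaggerate the large values.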

…on to take my plotting to z-rd base!

I’ve finished Matloff’s excellent R book, and I’m now wading through some books on using R for Bayesian inference and trying to get a handle on machine learning with R in a catch-as-catch-can kind of way (with some help from Conway’s O’Reilly text: http://www.amazon.com/Machine-Learning-Hackers-Drew-Conway/dp/1449303714).

One of the machine learning methods that has really captured my interest as of late, and which unfortunately is not explicitly addressed in Conway’s book, is MARS (which I’ve come to learn is now a term “owned” by Salford Systems, but stands for multivariate adaptive regression splines: http://en.wikipedia.org/wiki/Multivariate_adaptive_regression_splines). As someone who has used quite a bit of MLR (multiple linear regression), I find MARS a really cool extension in that it places breakpoints (knots) in linear functions, letting the slope change from one region to the next to improve model fit.
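
To make the knot idea concrete, here is a small base R sketch on simulated data. It is not MARS itself: I fix a single knot by hand, whereas MARS searches for the knots. The hinge() helper and the simulated numbers are assumptions of this toy example.

```r
# A hinge function is max(0, x - knot): zero to the left of the knot,
# linear to the right of it.
hinge <- function(x, knot) pmax(0, x - knot)

# Simulated data whose slope changes at x = 5.
set.seed(1)
x <- seq(0, 10, length.out = 200)
y <- 2 + 0.5 * x + 3 * hinge(x, 5) + rnorm(length(x), sd = 0.5)

# Ordinary linear regression: one slope for the whole range of x.
fit_mlr <- lm(y ~ x)

# The same regression with one hinge term added at a hand-picked knot.
fit_hinge <- lm(y ~ x + hinge(x, 5))

# Residual sums of squares for the two fits.
rss_mlr <- sum(resid(fit_mlr)^2)
rss_hinge <- sum(resid(fit_hinge)^2)
c(mlr = rss_mlr, hinge = rss_hinge)
```

The coefficient on the hinge term is the change in slope past the knot, which is exactly the kind of quantity that could hint at a subgroup responding differently.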

In medicine, where heterogeneity of patient populations can manifest as different treatment effects (based on underlying demographic, behavioral, or genetic characteristics), this method of modeling data seems really appealing. I can see how knot placement, in and of itself, may lend some interesting insights into how subpopulations gain differential relative benefit from medications or other kinds of interventions, which may lead to a better understanding of which interventions bring the highest value for target subpopulations!

That said, I’m looking forward to applying MARS to a future project.

As far as I can tell, the R package that implements MARS is called earth, and I have begun to browse through the manuals available on CRAN (http://cran.r-project.org/web/packages/earth/index.html). Fun stuff indeed….
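
A first try with earth might look something like the sketch below. It assumes the earth package is installed from CRAN and uses the trees data set that ships with R, not any of my own data.

```r
# Assumes the earth package is installed: install.packages("earth")
if (requireNamespace("earth", quietly = TRUE)) {
  # trees ships with R: Girth, Height, and Volume of 31 black cherry trees.
  fit <- earth::earth(Volume ~ Girth + Height, data = trees)

  # The summary lists the selected hinge terms, including where the knots fall.
  print(summary(fit))

  # Predictions work much like those of other fitted models in R.
  print(head(predict(fit, newdata = trees)))
}
```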

father, husband, son, brother, data hacker, seeking optimization

Time to take my charting to the z-rd dimension–bubble plots and heatmaps

Ground control to Major Tom… to MARS and earth and back…