While analyzing data sets, I sometimes stop and wonder how a user-defined function might automate a task that I see myself carrying out frequently. An example came up recently that involved something that I used to do with every new data set back when I used SPSS (I think it was the “Explore” function in SPSS…): visualize all the categorical data using barplots and the continuous data using histograms.
To that end, I started to wonder if I could use a for loop to iterate over the columns of a data frame and carry this out. So I gave writing the loop a shot and failed miserably; however, I think I had enough of a well-formed attempt to post on Stack Overflow (http://stackoverflow.com/questions/10643841/simple-barplot-of-variables-in-df-using-a-loop).
My attempt looked like this:
y <- LETTERS[as.integer(rnorm(100, mean=5, sd=1))]
z <- LETTERS[as.integer(rnorm(100, mean=10, sd=1))]
x <- round(rnorm(100, mean=5, sd=2.5),2)
data <- as.data.frame(cbind(x,y,z))
A <- do.call("cbind", lapply(data, class))
B <- as.vector(A[1,])
C <- grep("character|factor", B)
for (i in 1:length(C)) {
  x <- C[i]
  counti <- table(data[,x])
  y <- barplot(counti, main=paste("Barplot for var", x))
  return(y)
}
Here’s hoping that those much smarter than me can come to my rescue…
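For what it's worth, two things in the attempt above look like the likely culprits: return() only works inside a function, so calling it from a bare for loop stops with an error, and cbind(x, y, z) coerces every column to text before the data frame is built, so x no longer counts as continuous. Below is a minimal sketch of one way the idea could work, not a definitive answer. The helper name explore_plots, the data frame name dat, the fixed seed, and the sample()-based stand-in data are all my own assumptions for illustration.

```r
set.seed(42)  # assumption: seed added only to make the sketch reproducible

# Stand-in data (assumed for illustration). data.frame() keeps x numeric,
# whereas cbind(x, y, z) would coerce every column to character first.
dat <- data.frame(
  x = round(rnorm(100, mean = 5, sd = 2.5), 2),
  y = sample(LETTERS[3:7], 100, replace = TRUE),
  z = sample(LETTERS[8:12], 100, replace = TRUE)
)

# explore_plots() is a hypothetical helper name, not an existing R function.
explore_plots <- function(df) {
  # Classify each column once, by type.
  is_cat <- vapply(df, function(col) is.character(col) || is.factor(col),
                   logical(1))
  is_num <- vapply(df, is.numeric, logical(1))

  # Looping over column names (rather than 1:length(...)) means the loop
  # simply does nothing when no column matches.
  for (nm in names(df)[is_cat]) {
    barplot(table(df[[nm]]), main = paste("Barplot for", nm))
  }
  for (nm in names(df)[is_num]) {
    hist(df[[nm]], main = paste("Histogram for", nm), xlab = nm)
  }

  # Hand back the classification invisibly so it can be inspected if wanted.
  invisible(list(categorical = names(df)[is_cat],
                 numeric     = names(df)[is_num]))
}

res <- explore_plots(dat)
```

Wrapping the loops in a function is a design choice rather than a requirement: the loops would also run at the top level once return() is dropped, but a function makes the whole thing reusable on the next data set.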