helpful tips on sampling from beta distributions (getting from a to b to means and modes)….

I’ve been playing around with using R for simulations recently, and I ran into a use case that required I sample from a skewed distribution with min and max values of 0 and 1. A beta distribution was a nice representative form (http://en.wikipedia.org/wiki/Beta_distribution#Mode), and base R has a built in rbeta function; however, I needed to be able to feed into rbeta the proper alpha and beta parameters to produce the required distribution. So, I created two simple R functions that allowed me to start with an alpha value (take your pick) and either a target mean or mode, and the function produced the appropriate beta that I could then use to produce a beta distribution and my random sample.

# first by mean
a1 <- 20
mu1 <- .95
solmean <- function(alpha, mean) (alpha-(mean*alpha))/mean
beta <- solmean(a1, mu1)
x1 <- (rbeta(1000, shape1=a1, shape2=beta)) # sample of n=1000 from the beta dist
head(x1)
x1 <- round(x1,3)
hist(x1, breaks=100)# let's give it a look
mean(x1)
x2 <- table(x1)
which.max(x2) # let's see what the mode is
quantile(x1) # more descriptives

#  Now by mode
a2 <- 20
mo1 <- .95
solmode <- function(alpha, mode) (alpha-1-(mode*alpha)+(2*mode))/mode
beta <- solmode(a2, mo1)
x1 <- (rbeta(1000, shape1=a2, shape2=beta)) # sample of n=1000 from the beta dist
head(x1)
x1 <- round(x1,3)
hist(x1, breaks=100)# let's give it a look
mean(x1)
x2 <- table(x1)
which.max(x2) # let's see what the mode is
quantile(x1) # more descriptives


## Here's the code I used to make the plots that are above:
set.seed(123)
#  Now by mode
a2 <- 20
mo1 <- .95
solmode <- function(alpha, mode) (alpha-1-(mode*alpha)+(2*mode))/mode
beta <- solmode(a2, mo1)
x1 <- (rbeta(1000, shape1=a2, shape2=beta)) # sample of n=1000 from the beta dist
head(x1)
x1 <- round(x1,3)
xlimits <- pretty(range(x1))[c(1,length(pretty(range(x1))))]
hist(x1, breaks=100, xlim=xlimits, main="")# let's give it a look
test <- (x1, breaks=100, xlim=xlimits, main="")
u <- par("usr")
rect(u[1], u[3], u[2], u[4], col = "gray88", border = "black")
abline(v = pretty((range(x1))), col='white') ## draw h lines
abline(h = pretty(test$counts), col='white') ## draw h lines
par(new=T)
hist(x1, breaks=100, xlim=xlimits, col='darkgrey', main=paste('Beta Distr. with mode=', mo1, 'and alpha=', a2))# let's give it a look

#  Now by mode
a2 <- 40
mo1 <- .95
solmode <- function(alpha, mode) (alpha-1-(mode*alpha)+(2*mode))/mode
beta <- solmode(a2, mo1)
x2 <- (rbeta(1000, shape1=a2, shape2=beta)) # sample of n=1000 from the beta dist
head(x2)
x2 <- round(x2,3)
hist(x2, breaks=50, xlim=xlimits, main="")# let's give it a look
test <- hist(x2, breaks=50, xlim=xlimits, main="")
u <- par("usr")
rect(u[1], u[3], u[2], u[4], col = "gray88", border = "black")
abline(v = pretty((range(x1))), col='white') ## draw h lines
abline(h = pretty(test$counts), col='white') ## draw h lines
par(new=T)
hist(x2, breaks=50, xlim=xlimits, col='darkgrey', main=paste('Beta Distr. with mode=', mo1, 'and alpha=', a2))# let's give it a look

#  Now by mean
a3 <- 40
mu1 <- .95
solmean <- function(alpha, mean) (alpha-(mean*alpha))/mean
beta <- solmean(a3, mu1)
x3 <- (rbeta(1000, shape1=a1, shape2=beta)) # sample of n=1000 from the beta dist
head(x3)
mean(x3)
x3 <- round(x3,3)
hist(x3, breaks=50, xlim=xlimits, main="")# let's give it a look
test <- hist(x3, breaks=50, xlim=xlimits, main="")
u <- par("usr")
rect(u[1], u[3], u[2], u[4], col = "gray88", border = "black")
abline(v = pretty((range(x1))), col='white') ## draw h lines
abline(h = pretty(test$counts), col='white') ## draw h lines
par(new=T)
hist(x3, breaks=50, xlim=xlimits, col='darkgrey', main=paste('Beta Distr. with mean=', mu1, 'and alpha=', a3))# let's give it a look

That’s it. I’ve noticed that changing your prespecified alpha does some interesting things to the kurtosis of the resulting beta distribution, but it’s up to you to play around with things until you get a distribution that suits your needs!

One last thing… if you are really into fat-tailed distributions (let’s say your a black swan fan), and you really don’t require a “curvy” form. One great way to sample from a skewed distribution that is realtively easy to specify (in terms of a mode–that is), I’ve found that the triangle package (http://cran.r-project.org/web/packages/triangle/index.html) a nice way to go…. and the code is super easy to run.

 require(triange)
 tri <- round(rtriangle(10000, 0, 1, .85),3)  # just feed it the sample size, min bound, max bound, and mode... and voila!
 hist(tri)

ENJOY!

2 thoughts on “helpful tips on sampling from beta distributions (getting from a to b to means and modes)….”

John Eintrach on November 13, 2012 at 7:08 am said:

I need some help understanding rbeta. My problem is as following…

Use simulation to approximate P(S <=10), by simulating 100,000 of each of X1;…… ;X20. Note that Xi have a beta distribution. Give your approximation value here.

Reply ↓
Dru Tirpatel on November 13, 2012 at 4:54 pm said:

I need some help with rbeta. Following is my problem…

Problem: Use simulation to approximate P(S <=10), by simulating 100,000 of each of X1; ;X20. Note that Xi have a beta distribution. Give your approximation value here.

Reply ↓

failuretoconverge

father, husband, son, brother, data hacker, seeking optimization

helpful tips on sampling from beta distributions (getting from a to b to means and modes)….

2 thoughts on “helpful tips on sampling from beta distributions (getting from a to b to means and modes)….”

Leave a comment Cancel reply

Share this:

Related

2 thoughts on “helpful tips on sampling from beta distributions (getting from a to b to means and modes)….”

Leave a comment Cancel reply