Wednesday, June 26, 2013

Doing Bayesian Data Analysis at Jacob Bernoulli's grave

The book visited the grave of Jacob Bernoulli (1655-1705) in Basel, Switzerland, as shown in these photos taken by Dr. Benjamin Scheibehenne. Jacob Bernoulli pre-dated Bayes (1701-1761), of course, but Bernoulli established foundational concepts and theorems of probability.

The cutie admiring the book in the first photo is Benjamin's daughter. Many thanks to Benjamin and family for venturing out to take these photos. Thanks also to Benjamin and colleagues for hosting my visit to Basel for a day back in June, during which we unfortunately did not have time to visit Jacob Bernoulli's grave.

If you find this amusing, you might also like the book at Bayes' tomb and the book at R. A. Fisher's remains.

Sunday, June 16, 2013

Bayesian robust regression for Anscombe quartet

In 1973, Anscombe presented four data sets that have become a classic illustration for the importance of graphing the data, not merely relying on summary statistics. The four data sets are now known as "Anscombe's quartet." Here I present a Bayesian approach to modeling the data. The Anscombe quartet is used here as merely another illustration of two frequently made points in Bayesian analysis: JAGS/BUGS is a very flexible language for expressing a variety of models (e.g., robust non-linear regression), and it is important to conduct a posterior predictive check of the descriptive adequacy of the model.

(I first messed around with analyzing Anscombe's quartet back in October 2010, for personal interest. A conversation I had last week in Basel, Switzerland, suggested that this analysis might interest a larger audience. Hence this blog post.)

Below are the four data sets, shown as black circles. They are modeled with typical linear regression that assumes normality:
y ~ dnorm( b0+b1*x , sigma )
with vague priors on the intercept b0, the slope b1, and the standard deviation sigma.

Actually, for easy consistency with the subsequent analysis, I used a t distribution with the normality (df) parameter given a prior that insists it is very large, near 500, which is essentially normal. Thus, the model is
y ~ dt( b0+b1*x , sigma , nu )
with vague priors on b0, b1, and sigma, and the prior on normality being
nu ~ dnorm( 500 , sd=0.1 )

The figures below show the data in the left panels, with a smattering of credible regression lines superimposed, and veritical bars that indicate the 95% HDI of posterior predicted y values at selected x values:

Notice that the marginal parameter estimates are the same for all four data sets (up to MCMC sampling error), which is exactly the point that Anscombe wanted to emphasize for least-squares regression. Here we see that it is also true for Bayesian estimation, and this is no surprise when we remember from formal Bayesian analysis of linear regression (with normally distributed noise) that the "sufficient statistics" of the  data determine the posterior distribution, and the sufficient statistics are identical for the four data  sets.

One of the beauties of the flexibility of JAGS/BUGS is that it is easy to implement other models. In particular, it is easy to implement regression that is robust to outliers, using the t distribution. I used the same model as above,

y ~ dt( b0+b1*x , sigma , nu )
with vague priors on b0, b1, and sigma, but I put a vague prior on the normality of the t distribution:
nu <- nuMinusOne + 1
nuMinusOne ~ dexp( 1/29 )
This is the same prior on nu as is used in the "BEST" model for robust comparison of two groups. The resulting posteriors look like this:

The normality parameter (nu) is not much different from its prior for Types 1, 2, and 4, resulting in slightly smaller estimates of sigma than when forcing nu to be 500.  Most notably, however, for Type 3, the model detects that the data fall on a line and have a single outlier, indicated by the SD (sigma) estimated as an extremely small value, and the normality (nu) hugging 1.0, which is its smallest possible.

Analogously, we could easily do a robust quadratic trend analysis, and thereby detect the non-linear trend in Type 2. See this previous blog post for how to extend a JAGS/BUGS linear regression to include a quadratic trend.

To reiterate and conclude, the Anscombe quartet not only makes its original point, that we should look at the posterior predictions to make sure that the model is a reasonable description of the data, but it also highlights that for Bayesian analysis using JAGS/BUGS, it is easy to try other models that may be better descriptions of the data.
ADDED 18 JUNE 2013: Rasmus Baath, after reading the above post, has actually implemented the extensions that I merely suggested above. Please continue reading his very interesting follow up.

Tuesday, June 4, 2013

New R Package for BEST (Bayesian ESTimation supersedes the t test)

A completely re-packaged version of the BEST software (from the article, "Bayesian estimation supersedes the t test") has been prepared by Michael E. Meredith. Mike is a key member of the Wildlife Conservation Society in Malaysia. For his new R package, Mike included additional MCMC diagnostic information, combined the two-group and one-group cases into a single function, made additional plot options and numerical-summary output, made the whole thing match conventional R style for plot commands, wrote new documentation, etc. Mike did all this completely independently of any prompt or help from me. I am very grateful for his efforts. The package looks great, and here's how to get it:

Open R. At the R command prompt, simply type:
Or, in RStudio, go through menu Tools,  Install Packages. You only need to install the BEST package once.

[Updated 05 June: When I initially posted this announcement yesterday, the binaries were not yet available on CRAN, and so I provided a set of instructions for how to install the package from its source code. Since then, the binaries have been posted, and installation takes only the single step above. Binaries for MacOS might be delayed another day or so.]

At the command line in R, type
The functions for BEST are now in R's working memory. You need to do this command every time you invoke R.

to display a help file with a complete example of using the functions. Further documentation is available in this PDF that Mike prepared. Please note that the syntax of Mike's R package is a bit different than the original BEST programs. Perhaps the best way to learn the new syntax is to run the examples provided in the help file and PDF.

If you like the package, please let Mike know; his e-mail address is shown in the help files.