Article URL: https://www.ams.org/notices/201010/rtx101001303p.pdf
Comments URL: https://news.ycombinator.com/item?id=40890847
Points: 179
# Comments: 143
Article URL: https://www.ams.org/notices/201010/rtx101001303p.pdf
Comments URL: https://news.ycombinator.com/item?id=40890847
Points: 179
# Comments: 143
Kiran Gauthier writes:
After attending your talk at the University of Minnesota, I wanted to ask a follow up regarding the structure of hierarchical / multilevel models but we ran out of time. Do you have any insight on the thought that probabilistic programming languages are so flexible, and the Bayesian inference algorithms so fast, that there is a balance to be struck between “simple” hierarchical models and more “complex” hierarchical models that augment the simple frameworks with more modeled interactions when analyzing real data?
I think that a real benefit of the Bayesian paradigm is that (in theory) if the data doesn’t converge my uncertainty in a parameter, then the inference engine should return my prior (or something close to it). Does this happen in reality? I know you’ve written about canary variables before as an indication of model misspecification which I think is an awesome idea, I’m just wondering how to strike that balance between a simple / approximate model, and a more complicated model given that the true generative process is unknown, and noisy data with bad models can lead good inference engines astray.
My reply: I think complex models are better. As Radford Neal put it so memorably, nearly thirty years ago,
Sometimes a simple model will outperform a more complex model . . . Nevertheless, I believe that deliberately limiting the complexity of the model is not fruitful when the problem is evidently complex. Instead, if a simple model is found that outperforms some particular complex model, the appropriate response is to define a different complex model that captures whatever aspect of the problem led to the simple model performing well.
That said, I don’t recommend fitting the complex model on its own. Rather, I recommend building up to it from something simpler. This building-up occurs on two time scales:
1. When working on your particular problem, start with simple comparisons and then fit more and more complicated models until you have what you want.
2. Taking the long view, as our understanding of statistics progresses, we can understand more complicated models and fit them routinely. This is kind of the converse of the idea that statistical analysis recapitulates the development of statistical methods.
p‐Values are commonly transformed to lower bounds on Bayes factors, so‐called minimum Bayes factors. For the linear model, a sample‐size adjusted minimum Bayes factor over the class of g‐priors on the regression coefficients has recently been proposed (Held & Ott, The American Statistician 70(4), 335–341, 2016). Here, we extend this methodology to a logistic regression to obtain a sample‐size adjusted minimum Bayes factor for 2 × 2 contingency tables. We then study the relationship between this minimum Bayes factor and two‐sided p‐values from Fisher's exact test, as well as less conservative alternatives, with a novel parametric regression approach. It turns out that for all p‐values considered, the maximal evidence against the point null hypothesis is inversely related to the sample size. The same qualitative relationship is observed for minimum Bayes factors over the more general class of symmetric prior distributions. For the p‐values from Fisher's exact test, the minimum Bayes factors do on average not tend to the large‐sample bound as the sample size becomes large, but for the less conservative alternatives, the large‐sample behaviour is as expected.