
Is there a balance to be struck between simple hierarchical models and more complex hierarchical models that augment the simple frameworks with more modeled interactions when analyzing real data?


Kiran Gauthier writes:

After attending your talk at the University of Minnesota, I wanted to ask a follow-up question about the structure of hierarchical / multilevel models, but we ran out of time. Probabilistic programming languages are now so flexible, and Bayesian inference algorithms so fast, that it seems there is a balance to be struck between "simple" hierarchical models and more "complex" ones that augment the simple framework with additional modeled interactions. Do you have any insight on how to strike that balance when analyzing real data?

I think that a real benefit of the Bayesian paradigm is that (in theory) if the data don't constrain my uncertainty in a parameter, the inference engine should return my prior (or something close to it). Does this happen in reality? I know you've written about canary variables as an indicator of model misspecification, which I think is an awesome idea. I'm just wondering how to strike that balance between a simple, approximate model and a more complicated one, given that the true generative process is unknown, and noisy data combined with a bad model can lead a good inference engine astray.

My reply: I think complex models are better. As Radford Neal put it so memorably, nearly thirty years ago,

Sometimes a simple model will outperform a more complex model . . . Nevertheless, I believe that deliberately limiting the complexity of the model is not fruitful when the problem is evidently complex. Instead, if a simple model is found that outperforms some particular complex model, the appropriate response is to define a different complex model that captures whatever aspect of the problem led to the simple model performing well.

That said, I don’t recommend fitting the complex model on its own. Rather, I recommend building up to it from something simpler. This building-up occurs on two time scales:

1. When working on your particular problem, start with simple comparisons and then fit more and more complicated models until you have what you want.

2. Taking the long view, as our understanding of statistics progresses, we can understand more complicated models and fit them routinely. This is kind of the converse of the idea that statistical analysis recapitulates the development of statistical methods.
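The questioner's premise above, that with weak or noisy data the posterior should fall back toward the prior, can be checked in a toy conjugate example. The sketch below is my own illustration, not from the post; the function name is made up. It uses the closed-form normal-normal posterior for a mean with known data standard deviation:

```python
def posterior_mean_sd(mu0, tau0, ybar, sigma, n):
    """Posterior of a normal mean: prior N(mu0, tau0^2), n observations
    with sample mean ybar and known data sd sigma."""
    prec = 1 / tau0**2 + n / sigma**2                      # posterior precision
    mean = (mu0 / tau0**2 + n * ybar / sigma**2) / prec    # precision-weighted average
    return mean, prec ** -0.5

# Prior belief: mu ~ N(0, 1); the data actually average 5.
weak_mean, weak_sd = posterior_mean_sd(0.0, 1.0, ybar=5.0, sigma=10.0, n=2)
strong_mean, strong_sd = posterior_mean_sd(0.0, 1.0, ybar=5.0, sigma=10.0, n=2000)

print(weak_mean)    # with 2 noisy observations, stays near the prior mean 0
print(strong_mean)  # with 2000 observations, driven toward the data mean 5
```

With two noisy observations the posterior mean is about 0.1 (essentially the prior); with two thousand it is about 4.8, and the posterior standard deviation shrinks accordingly.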


On the Term “Randomization Test”


Bayesian Calibration of p‐Values from Fisher's Exact Test


Summary

p-Values are commonly transformed to lower bounds on Bayes factors, so-called minimum Bayes factors. For the linear model, a sample-size adjusted minimum Bayes factor over the class of g-priors on the regression coefficients has recently been proposed (Held & Ott, The American Statistician 70(4), 335–341, 2016). Here, we extend this methodology to logistic regression to obtain a sample-size adjusted minimum Bayes factor for 2 × 2 contingency tables. We then study the relationship between this minimum Bayes factor and two-sided p-values from Fisher's exact test, as well as less conservative alternatives, with a novel parametric regression approach. It turns out that for all p-values considered, the maximal evidence against the point null hypothesis is inversely related to the sample size. The same qualitative relationship is observed for minimum Bayes factors over the more general class of symmetric prior distributions. For the p-values from Fisher's exact test, the minimum Bayes factors do not, on average, tend to the large-sample bound as the sample size becomes large, but for the less conservative alternatives, the large-sample behaviour is as expected.


Those who live by ChatGPT are destined to get advice of unpredictable quality


Improving a graph

A lot of buses are being cancelled in Auckland at the moment. This is partly due to Covid, but also due to difficulty in recruiting bus drivers because of poor pay and conditions. And probably other reasons, too. I've put about six weeks of daily cancellation data in a GitHub gist. Here's a default graph:

d <- read.table("https://gist.githubusercontent.com/tslumley/9ac8df14309ecc5936183de84b57c987/raw/9ebf665b2ff9a93c1dbc73caf5ff346909899827/busdata.txt", header=TRUE)
d$date <- as.Date(paste(2022, d$mo, d$d, sep="-"))
plot(cancels ~ date, data=d)

There are a lot of cancellations, but otherwise it's not all that clear.

Let's practice what we preach: Planning and interpreting simulation studies with design and analysis of experiments

Abstract

Statisticians recommend design and analysis of experiments (DAE) for evidence-based research but often use tables to present their own simulation studies. Could DAE do better? We outline how DAE methods can be used to plan and analyze simulation studies. Tools for planning include cause-and-effect diagrams and factorial and fractional factorial designs. Analysis is carried out via analysis of variance, main effect and interaction plots, and other DAE tools. We also demonstrate how Taguchi robust parameter design can be used to study the robustness of methods to a variety of uncontrollable population parameters.