jthaman's blurblog

Teaching general problem-solving skills is not a substitute for teaching math [pdf] (2010) by JustinSkycak
Saturday July 6^th, 2024 at 8:16 PM

Hacker News: Newest

Article URL: https://www.ams.org/notices/201010/rtx101001303p.pdf

Comments URL: https://news.ycombinator.com/item?id=40890847

Points: 179

# Comments: 143

Read the whole story

jthaman

20 days ago

reply

Maryland

Is there a balance to be struck between simple hierarchical models and more complex hierarchical models that augment the simple frameworks with more modeled interactions when analyzing real data? by Andrew
Thursday May 30^th, 2024 at 6:38 PM

Statistical Modeling, Causal Inference, and Social Science

Kiran Gauthier writes:

After attending your talk at the University of Minnesota, I wanted to ask a follow up regarding the structure of hierarchical / multilevel models but we ran out of time. Do you have any insight on the thought that probabilistic programming languages are so flexible, and the Bayesian inference algorithms so fast, that there is a balance to be struck between “simple” hierarchical models and more “complex” hierarchical models that augment the simple frameworks with more modeled interactions when analyzing real data?

I think that a real benefit of the Bayesian paradigm is that (in theory) if the data doesn’t converge my uncertainty in a parameter, then the inference engine should return my prior (or something close to it). Does this happen in reality? I know you’ve written about canary variables before as an indication of model misspecification which I think is an awesome idea, I’m just wondering how to strike that balance between a simple / approximate model, and a more complicated model given that the true generative process is unknown, and noisy data with bad models can lead good inference engines astray.

My reply: I think complex models are better. As Radford Neal put it so memorably, nearly thirty years ago,

Sometimes a simple model will outperform a more complex model . . . Nevertheless, I believe that deliberately limiting the complexity of the model is not fruitful when the problem is evidently complex. Instead, if a simple model is found that outperforms some particular complex model, the appropriate response is to define a different complex model that captures whatever aspect of the problem led to the simple model performing well.

That said, I don’t recommend fitting the complex model on its own. Rather, I recommend building up to it from something simpler. This building-up occurs on two time scales:

1. When working on your particular problem, start with simple comparisons and then fit more and more complicated models until you have what you want.

2. Taking the long view, as our understanding of statistics progresses, we can understand more complicated models and fit them routinely. This is kind of the converse of the idea that statistical analysis recapitulates the development of statistical methods.

Read the whole story

jthaman

57 days ago

reply

Maryland

On the Term “Randomization Test” by Jesse HemerikEconometric Institute, Erasmus University, Rotterdam, The Netherlands
Wednesday April 3^rd, 2024 at 9:13 PM

tandf: The American Statistician: Table of Contents

.

Read the whole story

jthaman

114 days ago

reply

Maryland

Bayesian Calibration of p‐Values from Fisher's Exact Test by Manuela Ott, Leonhard Held
Friday February 24^th, 2023 at 9:10 AM

Wiley: International Statistical Review: Table of Contents

Summary

p‐Values are commonly transformed to lower bounds on Bayes factors, so‐called minimum Bayes factors. For the linear model, a sample‐size adjusted minimum Bayes factor over the class of g‐priors on the regression coefficients has recently been proposed (Held & Ott, The American Statistician 70(4), 335–341, 2016). Here, we extend this methodology to a logistic regression to obtain a sample‐size adjusted minimum Bayes factor for 2 × 2 contingency tables. We then study the relationship between this minimum Bayes factor and two‐sided p‐values from Fisher's exact test, as well as less conservative alternatives, with a novel parametric regression approach. It turns out that for all p‐values considered, the maximal evidence against the point null hypothesis is inversely related to the sample size. The same qualitative relationship is observed for minimum Bayes factors over the more general class of symmetric prior distributions. For the p‐values from Fisher's exact test, the minimum Bayes factors do on average not tend to the large‐sample bound as the sample size becomes large, but for the less conservative alternatives, the large‐sample behaviour is as expected.

Read the whole story

jthaman

518 days ago

reply

Maryland

Those who live by ChatGPT are destined to get advice of unpredictable quality by xi'an
Tuesday January 17^th, 2023 at 10:19 PM

Xi'an's Og

Read the whole story

jthaman

556 days ago

reply

Maryland

Improving a graph
Wednesday November 2^nd, 2022 at 10:41 PM

Biased and Inefficient

A lot of buses are being cancelled in Auckland at the moment. This is partly due to Covid, but also due to difficulty in recruiting bus drivers because of poor pay and conditions. And probably other reasons, too. I’ve put about six weeks of daily cancellation data in a Github gist Here’s a default graph: d<-read.table("https://gist.githubusercontent.com/tslumley/9ac8df14309ecc5936183de84b57c987/raw/9ebf665b2ff9a93c1dbc73caf5ff346909899827/busdata.txt",header=TRUE) d$date<-as.Date(paste(2022, d$mo, d$d,sep="-")) plot(cancels~date, data=d) There are a lot of cancellations, but otherwise it’s not all that clear.

Read the whole story

jthaman

632 days ago

reply

Maryland

Teaching general problem-solving skills is not a substitute for teaching math [pdf] (2010) by JustinSkycak Saturday July 6th, 2024 at 8:16 PM

Is there a balance to be struck between simple hierarchical models and more complex hierarchical models that augment the simple frameworks with more modeled interactions when analyzing real data? by Andrew Thursday May 30th, 2024 at 6:38 PM

On the Term “Randomization Test” by Jesse HemerikEconometric Institute, Erasmus University, Rotterdam, The Netherlands Wednesday April 3rd, 2024 at 9:13 PM

Bayesian Calibration of p‐Values from Fisher's Exact Test by Manuela Ott, Leonhard Held Friday February 24th, 2023 at 9:10 AM

Summary

Those who live by ChatGPT are destined to get advice of unpredictable quality by xi'an Tuesday January 17th, 2023 at 10:19 PM

Improving a graph Wednesday November 2nd, 2022 at 10:41 PM

Teaching general problem-solving skills is not a substitute for teaching math [pdf] (2010) by JustinSkycak
Saturday July 6^th, 2024 at 8:16 PM

Is there a balance to be struck between simple hierarchical models and more complex hierarchical models that augment the simple frameworks with more modeled interactions when analyzing real data? by Andrew
Thursday May 30^th, 2024 at 6:38 PM

On the Term “Randomization Test” by Jesse HemerikEconometric Institute, Erasmus University, Rotterdam, The Netherlands
Wednesday April 3^rd, 2024 at 9:13 PM

Bayesian Calibration of p‐Values from Fisher's Exact Test by Manuela Ott, Leonhard Held
Friday February 24^th, 2023 at 9:10 AM

Those who live by ChatGPT are destined to get advice of unpredictable quality by xi'an
Tuesday January 17^th, 2023 at 10:19 PM

Improving a graph
Wednesday November 2^nd, 2022 at 10:41 PM