Rebel Bayes Day 2

Prior beliefs about Bayesian statistics, updated by reading Statistical Rethinking by Richard McElreath.

Duncan Garmonsway
February 19, 2019

Reading week

This week I am reading Statistical Rethinking by Richard McElreath. Each day I post my prior beliefs about Bayesian Statistics, read a bit, and update them. See also Day 1, Day 3, Day 4 and Day 5.

Prior beliefs

  1. Inference is a mug’s game. Unless you have so much data that you don’t need stats, then you don’t have enough data to do stats.
  2. Linear models combine values of features of a subject, to specify the parameters of a distribution, from which you draw a prediction of a value of a feature.
  3. Distributions are normal because everything is either in one state or another, and the combination of many yes/no decisions is approximately normal (a quick simulation sketch follows this list).
  4. Don’t you dare do this unless everything is independent and identically distributed.
  5. A Bayesian linear model with uniform priors is exactly the same as a frequentist linear model.
  6. The distributions of Bayesian parameter estimates are hard to interpret because they are all conditional on each other.
  7. It’s a pain to specify priors.
  8. Polynomial regression is the very devil because it will make false prophecies (overfitting) and polynomial relationships don’t occur in nature.
  9. Categorical variables don’t have distributions – treat them as probabilities and pretend everything’s okay.
  10. Ordinary Least Squares worked for my grandfather and if you have to use some other obscure method to make the data support you then you’re up to no good.
  11. Even Bayes can’t protect you from your own willpower to find signal in noise.
  12. Even Bayes can’t detect signals buried in ambiguous, indirect observation.
  13. The AIC Information Criterion isn’t how acronyms work but it’s how Google search terms work.
  14. Bayesians invented the BIC to be more punishing than the AIC and retake the moral high ground.
  15. p-values are contrived, essentially meaningless and nobody knows how they really work – oh look there’s an information criterion let’s use that!
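
To make belief 3 concrete, here is a minimal simulation sketch (mine, not the book's, with arbitrary sample sizes): code each yes/no decision as a ±1 step, add lots of them up, and the totals come out roughly Gaussian.

```python
# Minimal sketch: many summed yes/no fluctuations look approximately normal.
# Sample sizes are arbitrary; nothing here comes from the book.
import numpy as np

rng = np.random.default_rng(2019)

n_subjects = 10_000   # how many totals to simulate
n_decisions = 100     # how many +/-1 "decisions" go into each total

steps = rng.choice([-1, 1], size=(n_subjects, n_decisions))
totals = steps.sum(axis=1)

# The totals should match the normal approximation N(0, sqrt(n_decisions)).
print("simulated mean:", totals.mean())       # close to 0
print("simulated sd:  ", totals.std())        # close to 10
print("theoretical sd:", np.sqrt(n_decisions))
```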

Undecided from previous days

  1. If computers had been invented before frequentist statistics, nobody would have invented frequentist statistics.
  2. Bayesian A/B testing is the acceptable face of early stopping, aka ethical and pragmatic experimental design.
  3. The Bayesian revival was masterminded by publishers to double their market by publishing Bayesian variants of everything.
  4. Physicists worked out all the useful Bayesian methods ages ago but they have way cooler things to boast about.
  5. WinBUGS is a leading indicator of a bad course.
  6. STAN is the man.

New data

4. Linear Models

4.1 Why normal distributions are normal

4.2 A language for describing models

4.3 A Gaussian model of height

4.4 Adding a predictor

4.5 Polynomial regression

5. Multivariate Linear Models

5.1 Spurious association

5.2 Masked relationship

5.3 When adding variables hurts

5.4 Categorical variables

No notes.

5.5 Ordinary least squares and lm

6. Overfitting, Regularization, and Information Criteria

6.1 The problem with parameters

6.2 Information theory and model performance

6.3 Regularization

6.4 Information Criteria

6.5 Using information criteria

Updated beliefs

  1. ✓ Inference is a mug’s game. Unless you have so much data that you don’t need stats, then you don’t have enough data to do stats. This is harsh, but it does seem that the only safe purposes of a model are prediction and suggesting other things to examine, and that any interpretation of parameters is unwise.
  2. ✓ Linear models combine values of features of a subject, to specify the parameters of a distribution, from which you draw a prediction of a value of a feature.
  3. ✓ Distributions are normal because everything is either in one state or another, and the combination of many yes/no decisions is approximately normal. The text adds a lot more maths to this, but the gist is there.
  4. ✕ Don’t you dare do this unless everything is independent and identically distributed. Apparently, like everything else, it might be okay as long as [litany of checks and balances].
  5. ✓ A Bayesian linear model with uniform priors is exactly the same as a frequentist linear model (a sketch checking this follows the list).
  6. ✓ The distributions of Bayesian parameter estimates are hard to interpret because they are all conditional on each other. Not Bayes’ fault. The world is difficult to understand, and so are difficult models of it.
  7. It’s a pain to specify priors. Not explicitly addressed in the text, but the author knows the distributions well enough to choose parameter values for the shape he wants, rather than having to work back from the shape to the parameters or adjust after guessing.
  8. ✓ Polynomial regression is the very devil because it will make false prophecies (overfitting) and polynomial relationships don’t occur in nature. Perhaps first-order ones do.
  9. ? Categorical variables don’t have distributions – treat them as probabilities and pretend everything’s okay. No logits were mentioned.
  10. ✓ Ordinary Least Squares worked for my grandfather and if you have to use some other obscure method to make the data support you then you’re up to no good. It turns out to be at the heart of Bayesian methods.
  11. ✓ Even Bayes can’t protect you from your own willpower to find signal in noise.
  12. ✓ Even Bayes can’t detect signals buried in ambiguous, indirect observation.
  13. ✓ The AIC Information Criterion isn’t how acronyms work but it’s how Google search terms work.
  14. ✕ Bayesians invented the BIC to be more punishing than the AIC and retake the moral high ground. I was utterly wrong about this.
  15. p-values are contrived, essentially meaningless and nobody knows how they really work – oh look there’s an information criterion let’s use that! I don’t see how using an information criterion to inform a decision is any more justifiable than using a p-value. They’re all rules of thumb.
  16. ✕ The Bayesian revival was masterminded by publishers to double their market by publishing Bayesian variants of everything. Bayesian methods seem more and more to be generalisations of frequentist methods.
  17. ✓ If computers had been invented before frequentist statistics, nobody would have invented frequentist statistics. See above.
  18. ✓ WinBUGS is a leading indicator of a bad course. There’s no way this book could be translated to WinBUGS.
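
To check updated belief 5 for myself, here is a minimal sketch in Python rather than the book's R (the data are simulated and every name in it is mine): with flat priors the posterior is proportional to the likelihood, so the maximum a posteriori estimate of a Gaussian linear model should land on the ordinary least squares solution.

```python
# Minimal sketch: with flat priors the posterior is proportional to the
# likelihood, so the MAP estimate of a Gaussian linear model should match OLS.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
y = 1.5 + 2.0 * x + rng.normal(scale=0.5, size=n)  # "true" intercept 1.5, slope 2.0

def negative_log_posterior(params):
    a, b, log_sigma = params            # flat priors contribute nothing here
    mu = a + b * x
    return -norm.logpdf(y, loc=mu, scale=np.exp(log_sigma)).sum()

map_fit = minimize(negative_log_posterior, x0=[0.0, 0.0, 0.0])
a_map, b_map, _ = map_fit.x

# Ordinary least squares for comparison.
X = np.column_stack([np.ones(n), x])
(a_ols, b_ols), *_ = np.linalg.lstsq(X, y, rcond=None)

print("MAP:", a_map, b_map)
print("OLS:", a_ols, b_ols)
```

The two sets of coefficients should agree to within optimiser tolerance, which is the point of belief 5.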

Still undecided from previous days

  1. ? Bayesian A/B testing is the acceptable face of early stopping, aka ethical and pragmatic experimental design.
  2. ? Physicists worked out all the useful Bayesian methods ages ago but they have way cooler things to boast about.
  3. ? STAN is the man.

Critic’s Choice

My new favourite illustration of overfitting is the plot of different polynomial regressions fitted to leave-one-out samples of one data set (p173). I like the idea of talking about the strength of a prior in terms of which data would lead to the same posterior distribution. I also like the idea of using the Gaussian likelihood merely to estimate the mean and variance of a variable.
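
For what it's worth, the flavour of that leave-one-out plot is easy to re-create. The sketch below uses simulated data rather than the book's, and all the names are my own: drop one point at a time, refit polynomials of increasing degree, and measure how far apart the refitted curves end up. The straight line barely moves; the high-degree fits swing about.

```python
# Rough sketch of the overfitting illustration: drop one point at a time,
# refit, and see how much the fitted curve changes with polynomial degree.
import numpy as np

rng = np.random.default_rng(7)
n = 10
x = np.linspace(-2, 2, n)
y = 0.5 * x + rng.normal(scale=0.5, size=n)

def loo_spread(degree):
    """Refit a polynomial of the given degree with each point left out,
    then summarise how much the fitted curves disagree on a fine grid."""
    grid = np.linspace(-2, 2, 50)
    fits = []
    for i in range(n):
        keep = np.arange(n) != i
        coefs = np.polyfit(x[keep], y[keep], deg=degree)
        fits.append(np.polyval(coefs, grid))
    return np.ptp(np.array(fits), axis=0).mean()  # average spread across fits

for degree in (1, 2, 4, 6):
    print(f"degree {degree}: mean leave-one-out spread = {loo_spread(degree):.2f}")
```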

The general impression of these chapters is that Bayesian methods are the same as frequentist ones, with the following differences:

The usual limitations of data and modelling remain.
