Tag: Bayesian

Beta approximation to binomial
It is well-known that you can approximate a binomial distribution with a normal distribution. Of course there are a few provisos … It is also well-known that you can approximate a beta distribution with a normal distribution as well. This means you could directly approximate a binomial distribution with a …
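A minimal sketch of the idea using scipy. Moment matching is one natural way to pick the approximating beta; it is an assumption here, not necessarily the construction the post uses.

```python
from scipy.stats import binom, beta

# Approximate X ~ Binomial(n, p) by n*Y where Y ~ Beta(a, b),
# choosing a, b so that X/n and Y have the same mean and variance.
n, p = 100, 0.3
m = p                        # mean of X/n
v = p * (1 - p) / n          # variance of X/n
c = m * (1 - m) / v - 1
a, b = m * c, (1 - m) * c

# Compare the two CDFs at a few points
for k in (20, 30, 40):
    print(k, binom.cdf(k, n, p), beta.cdf(k / n, a, b))
```

The two CDF columns agree to within a few hundredths, which is roughly the accuracy one expects from the intermediate normal approximations.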
Can you have confidence in a confidence interval?
“The only use I know for a confidence interval is to have confidence in it.” — L. J. Savage Can you have confidence in a confidence interval? In practice, yes. In theory, no. If you have a 95% confidence interval for a parameter θ, can you be 95% sure that …
First names and Bayes’ theorem
Is the woman in this photograph more likely to be named Esther or Caitlin? Yesterday Mark Jason Dominus wrote about statistics on first names in the US from 1960 to 2021. For each year and state, the data tell how many boys and girls were given each name. Reading …
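The Bayes'-theorem step can be sketched with made-up birth counts. The numbers below are purely hypothetical, invented for illustration; the post works from the real US name data.

```python
# Hypothetical birth counts by decade (illustrative only;
# the post uses actual US name data from 1960-2021)
births = {
    "Esther":  {"1960s": 9000, "2000s": 2000},
    "Caitlin": {"1960s":  100, "2000s": 12000},
}

def prob_name_given_decade(name, decade):
    # P(name | decade): among people born in that decade with one of
    # these two names, the fraction given `name`
    total = sum(births[n][decade] for n in births)
    return births[name][decade] / total

# If the photo suggests someone born in the 1960s, Esther dominates:
print(prob_name_given_decade("Esther", "1960s"))
```

Conditioning on apparent age turns a question about a photo into a counting exercise over birth cohorts.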
We Live in a Bayesian World
“Fail fast, pivot, and try again” is the heart of learning. And in knowledge-based industries, the economies of learning are more powerful than the economies of scale. In February 2020, Dr. Anthony Fauci wrote that store-bought face masks would not be very effective at protecting against the COVID-19 pandemic and …
A Bayesian approach to pricing
Suppose you want to determine how to price a product and you initially don’t know what the market is willing to pay. This post outlines some of the things you might think about, and how Bayesian modeling might help. This post is not the final word on the subject, or …
MLflow for Bayesian Experiment Tracking
This post is the third in a series on Bayesian inference ([1], [2] ). Here we will illustrate how to use managed MLflow on Databricks to perform and track Bayesian experiments using the Python package PyMC3. This results in systematic and reproducible ML experimentation pipelines that can be shared across …
Universal confidence interval
Here’s a way to find a 95% confidence interval for any parameter θ. With probability 0.95, return the real line. With probability 0.05, return the empty set. Clearly 95% of the time this procedure will return an interval that contains θ. This example shows the difference between a confidence interval …
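The coverage claim is easy to check by simulation; here is a short sketch of the procedure described above.

```python
import random

def useless_interval():
    # With probability 0.95 return the whole real line,
    # with probability 0.05 return the empty set
    if random.random() < 0.95:
        return (float("-inf"), float("inf"))
    return None  # empty set

random.seed(0)
theta = 42.0          # the true parameter; any value works
trials = 100_000
covered = 0
for _ in range(trials):
    iv = useless_interval()
    if iv is not None and iv[0] < theta < iv[1]:
        covered += 1

print(covered / trials)  # close to 0.95, yet the interval is worthless
```

The empirical coverage is about 95% no matter what theta is, which is exactly what a confidence procedure promises and exactly why coverage alone is not enough.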
Confidence interval widths
Suppose you do N trials of something that can succeed or fail. After your experiment you want to present a point estimate and a confidence interval. Or if you’re a Bayesian, you want to present a posterior mean and a credible interval. The numerical results hardly differ, though the two …
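A small scipy sketch of the comparison. The uniform prior and the Wald (normal-approximation) interval are assumptions made for illustration, not necessarily the post's choices.

```python
from scipy.stats import beta

n, s = 100, 37                      # trials and successes
p_hat = s / n

# Frequentist: normal-approximation (Wald) 95% confidence interval
se = (p_hat * (1 - p_hat) / n) ** 0.5
conf_int = (p_hat - 1.96 * se, p_hat + 1.96 * se)

# Bayesian: uniform Beta(1, 1) prior gives a Beta(1+s, 1+n-s)
# posterior; take the central 95% credible interval
post = beta(1 + s, 1 + n - s)
cred_int = (post.ppf(0.025), post.ppf(0.975))

print(conf_int)
print(cred_int)   # numerically close, philosophically different
```

For moderate n the endpoints agree to a couple of decimal places, even though the two intervals answer different questions.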
Scaled Beta2 distribution
I recently ran across a new probability distribution called the “Scaled Beta2” or “SBeta2” for short in [1]. It takes a positive argument x and positive parameters p, q, and b. This is a heavy-tailed distribution: for large x, the probability density is O(x^(−q−1)), the same as a …
Predictive probability for large populations
Suppose you want to estimate the number of patients who respond to some treatment in a large population of size N and what you have is data on a small sample of size n. The usual way of going about this calculates the proportion of responses in the small sample, …
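The plug-in calculation, and a simple Bayesian alternative, can be sketched in a few lines. The uniform prior below is an assumption for illustration, not necessarily what the post uses.

```python
N, n, s = 100_000, 20, 8      # population, sample size, responses seen

# The usual plug-in approach: scale the sample proportion up to N
plug_in = N * s / n

# A simple Bayesian alternative (uniform prior, assumed here):
# Beta(1+s, 1+n-s) posterior on the response rate, so the expected
# total number of responders is the s already observed plus the
# posterior mean times the N - n patients not yet observed
a, b = 1 + s, 1 + n - s
predictive_mean = s + (N - n) * a / (a + b)

print(plug_in, predictive_mean)
```

The point estimates are close here; the larger difference shows up in the uncertainty, since a sample of 20 says little about a population of 100,000.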
Book Review: Bayesian Statistics the Fun Way by Will Kurt
"Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, Lego, and Rubber Ducks," by Will Kurt (No Starch Press, 2019) is an excellent introduction to subjects critical to all data scientists. Will Kurt, in fact, is a data scientist! I always advise my data science classes at …
Automatic data reweighting
Suppose you are designing an autonomous system that will gather data and adapt its behavior to that data. At first you face the so-called cold-start problem. You don’t have any data when you first turn the system on, and yet the system needs to do something before it has accumulated …
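One standard Bayesian way to get this behavior is to let a prior act as pseudo-data. The beta-Bernoulli sketch below, with an illustrative prior chosen here rather than taken from the post, shows the reweighting happening automatically: the prior dominates at first, and the data take over as they accumulate.

```python
# The prior pseudo-counts give the system something to act on before
# any real data arrive (the cold start); each observation then shifts
# weight from the prior to the data with no manual tuning.
prior_a, prior_b = 2, 2           # illustrative prior pseudo-counts
a, b = prior_a, prior_b
observations = [1, 1, 0, 1, 1, 1, 0, 1]

for i, y in enumerate(observations, start=1):
    a += y
    b += 1 - y
    # posterior mean is a weighted average of prior mean and sample mean;
    # the prior's weight shrinks as 4 / (4 + i)
    prior_weight = (prior_a + prior_b) / (prior_a + prior_b + i)
    print(i, round(a / (a + b), 3), round(prior_weight, 3))
```

After eight observations the prior carries only a third of the weight; after a hundred it would be nearly negligible.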
HyperOpt: Bayesian Hyperparameter Optimization
This article covers how to perform hyperparameter optimization using a sequential model-based optimization (SMBO) technique implemented in the HyperOpt Python package. There is a complementary Domino project available. Feature engineering and hyperparameter optimization are two important model-building steps. Over the years, I have debated with many colleagues as …
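The core SMBO idea is to let past trials guide where to sample next. The toy loop below illustrates only that idea; HyperOpt's actual TPE algorithm is far more sophisticated, and everything here (the objective, the sampling rule) is invented for illustration.

```python
import random

def loss(x):
    # toy objective with its minimum at x = 3
    return (x - 3) ** 2

# Toy sequential loop: mostly sample near the best point seen so far
# (with a shrinking step size), occasionally explore at random.
random.seed(1)
history = []
for trial in range(200):
    if history and random.random() < 0.8:
        best_x = min(history, key=lambda h: h[1])[0]
        x = best_x + random.gauss(0, 10 / (1 + trial))   # exploit
    else:
        x = random.uniform(-10, 10)                      # explore
    history.append((x, loss(x)))

best = min(history, key=lambda h: h[1])
print(best)
```

Compared with grid or pure random search, the trial budget concentrates where past results were good, which is the property HyperOpt exploits at scale.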
Multi-arm adaptively randomized clinical trials
This post will look at adaptively randomized trial designs. In particular, we want to focus on multi-arm trials, i.e. trials of more than two treatments. The aim is to drop the less effective treatments quickly so the trial can focus on determining which of the better treatments is best. We’ll …
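Thompson sampling is one common way to implement this kind of adaptive randomization; the simulation below is a sketch of that idea with made-up response rates, not the specific design the post analyzes.

```python
import random

random.seed(0)
true_rates = [0.2, 0.5, 0.6]     # unknown in practice; assumed here
a = [1, 1, 1]                    # Beta(1, 1) prior for each arm
b = [1, 1, 1]
counts = [0, 0, 0]

for patient in range(3000):
    # Thompson sampling: draw a response rate from each arm's
    # posterior and assign the patient to the arm with the top draw
    draws = [random.betavariate(a[i], b[i]) for i in range(3)]
    arm = max(range(3), key=lambda i: draws[i])
    response = random.random() < true_rates[arm]
    a[arm] += response
    b[arm] += 1 - response
    counts[arm] += 1

print(counts)  # allocation shifts away from the weakest arm
```

The weakest arm is effectively dropped early, so most patients end up on the two better treatments while the trial sorts out which of them is best.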
The cold start problem
How do you operate a data-driven application before you have any data? This is known as the cold start problem. We faced this problem all the time when I designed clinical trials at MD Anderson Cancer Center. We used Bayesian methods to create adaptive clinical trial designs, such as clinical …