Suppose you want to estimate the number of patients who respond to some treatment in a large population of size N and what you have is data on a small sample of size n. The usual way of going about this calculates the proportion of responses in the small sample, …

"Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, Lego, and Rubber Ducks," by Will Kurt (No Starch Press, 2019), is an excellent introduction to subjects critical to all data scientists. Will Kurt, in fact, is a data scientist! I always advise my data science classes at …

Suppose you are designing an autonomous system that will gather data and adapt its behavior to that data. At first you face the so-called cold-start problem. You don’t have any data when you first turn the system on, and yet the system needs to do something before it has accumulated …

March 4, 2020, 11:10 a.m.

This article covers how to perform hyperparameter optimization using a sequential model-based optimization (SMBO) technique implemented in the HyperOpt Python package. There is a complementary Domino project available. Feature engineering and hyperparameter optimization are two important model-building steps. Over the years, I have debated with many colleagues as …

This post will look at adaptively randomized trial designs. In particular, we want to focus on multi-arm trials, i.e. trials of more than two treatments. The aim is to drop the less effective treatments quickly so the trial can focus on determining which of the better treatments is best. We’ll …

How do you operate a data-driven application before you have any data? This is known as the cold start problem. We faced this problem all the time when I designed clinical trials at MD Anderson Cancer Center. We used Bayesian methods to create adaptive clinical trial designs, such as clinical …