Federal budget scaled to per person dollars
For The Upshot, Alicia Parlapiano and Quoctrung Bui scaled down the federal budget…
K-Nearest Neighbors explained
In this post, I explain the intuition and logic behind the KNN algorithm and show a simple implementation written in pure pandas that yields 98% accuracy on the Iris dataset.
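For readers who want a feel for the idea before opening the post, here is a minimal sketch of a k-nearest-neighbors classifier using only pandas. It is not the post's implementation: the function name `knn_predict`, the toy data, and the choice of Euclidean distance with a majority vote are all assumptions made for illustration, and it does not reproduce the accuracy figure quoted above.

```python
import pandas as pd


def knn_predict(train, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows.

    Hypothetical helper for illustration only; assumes numeric feature
    columns and Euclidean distance.
    """
    # Distance from the query point to every training row
    distances = ((train - query) ** 2).sum(axis=1) ** 0.5
    # Row labels of the k closest training points
    nearest = distances.nsmallest(k).index
    # Most common class among those neighbors
    return labels.loc[nearest].mode().iloc[0]


# Toy data (made up): two well-separated clusters
train = pd.DataFrame({
    "x": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "y": [1.0, 0.8, 1.1, 5.0, 4.9, 5.2],
})
labels = pd.Series(["a", "a", "a", "b", "b", "b"])

near_a = pd.Series({"x": 1.1, "y": 1.0})
near_b = pd.Series({"x": 5.0, "y": 5.0})
prediction_a = knn_predict(train, labels, near_a)
prediction_b = knn_predict(train, labels, near_b)
```

With real data, features are usually scaled first so that no single column dominates the distance.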
Billionaire’s spending scaled to your net worth
We hear about billionaires spending millions of dollars on ads, acquisitions, etc. It…
Data problems in Iowa caucus results
It wasn't just issues with an app. There appear to be many more…
Privacy algorithm could lead to Census undercount of small towns
To increase anonymity in the Census records, the bureau is testing an algorithm…
Effect of College Selection on ROI
Description: The current project applied web scraping techniques to investigate the effect of degree type, college type, college major, and regionality on early- and mid-career earnings and estimated 20-year return on investment of higher education. Background: College tuition is at an all-time high, and a barrier to entry for many …
What is a Chi-Square Test and Why Do We Use It?
This post will give you insight into how Chi-Square tests work and when to use them.
To get your personal data, provide more personal data
File another one under the sounds-good-on-paper-but-really-challenging-in-practice. Kashmir Hill, for The New York Times,…
How police use facial recognition
For The New York Times, Jennifer Valentino-DeVries looked at the current state of…
New Statistics Course: Conditional Probability in R
Learn the fundamentals of conditional probability in R with this interactive statistics course. Master Naive Bayes and learn to build a spam filter with R!
Squirrel census count in Central Park
In 2018, there was a squirrel census count at Central Park in New…
Best of 2019: 60 Years of Progress in AI
[January 8, 2019] Today is the first day of CES 2019 and artificial intelligence (AI) “will pervade the show,” says Gary Shapiro, chief executive of the Consumer Technology Association. One hundred and thirty years ago today (January 8, 1889), Herman …
Sufficient statistic paradox
A sufficient statistic summarizes a set of data. If the data come from a known distribution with an unknown parameter, then the sufficient statistic carries as much information as the full data [0]. For example, given a set of n coin flips, the number of heads is a sufficient statistic. …
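The coin-flip example can be checked numerically. The sketch below (function names and the sample sequences are assumptions for illustration) computes the probability of an exact flip sequence one flip at a time and compares it with a formula that uses only the number of heads. Two different sequences with the same head count have the same likelihood for every value of the unknown parameter p, which is what makes the head count sufficient.

```python
import math


def sequence_likelihood(flips, p):
    """Probability of this exact sequence (1 = heads, 0 = tails) when P(heads) = p."""
    prob = 1.0
    for flip in flips:
        prob *= p if flip == 1 else (1.0 - p)
    return prob


def likelihood_from_count(n, heads, p):
    """Same probability, computed from the head count alone."""
    return p ** heads * (1.0 - p) ** (n - heads)


# Two hypothetical sequences with the same number of heads (3 of 5)
seq_a = [1, 1, 0, 0, 1]
seq_b = [0, 1, 1, 1, 0]

for p in (0.2, 0.5, 0.9):
    la = sequence_likelihood(seq_a, p)
    lb = sequence_likelihood(seq_b, p)
    lc = likelihood_from_count(len(seq_a), sum(seq_a), p)
    # The order of the flips does not matter, only how many were heads
    assert math.isclose(la, lb) and math.isclose(la, lc)
```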
Analysis of online sermons
Pew Research Center analyzed online sermons in U.S. searches, taking a closer look…
Arithmetic, Geometric, and Harmonic Means for Machine Learning
Calculating the average of a variable or a list of numbers is a common operation in machine learning. It is an operation you may use every day either directly, such as when summarizing data, or indirectly, such as a smaller step in a larger procedure when fitting a model. The …
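As a quick illustration of the three means named in the title, here is a minimal sketch using Python's standard `statistics` module (Python 3.8 or later for `geometric_mean`). The sample values are made up for the example and are not taken from the post.

```python
import statistics

# Hypothetical positive values; the geometric and harmonic means need positive inputs
values = [2.0, 4.0, 8.0]

arithmetic = statistics.mean(values)            # sum of values / count
geometric = statistics.geometric_mean(values)   # n-th root of the product
harmonic = statistics.harmonic_mean(values)     # count / sum of reciprocals

print(f"arithmetic={arithmetic:.3f} geometric={geometric:.3f} harmonic={harmonic:.3f}")
```

For positive values that are not all equal, the harmonic mean is the smallest of the three and the arithmetic mean is the largest.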
AI-generated pies
Janelle Shane applied her know-how with artificial intelligence to generate new types of… Tags: artificial intelligence, Janelle Shane, pie
Looking for similar NBA games, based on win probability time series
Inpredictable, a sports analytics site by Michael Beuoy, tracks win probabilities of NBA… Tags: basketball, probability
Translating Between Computer Science and Statistics
Terence Parr: “I am a computer scientist retooling as a machine learning droid and have found the nomenclature used by statisticians to be peculiar to say the least, so I thought I’d put this document together. It’s meant as good-natured …