All the data you need.

Tag: Pandas

Building an Automated Data Pipeline for Retail Trade Survey Data
The US Census Bureau has periodically tracked retail and food services sales since 1992. Government agencies and many other organizations use this data to predict industry trends, estimate market share, and support a wide range of other applications. However, the data is not formatted in a way …
Building an Automated Data Pipeline for Retail Trade Survey Data
Please see my GitHub repo for details on the code and pipeline setup. 1. Introduction The objective of this project is to build an ETL pipeline that supports analysis of the Monthly Retail Trade Survey (MRTS) data. The data is stored in an Excel file by year …
Predicting Customer Churn at Telco
Background Telco is a hypothetical telecommunications company. As with any company that offers contract- or subscription-based services, customers canceling service is an ongoing problem. The services they provide include phone service, internet service, online security, online backup, streaming TV, and streaming movies. They have asked me to build a model to predict …
The Data Behind EV Driving
Debunking Common Electric Vehicle Myths with a Self-Collected Dataset Background This past February, my wife and I purchased a fully electric 2019 Chevy Bolt with an EPA-estimated range of 238 mi. None of our friends and family had an electric vehicle, so before our …
Veg Stars Up For Grabs: Perspectives from Yelp
As someone interested in the future of sustainable food and the carbon footprint of our food system, I realized that the Yelp data set hosted by Kaggle would be a great chance to explore food choices across consumers and restaurants. Given that plant-based diets can cut one’s carbon footprint by …
Blind Dating Ensemble Classifier
Introduction Dating is hard, especially if one is bad at first impressions. What's even harder is that, according to Fishman et al. 2006, men and women look for different things. According to the study, men primarily prefer women based on attractiveness and dislike women who have more intelligence or ambition …
#01 | Getting Started with Pandas
A clear introduction to Pandas, a Python library to manipulate tabular data, where you can discover its many possibilities and get a concise overview.
#01 | Machine Learning with the Linear Regression
Dive into the essence of Machine Learning by developing several Regression models with a practical use case in Python to predict accidents in the USA.
Finding the Best Liquor Store Location in Iowa
Background and Objective According to Mintel, between 2016 and 2021, spirit sales have outpaced those of beer and wine. Because of this and the growth in alcohol consumption in general, there is an opportunity to open a spirits store. A prospective client is looking to tap an untapped niche by opening …
Tutorial: Plotting Data with Pandas
Pandas is a data analysis tool that also offers great options for data visualization. Here’s how to get started plotting in Pandas. Data visualization is an essential step in making data science projects successful — an effective plot tells a thousand words. Data visualization is a powerful way to capture …
NYC Open Restaurants and the Rise of 311 Complaints
Background Within a few weeks of COVID hitting New York, the city shut down indoor dining. This had a devastating impact on the restaurant industry, resulting in over 1,000 NYC restaurants permanently closing in 2020. In an effort to save the restaurant industry, NYC launched the Open Restaurants Program in …
NYC Open Restaurant Program
Background Within a few weeks of COVID hitting New York, the city shut down indoor dining. This had a devastating impact on the restaurant industry, resulting in over 1,000 NYC restaurants permanently closing in 2020. In an effort to save the restaurant industry, NYC launched the Open Restaurants Program in …
New York City: Booms and Blooms
🗽 New York City neighborhoods bloomed through five construction booms. How did each neighborhood develop, and what is the new trend in NYC construction? The post New York City: Booms and Blooms first appeared on Data Science Blog.
Gender Inequality in the Movie Industry
Introduction Over the past 100 years, women's participation in the workforce has drastically changed. In 1920, women made up only 20% of the labor force, but changes in laws and women's rights led to an increase in women not only joining but also staying in the workforce. This increase peaked …
Coding a Decision Tree in Python Using Scikit-learn, Part #2: Classification Trees and Gini Impurity
You can’t get enough of decision trees, can you? 😉 If coding regression trees is already at your fingertips, then you should definitely learn how to code classification... The post Coding a Decision Tree in Python Using Scikit-learn, Part #2: Classification Trees and Gini Impurity appeared first on Data36.