All the data you need.

Tag: Pandas

Coding a Decision Tree in Python Using Scikit-learn, Part #2: Classification Trees and Gini Impurity
You can’t get enough of decision trees, can you? 😉 If coding regression trees is already at your fingertips, then you should definitely learn how to code classification... The post Coding a Decision Tree in Python Using Scikit-learn, Part #2: Classification Trees and Gini Impurity appeared first on Data36.
Pandas in Python
Dataquest’s Philosophy: Building the Perfect Data Science Learning Tool
Learn how Dataquest's philosophy sets our platform apart from other data science learning tools, and what we've learned from years of teaching data science. The post Dataquest’s Philosophy: Building the Perfect Data Science Learning Tool appeared first on Dataquest.
A Dashboard Application for Data-driven House Flipping
Background and Business Objectives: U.S. house prices have skyrocketed in 2021, with April-July 2021 marking four consecutive months of record high year-on-year home value appreciation. In any competitive real estate market, commercial real estate developers need to be able to quickly identify which opportunities to pursue. This includes understanding: How …
Pandas API on Upcoming Apache Spark™ 3.2
We’re thrilled to announce the pandas API as part of the upcoming Apache Spark™ 3.2 release. pandas is a powerful, flexible library and has grown rapidly to become one of the standard data science libraries. Now pandas users can leverage the pandas API on their existing Spark clusters. A few …
Beautiful Soup Tutorial 4. – Saving Scraped Data to a CSV File, then Analyzing it with Pandas
This is the final part of the Beautiful Soup tutorial series. Just to remind you, here’s what you’ve done so far: in episode #1 you learnt the basics... The post Beautiful Soup Tutorial 4. – Saving Scraped Data to a CSV File, then Analyzing it with Pandas appeared first on …
Data Exploration with Pandas Profiler and D-Tale
In this blog post we cover the use of Pandas Profiler and D-Tale for Exploratory Data Analysis. The post Data Exploration with Pandas Profiler and D-Tale appeared first on Data Science Blog by Domino.
Navigating the Return to In-Person Dining in NYC: Analysis of Data Scraped from OpenTable
Background: The COVID-19 pandemic has had huge impacts on the economy of the U.S., and the restaurant industry has been among the hardest hit. To adapt to the pandemic, restaurants turned to technology. 2020 brought about contactless ordering on tablets, QR code menus, and an explosion in the usage of …
The History of Content: performance analysis from 1900 - 2010
This chart shows the performance of each letter over the years, with the length shown as the color dimension. The representation of shade as a continuous variable allows us to examine up to 26 different segments, and their performance, at any given time. The post The History of Content: performance …
Data Cleaning and Exploratory Data Analysis Using the OkCupid Dataset (Part 1)
This article is about dating and data science! Please welcome our guest author, Amy Birdee, who has done multiple data science hobby projects recently and built a truly... The post Data Cleaning and Exploratory Data Analysis Using the OkCupid Dataset (Part 1) appeared first on Data36.
Yunnan Sourcing Tea Storefront and Analysis of the High End Tea Market
Introduction: Where many online tea wholesalers curate particular, international selections of teas, Yunnan Sourcing distinguishes itself by highlighting local sources. Furthermore, what makes it a compelling target for analysis is its focus on "verified purchase reviews." We will begin our analysis by laying the …
How to supercharge data exploration with Pandas Profiling
Producing insights from raw data is a time-consuming process. Predictive modeling efforts rely on dataset profiles, whether consisting of summary statistics or descriptive charts. Pandas Profiling, an open-source tool built on Pandas DataFrames, can simplify and accelerate such tasks. This blog explores the challenges associated with doing …
Awesome functions in pandas and seaborn
Just a couple of handy functions to visualise and overview data
Python Autocomplete Improvements for Databricks Notebooks
At Databricks, we strive to provide a world-class development experience for data scientists and engineers, and new features are constantly getting added to our notebooks to improve our users’ productivity. We are especially excited about the latest of these features, a new autocomplete experience for Python notebooks (powered by the …
Retention-Driven Marketing for Music Apps
Github Repository | LinkedIn: Rob Davis, James Welch, Sita Thomas. Background: For this project we were tasked with designing a marketing strategy for KKBox, a streaming music service. We were given four datasets describing user demographics, transaction history, listening history, and churn rate. This project explores which users are the …
Beginner Python Tutorial: Analyze Your Personal Netflix Data
How much time have you spent watching The Office on Netflix? Find out with this entry-level tutorial on analyzing your own Netflix usage data! The post Beginner Python Tutorial: Analyze Your Personal Netflix Data appeared first on Dataquest.
Do You Post Too Much? Analyze Your Personal Facebook Data with Python
As of Q2 2020, Facebook claims more than 2.7 billion active users. That means that if you're reading this article, chances are you're a Facebook user. But just how much of a Facebook user are you? How much do you really post? We can find out using Python! Specifically, we're …
How and why I built Machine Learning model to predict tennis table matches results
I'm a data professional who loves building data products to solve problems. I'm currently working with professionals from various backgrounds to provide new analytical insights in industry. I'd love to combine this with my passion for open data and keep contributing to changing people's lives for the better in a more analytical world.