Tag: Scraping

Data Science Hobby Project: Web Scraping Book Ratings With BeautifulSoup
Tamas Ujhelyi was one of the first participants in my 6-week data science course (the Junior Data Scientist’s First Month). After finishing the course, he started a cool...
Finding the Best Gluten Free Metro Areas and Restaurants with Scrapy
Python Scrapy code and Jupyter notebook for visualizations: https://github.com/datatodavid/GF_Restaurant_Scraper Why I Scraped FindMeGlutenFree.com: FindMeGlutenFree.com is a go-to website in the gluten-free / celiac community. The platform works as a Yelp-like, user-based restaurant search engine, but one meant only for restaurants with gluten-free offerings. This focus makes it …
Mining Parler data
Just before the social network Parler went down, a researcher who goes by… Tags: Capitol, metadata, Motherboard, Parler, scraping
Crawler with Selenium
Introduction: The crawler is composed of several components that make the unstructured data accessible for cleaning. As the data we are looking to scrape here is financial in nature, we draw on several webpages to assemble the data and give it structure. First, the structure is given with two feats, …
AI in Drug Discovery Trends
Having worked the past 3 years as a data analyst for a life science startup accelerator, I was interested in using Python data visualization tools to discover trends in healthcare startups. I was able to find data in two BenSci blogs that focused specifically on startups working with AI in …
Detecting Bias in Music Reviews at pitchfork.com
Introduction: With over a quarter of a million unique readers per day, the music review website Pitchfork is one of the most influential online publications in independent music. Several well-known artists, including Sufjan Stevens and Arcade Fire, have experienced surges in popularity and sales following positive album reviews on Pitchfork. …
AutoScraper and Flask: Create an API From Any Website in Less Than 5 Minutes And with Fewer Than 20 Lines of Python
In this tutorial, we are going to create our own e-commerce search API with support for both eBay and Etsy without using any external APIs. With the power of AutoScraper...
Fish Picks: An Exploration of Seafood Watch Data
This project used Selenium to scrape data from the Monterey Bay Aquarium website www.seafoodwatch.org. This organization collects a range of data on seafood production techniques and fisheries to generate scores for each seafood type. These scores serve to indicate the environmental impact and sustainability of each seafood type. The goal …
Which Streaming Service Should I Subscribe to?
Motivation: As a dedicated Netflix user for the past several years, I often find myself scrolling over and over trying to find a TV series to watch. At some point, I came to a realization that maybe it is about …
Industry Trends in NYC Start-ups
In this project, information was scraped from the website Builtinnyc.com, a job search site for NYC-based start-ups and tech companies. The site contains almost 3,500 unique company listings. Because new industry trends and technologies are often first tested out in the start-up community, the information should be useful for …
Data Analysis of Newegg Computer Monitors
I love computers. I have no problem sitting at my computer all day long working on various projects with R and Python. I love it. I like experimenting and learning new things. Since I use my computer a lot, I knew I would have to have a good computer, so …
Yelp Reviews: BBQ Analysis
The data collected is from the ten most populated cities in the US with a focus on BBQ restaurants sorted by number of reviews. Data for the top 100 restaurants was collected for each city. Yelp data was collected using Scrapy. The purpose of this analysis is to determine which …
Discount Strategy Analysis on Saks.com and Saks Off5th.com
In the late 1980s, the off-price store concept started to gain popularity in the retail industry. Off-price (OP) retailers sell merchandise at discounted prices, as their products usually come from the past-season or overstocked inventory of a regular department store or brand. The low …
Carvana vs. Vroom: The Two Largest Online Used Car Retailers
Background: The used car market is massive. It amounts to a TAM (total addressable market) of almost $1 trillion, with 40 million+ cars sold each year. Despite its size, the industry is notorious for its lack of transparency and poor customer service. Competition is fierce, as the largest player …
Introducing AutoScraper: A Smart, Fast and Lightweight Web Scraper For Python
Scraping the web just got a lot more automated
Web Scraping Product Details from Sunglass Hut and Woot!
Sunglasses product details were scraped from the Sunglass Hut and Woot! websites in order to perform an exploratory data analysis (EDA) and to compare the deals on Woot! to the retail prices on Sunglass Hut. The above word cloud was produced using the descriptions of the sunglasses on Sunglass Hut. …
Movie Recommender with TMDB
A classic problem people have is finding a good movie to watch without doing a lot of research. To overcome this problem, recommender systems based on Machine Learning or Deep Learning are used to find movies that users are most likely to enjoy. There are 2 main types of recommender …
Manhattan Food Trends
The restaurant industry is notoriously competitive, and New York City is surely one of the toughest places for this industry. As the effects of COVID-19 continue to batter all industries, restaurants are suffering immensely. The idea crossed my mind to try and take a snapshot of restaurants currently serving food …