All the data you need.

Tag: Scraping

Extract Data from Website to Excel Automatically
To extract data from websites, you can take advantage of data extraction tools like Octoparse. These tools can pull data from websites automatically and save it in many formats such as Excel,…
What is Web Scraping and How Does It Work
Web scraping, also known as web harvesting or web data extraction, refers to collecting data from websites via the Hypertext Transfer Protocol (HTTP) or through web browsers. …
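As a rough illustration of the extraction step described above, here is a minimal Python sketch using only the standard library. It parses an HTML string and collects every link with its text. The names `LinkCollector`, `extract_links`, and `SAMPLE_HTML` are hypothetical, and the sample markup is invented for the example; a real scraper would first download the page over HTTP.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs from every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...> tag.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content (an assumption for this sketch). A real scraper
# would fetch it first, e.g. urllib.request.urlopen(url).read().decode("utf-8").
SAMPLE_HTML = """
<html><body>
  <a href="/post/1">First post</a>
  <a href="/post/2">Second <b>post</b></a>
</body></html>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, text)
```

Keeping the parsing separate from the download makes it easy to try the extraction logic on saved pages before sending any requests.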
Massive Open Online Courses Planning
Massive open online courses (MOOCs) have been growing tremendously in the past five years, meeting the demand to make learning more flexible and accessible. MOOCs differ from traditional online courses because they offer interaction and feedback to support the students throughout the course. Many institutions have jumped on the MOOCs …
Best of Ulta Men by Ratings and Reviews.
Background / Interest According to CNBC, the men's grooming products market is expected to hit $166 billion in 2022, and male skin care product sales have jumped 7% and are currently valued at $122 million. Ulta Beauty is one of the largest U.S. beauty retailers for cosmetics, fragrance, skin care …
Scraping Sephora: An Ingredients Analysis
In 2017, Senator Chuck Schumer (NY) began campaigning for the Food and Drug Administration (FDA) to remove 1,4-dioxane from consumer products, citing concerns over the levels of the known carcinogen in Long Island's water supply. 1,4-dioxane is never listed as an ingredient in a product; instead, it is often …
Portfolio Optimization
Introduction The main motivation behind this web scraping project is Modern Portfolio Theory (MPT), a quantitative framework applied to investment portfolios that optimizes the relationship between risk and reward. This financial theory was introduced by Harry Markowitz in the 1950s, and at the time, mathematics was severely underused …
Scraping IMDB for Insights
Introduction A daunting task in this day and age is discovering the best movies currently out there. What are people saying about these movies, and what is it that makes them good? This is a common question I find that I’m always asking …
Netflix Content Dynamics
The Project One of the primary motivations for the project is the ability to uncover "hidden" information that emerges when analyzing the actual month-by-month dynamics of titles coming to and leaving Netflix. While it is not a problem to find a list of titles currently on Netflix, there is, …
Fun facts on Foodnetwork recipes.
Of my many free-time activities, cooking is one of my favorites. It is relaxing because it really turns off the wheels in my head. I focus fully on the meal, with an engineer's attention to detail. Also, the result of the work is a lovely, satisfying meal, which I like …
Webscraping Epicurious: An Analysis of American Taste Preferences
Just What Do We Like? Living in New York City, I often walk by a restaurant I've seen hundreds of times only to notice that it's changed ownership and is now serving a completely different type of food. It makes sense, too -- according to a 2011 Business Insider study, …
Scraping Glassdoor Interview Reviews
Congratulations! You have been selected to interview for... It’s a situation that we’ve all experienced before. The excitement of securing an interview for a promising job opportunity along with the preparation that follows. In such situations, there are many resources that job candidates can utilize, but in the humble opinion …
Luxury Handbag Price Comparison
Coming from a luxury retail background, I was interested to see how the traditional, iconic luxury retailer Saks Fifth Avenue is competing against emerging e-commerce sites that also sell luxury products. So I picked Farfetch, a UK-based online luxury fashion retail platform, and compared their handbag categories. I scraped their …
Place for Your Pet: What the Data Reveals About Apartments That Allow Animals
Introduction As a dog owner, I notice it can be quite challenging to find an apartment where pets are welcome. Sites like Zillow and StreetEasy do not actually provide pet information for their apartment listings, indicating you have to ‘contact the broker’ in order to find the details. On the …
GitHub Scraper: A Tool for Examining the State of Machine Learning
GitHub is one of the most popular platforms for hosting version-controlled projects in use today, with over 100 million projects available to users. Because of this, it is one of the best sources for checking on the current state of computer science. My GitHub Analyzer application scrapes thousands of machine learning projects …
Plant-based products: popularity on the online grocery website
Project Summary The project's goal was to understand the big picture of the plant-based product market. My main interest came from the news headlines we frequently encounter these days that meat alternatives and other bio-engineered alternatives are on the rise (30% growth of sales from 2018) and that large fast-food chains such …
Scraping PTEN's US Land Rig Fleet
Introduction I decided to scrape and analyze rig data from Patterson-UTI Energy’s (NASDAQ: PTEN) website. PTEN is the second largest contract land driller in the US and one of the largest pressure pumping (e.g. fracking) service providers to Oil & Gas companies. The company is publicly traded with a current …
Scraping Multiple Webpages with For Loops (in bash) — Web Scraping Tutorial ep#2
This is the second episode of my web scraping tutorial series. In the first episode, I showed you how you can get and clean the data from one... The post Scraping Multiple Webpages with For Loops (in bash) — Web Scraping Tutorial ep#2 appeared first on Data36.
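The tutorial above works in bash; as a loose illustration of the same idea (looping over many pages), here is a minimal sketch in Python instead. The `?page=N` URL pattern, the `example.com` address, and the function names `page_urls`, `fetch`, and `scrape_pages` are all assumptions made for this example, not details from the tutorial.

```python
import time
from urllib.request import urlopen


def page_urls(base_url, first, last):
    """Build one URL per page number, assuming a '?page=N' query pattern."""
    return [f"{base_url}?page={n}" for n in range(first, last + 1)]


def fetch(url):
    """Download one page and return its body as text (assumes UTF-8)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def scrape_pages(urls, fetch_page=fetch, delay_seconds=1.0):
    """Fetch each URL in turn, pausing between requests to be polite."""
    pages = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay_seconds)  # no pause needed before the first request
        pages.append(fetch_page(url))
    return pages
```

A call such as `scrape_pages(page_urls("https://example.com/articles", 1, 3))` would request three pages in sequence. Passing a different `fetch_page` function makes it possible to try the loop without touching the network.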