All the data you need.

Tag: Data Pipeline

How to build a data extraction pipeline with Apache Airflow
Data extraction pipelines can be hard to build and manage, so it's a good idea to use a tool that helps with these tasks. Apache Airflow (https://airflow.apache.org/) is a popular...
Learn the Comcast Architecture for Enterprise Metadata and Security
Comcast will present a live session on its architecture for metadata and security at our upcoming Databricks AWS Cloud Data Lake DevDay. The event includes a hands-on lab with Databricks notebooks that integrate with Amazon Web Services (AWS) services such as AWS Glue and Amazon Redshift. Our partner Privacera will also …
Global Study Sponsored by Qlik Finds Strong Relationship Between Optimizing Data Pipelines and Business Value
Qlik® announced a global study showing that organizations that strategically invest in creating data-to-insights capabilities through modern data and analytics pipelines are seeing significant bottom-line impact. The global IDC survey, sponsored by Qlik, of 1,200 business leaders shows that companies with a higher ability to identify, gather, transform, and …
How to make data lakes reliable
Data professionals across industries recognize they must effectively harness data for their businesses to innovate and gain competitive advantage. High-quality, reliable data forms the backbone for all successful data endeavors, from reporting and analytics to machine learning. Delta Lake is an open-source storage layer that solves many concerns around …
Ascend Creates Automated and Intelligent Dataflows to Power Successful Digital Transformations
Ascend, provider of the Autonomous Dataflow Service, emerged from stealth with $19M in funding to de-risk big data projects and accelerate digital transformations. Ascend operates the only solution with which data engineering teams can quickly build, scale, and operate continuously optimized, Apache Spark-based pipelines. By combining declarative configurations and deep …
Loan Risk Analysis with XGBoost and Databricks Runtime for Machine Learning
For companies that make money from interest on loans held by their customers, it’s always about increasing the bottom line. Being able to assess the risk of loan applications can save a lender the cost of holding too many risky assets. It is the data scientist’s job to run …