All the data you need.

Tag: Open Source

What it means to be customer obsessed
One of our values at Databricks is to be customer obsessed. We deeply care about the impact and success of our customers, and are proud to be recognized by Gartner for focusing on this. A key part of that is how we strategize on making the world better through the …
Introducing Glow: an open-source toolkit for large-scale genomic analysis
The key to solving some of today’s most challenging medical problems lies in the analysis of genomics data. Understanding the impact of the minor changes in an individual’s genome on their overall health is fundamentally a data driven challenge that requires integration across hundreds of thousands of individuals. By analyzing …
Delta Lake Now Hosted by the Linux Foundation to Become the Open Standard for Data Lakes
At today’s Spark + AI Summit Europe in Amsterdam, we announced that Delta Lake is becoming a Linux Foundation project. Together with the community, the project aims to establish an open standard for managing large amounts of data in data lakes. The Apache 2.0 software license remains unchanged. Delta Lake …
Announcing the MLflow 1.1 Release
We’re excited to announce today the release of MLflow 1.1. In this release, we’ve focused on fleshing out the tracking component of MLflow and improving visualization components in the UI. Some of the major features include: Automatic logging from TensorFlow and Keras; Parallel coordinate plots in the tracking UI; Pandas …
What’s new with MLflow? On-Demand Webinar and FAQs now available!
On June 6th, our team hosted a live webinar—Managing the Complete Machine Learning Lifecycle: What’s new with MLflow—with Clemens Mewald, Director of Product Management at Databricks. Machine learning development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple …
Attending the Instructional Design Summit at the 2019 Open edX Conference
The unofficial first day of the 2019 Open edX Conference included one of the best seminars/workshops for content creators: The Instructional Design Summit.
Video: New Survey Looks at What’s Driving Companies to the Cloud
In this video from KubeCon 2018 in Seattle, Abby Kearns from the Cloud Foundry Foundation looks at the results of a recent survey on key factors driving the enterprise to the Cloud. “We’re seeing a virtuous cycle, as comfortability with one technology results in lightning-speed adoption of more advanced technologies. …
Themes and Conferences per Pacoid, Episode 3
Paco Nathan’s column covers themes that include open source, “intelligence is a team sport”, and “implications of massive latent hardware”. Welcome to our monthly series about data science! Themes to consider here: “Open Source wins; Learning is not enough”, “Intelligence is a team sport”, “Implications of massive latent hardware” …
Combining the Benefits of Commercial & Open Analytics
A new e-book explores how organizations in many industries are using open source analytics and SAS, getting the most from both, and what role SAS plays throughout the analytics life cycle.
A Certification for R Package Quality
There are more than 12,000 packages for R available on CRAN, and many others available on GitHub and elsewhere. But how can you be sure that a given R package follows best development practices for high-quality, secure software? Based on a recent survey of R users related to challenges in …
On the Importance of Community-Led Open Source
Wes McKinney, Director of Ursa Labs and creator of the pandas project, presented the keynote, “Advancing Data Science Through Open Source” at Rev. McKinney’s keynote covered open source’s symbiotic relationship with data science and the importance of community-led open source. This blog post includes distilled highlights, the full video, and transcript …
A Journey Through Spark
Extending SparkSQL. Bastian Haase is an alum from the Insight Data Engineering program in Silicon Valley, now working as a Program Director at Insight Data Science for the Data Engineering and the DevOps Engineering programs. In this blog post, he shares his experiences on how to get started working on open …
Introducing MLflow: an Open Source Machine Learning Platform
View the MLflow Spark+AI Summit keynote. Everyone who has tried to do machine learning development knows that it is complex. Beyond the usual concerns in software development, machine learning (ML) development comes with multiple new challenges. At Databricks, we work with hundreds of companies using ML, and we have …