
Tag: Model Management

Evaluating Generative Adversarial Networks (GANs)
This article provides concise insights into GANs to help data scientists and researchers assess whether to investigate GANs further. If you are interested in a tutorial as well as hands-on code examples within a Domino project, then consider attending the upcoming webinar, “Generative Adversarial Networks: A Distilled Tutorial”. Introduction With …
Data Drift Detection for Image Classifiers
This article covers how to detect data drift for models that ingest image data as their input in order to prevent their silent degradation in production. Run the example in a complementary Domino project. Introduction: preventing silent model degradation in production In the real world, data is recorded by different …
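The article's own detection method is not reproduced here, but as a minimal sketch of the underlying idea: drift can be flagged by comparing the distribution of a per-image summary statistic (mean pixel intensity is used below purely as an illustrative stand-in for a real feature) between a training batch and a production batch, e.g. with a two-sample Kolmogorov–Smirnov statistic. All names and thresholds are assumptions for illustration.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    values = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, values, side="right") / len(a)
    cdf_b = np.searchsorted(b, values, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

rng = np.random.default_rng(0)
train_stats = rng.normal(0.0, 1.0, 500)  # e.g. mean intensity per training image
prod_same   = rng.normal(0.0, 1.0, 500)  # production batch, same distribution
prod_drift  = rng.normal(1.0, 1.0, 500)  # production batch whose inputs shifted

print(ks_statistic(train_stats, prod_same))   # small gap: no evidence of drift
print(ks_statistic(train_stats, prod_drift))  # large gap: flag for review
```

In practice the statistic would be monitored over time and an alert raised when it exceeds a calibrated threshold; for a proper p-value, `scipy.stats.ks_2samp` implements the same test.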
Model Interpretability: The Conversation Continues
This Domino Data Science Field Note covers a proposed definition of interpretability and a distilled overview of the PDR framework. Insights are drawn from Bin Yu, W. James Murdoch, Chandan Singh, Karl Kumbier, and Reza Abbasi-Asl’s recent paper, “Definitions, methods, and applications in interpretable machine learning”. Introduction Model interpretability continues to …
On Being Model-driven: Metrics and Monitoring
This article covers a couple of key Machine Learning (ML) vital signs to consider when tracking ML models in production to ensure model reliability, consistency, and performance over time. Many thanks to Don Miner for collaborating with Domino on this article. For additional vital signs and insight beyond what …
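The article's specific vital signs are not detailed here, but one common example of the idea is accuracy over a rolling window of labeled production predictions, with an alert when it drops below a threshold. The class name, window size, and threshold below are illustrative assumptions, not the article's implementation.

```python
from collections import deque

class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent `window` labeled predictions."""

    def __init__(self, window=100, alert_below=0.8):
        self.results = deque(maxlen=window)  # stores True/False per prediction
        self.alert_below = alert_below

    def record(self, predicted, actual):
        self.results.append(predicted == actual)

    def accuracy(self):
        return sum(self.results) / len(self.results) if self.results else None

    def needs_attention(self):
        acc = self.accuracy()
        return acc is not None and acc < self.alert_below

monitor = RollingAccuracyMonitor(window=4, alert_below=0.75)
for pred, actual in [(1, 1), (0, 0), (1, 0), (1, 0)]:
    monitor.record(pred, actual)
print(monitor.accuracy())         # 0.5
print(monitor.needs_attention())  # True
```

A bounded `deque` keeps memory constant regardless of traffic, and the same pattern extends to other vital signs (latency, prediction-distribution shift) by swapping the recorded quantity.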
Understanding Causal Inference
This article covers causal relationships and includes a chapter excerpt from the book Machine Learning in Production: Developing and Optimizing Data Science Workflows and Applications by Andrew Kelleher and Adam Kelleher. A complementary Domino project is available. Introduction As data science work is experimental and probabilistic in nature, data scientists …
Towards Predictive Accuracy: Tuning Hyperparameters and Pipelines
This article provides an excerpt of “Tuning Hyperparameters and Pipelines” from the book, Machine Learning with Python for Everyone by Mark E. Fenner. The excerpt and complementary Domino project evaluate hyperparameter tuning methods, including grid search and randomized search, and build an automated ML workflow. Introduction Data scientists, machine learning (ML) researchers, …
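As a minimal sketch of the two tuning strategies the excerpt covers (this is not the book's code; the dataset, model, and search spaces are illustrative assumptions): grid search exhaustively tries every combination in a fixed grid, while randomized search samples a fixed budget of candidates from distributions.

```python
# Illustrative comparison of grid search vs. randomized search with scikit-learn.
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Grid search: evaluates all 3 x 2 = 6 combinations with 5-fold cross-validation.
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", "auto"]}, cv=5)
grid.fit(X, y)

# Randomized search: draws 10 candidate values of C from a log-uniform range.
rand = RandomizedSearchCV(SVC(), {"C": loguniform(1e-2, 1e2)}, n_iter=10,
                          cv=5, random_state=0)
rand.fit(X, y)

print(grid.best_params_, round(grid.best_score_, 3))
print(rand.best_params_, round(rand.best_score_, 3))
```

Randomized search tends to win when only a few hyperparameters matter, since its budget is spent sampling many distinct values of each rather than repeating a coarse grid.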
Data Ethics: Contesting Truth and Rearranging Power
This Domino Data Science Field Note covers Chris Wiggins’s recent data ethics seminar at Berkeley. The article focuses on 1) proposed frameworks for defining and designing for ethics and for understanding the forces that encourage industry to operationalize ethics, as well as 2) proposed ethical principles for data scientists to …
Announcing the MLflow 1.1 Release
We’re excited to announce today the release of MLflow 1.1. In this release, we’ve focused on fleshing out the tracking component of MLflow and improving visualization components in the UI. Some of the major features include: Automatic logging from TensorFlow and Keras Parallel coordinate plots in the tracking UI Pandas …
Seeking Reproducibility within Social Science: Search and Discovery
Julia Lane, NYU Professor, Economist and cofounder of the Coleridge Initiative, presented “Where’s the Data: A New Approach to Social Science Search & Discovery” at Rev. Lane described the approach that the Coleridge Initiative is taking to address the science reproducibility challenge. The approach is to provide remote access for …
Data Science at The New York Times
Chris Wiggins, Chief Data Scientist at The New York Times, presented “Data Science at the New York Times” at Rev. Wiggins advocated that data scientists find problems that impact the business; re-frame the problem as a machine learning (ML) task; execute on the ML task; and communicate the results back …
What’s new with MLflow? On-Demand Webinar and FAQs now available!
On June 6th, our team hosted a live webinar—Managing the Complete Machine Learning Lifecycle: What’s new with MLflow—with Clemens Mewald, Director of Product Management at Databricks. Machine learning development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple …
Announcing Trial and Domino 3.5: Control Center for Data Science Leaders
Even the most sophisticated data science organizations struggle to keep track of their data science projects. Data science leaders want to know, at any given moment, not just how many data science projects are in flight but what the latest updates and roadblocks are when it comes to model development …
Product Management for AI
Pete Skomoroch presented “Product Management for AI” at Rev. This post provides a distilled summary, video, and full transcript. Session Summary Pete Skomoroch’s “Product Management for AI” session at Rev provided a “crash course” on what product managers and leaders need to know about shipping machine learning (ML) projects and …
Themes and Conferences per Pacoid, Episode 10
Co-chair Paco Nathan provides highlights of Rev 2, a data science leaders summit. Introduction Welcome back to our monthly burst of themespotting and conference summaries. We held Rev 2 May 23-24 in NYC, as the place where “data science leaders and their teams come to learn from each other.” The …
Machine Learning Product Management: Lessons Learned
This Domino Data Science Field Note covers Pete Skomoroch’s recent Strata London talk. It focuses on his ML product management insights and lessons learned. If you are interested in hearing more practical insights on ML or AI product management, then consider attending Pete’s upcoming session at Rev. Machine Learning Projects …
Announcing Domino 3.4: Furthering Collaboration with Activity Feed
Our last release, Domino 3.3, saw the addition of two major capabilities: Datasets and Experiment Manager. “Datasets”, a high-performance, revisioned data store, offers data scientists the flexibility they need to make use of large data resources when developing models. And “Experiment Manager” acts as a data scientist’s “modern lab notebook” …
Themes and Conferences per Pacoid, Episode 9
Paco Nathan’s latest article features several emerging threads adjacent to model interpretability. Introduction Welcome back to our monthly burst of themes and conferences. Several technology conferences all occurred within four fun-filled weeks: Strata SF, Google Next, CMU Summit on US-China Innovation, AI NY, and Strata UK, plus some other events. …
Addressing Irreproducibility in the Wild
This Domino Data Science Field Note provides highlights and excerpted slides from Chloe Mawer’s “The Ingredients of a Reproducible Machine Learning Model” talk at a recent WiMLDS meetup. Mawer is a Principal Data Scientist at Lineage Logistics as well as an Adjunct Lecturer at Northwestern University. Special thanks to Mawer …