Julia Lane, NYU Professor, Economist and cofounder of the Coleridge Initiative, presented “Where’s the Data: A New Approach to Social Science Search & Discovery” at Rev. Lane described the approach that the Coleridge Initiative is taking to address the science reproducibility challenge. The approach is to provide remote access for …
This post provides a distilled overview of the rediscovery of 50,000 samples within the MNIST dataset. MNIST: The Potential Danger of Overfitting Recently, Chhavi Yadav (NYU) and Leon Bottou (Facebook AI Research and NYU) indicated in their paper, “Cold Case: The Lost MNIST Digits”, how they reconstructed the MNIST (Modified …
This Domino Data Science Field Note provides highlights and excerpted slides from Chloe Mawer’s “The Ingredients of a Reproducible Machine Learning Model” talk at a recent WiMLDS meetup. Mawer is a Principal Data Scientist at Lineage Logistics as well as an Adjunct Lecturer at Northwestern University. Special thanks to Mawer …
Key highlights from Clare Gollnick’s talk, “The limits of inference: what data scientists can learn from the reproducibility crisis in science”, are covered in this Domino Data Science Field Note. The full video is available for viewing here. Introduction Within Clare Gollnick’s Strata San Jose talk, “The limits of inference: …
Aug. 23, 2018, 11:38 p.m.
This Domino Data Science Field Note blog post provides highlights of Hadley Wickham’s ACM Chicago talk, “You Can’t Do Data Science in a GUI”. In his talk, Wickham argues that using code, unlike a GUI, provides reproducibility, data provenance, and the ability to track changes so that data scientists have …
Pete Warden is the Technical Lead on the TensorFlow Mobile Embedded Team at Google doing Deep Learning. He was formerly the CTO of Jetpac, which was acquired by Google. He is also an Apple alumnus and blogs at petewarden.com. This post candidly discusses some of the real-world reproducibility challenges …
Data scientist, author, and manager of data science teams Enda Ridge talks to us about data governance, data provenance, reproducible analysis, work pipelines and products, and people, among other topics covered in his book "Guerrilla Analytics - A Practical Approach to Working with Data: The Savvy Manager's Guide". Podcast Audio …
March 15, 2016, 5:47 a.m.