Data Provenance and Reproducibility with Pachyderm
Data Skeptic3 Feb 2017

Data Provenance and Reproducibility with Pachyderm

Versioning isn't just for source code. Being able to track changes to data is critical for answering questions about data provenance, quality, and reproducibility. Daniel Whitenack joins me this week to talk about these concepts and share his work on Pachyderm. Pachyderm is an open source containerized data lake.

During the show, Daniel mentioned the Gopher Data Science github repo as a great resource for any data scientists interested in the Go language. Although we didn't mention it, Daniel also did an interesting analysis on the 2016 world chess championship that complements our recent episode on chess well. You can find that post here

Supplemental music is Lee Rosevere's Let's Start at the Beginning.

Thanks to Periscope Data for sponsoring this episode. More about them at periscopedata.com/skeptics

Denne episoden er hentet fra en åpen RSS-feed og er ikke publisert av Podme. Den kan derfor inneholde annonser.

Episoder(601)

Reducing the Impact of Ship Noise on Marine Mammals

Reducing the Impact of Ship Noise on Marine Mammals

Human shipping operations have increased significantly in the past few decades.  While that means international trade and cheap goods for humans, it also means the ocean has experienced an increase in...

1 Jul 202436min

Analysis of Unstructured Data

Analysis of Unstructured Data

Robbie Moon from the Georgia Tech Scheller College of Business joins us to discuss the analysis of unstructured data and the application of NLP methodologies towards financial data.

28 Jun 202427min

iNaturalist

iNaturalist

Have you ever participated in citizen science? Do you want to? One of the most popular platforms for crowdsourcing biodiversity data is iNaturalist. In addition to being a great science tool, the iNat...

24 Jun 202437min

Learn to Code

Learn to Code

Do you code or are you interested in learning to code? Join us today and hear from three individuals that are at very different stages of their coding journeys. Becky Hansis-O'Neill (also our co-host ...

18 Jun 202449min

Animal Computer Interaction

Animal Computer Interaction

You've heard of Human Computer Interaction (HCI), now get ready for Animal Computer Interaction (ACI). Ilyena has made a career developing computer interfaces for non-human animals. She has worked wit...

10 Jun 202442min

Ape Gestures

Ape Gestures

Cat observes great apes in the wild and in the lab to crack the code of their gestural communication. We discussed the challenges and benefits of studying apes in the wild vs in the lab. Cat also shar...

3 Jun 202449min

Evaluating AI Abilities

Evaluating AI Abilities

In this episode, Kozzy discusses his endeavors to compare the cognitive abilities of humans, animals, and AI programs. Specifically, we discussed object permanence, the ability to understand an object...

27 Mai 202449min

HMMs for Behavior

HMMs for Behavior

Théo Michelot has made a career out of tackling tough ecological questions using time-series data. How do scientists turn a series of GPS location observations over time into useful behavioral data? G...

20 Mai 202445min

Populært innen Vitenskap

fastlegen
tingenes-tilstand
jss
forskningno
rekommandert
rss-zahid-ali-hjelper-deg
rss-paradigmepodden
sinnsyn
vett-og-vitenskap-med-gaute-einevoll
rss-overskuddsliv
nordnorsk-historie
kvinnehelsepodden
tidlose-historier
villmarksliv
liberal-halvtime
rss-inn-til-kjernen-med-sunniva-rose
fjellsportpodden
grunnstoffene
nevropodden
rss-rekommandert