[MINI] Data Provenance
Data Skeptic, 9 Jan 2015

This episode introduces a high-level discussion of Data Provenance, with more MINI episodes to follow that get into specific topics. Thanks to listener Sara L, who wrote in to point out that the Data Skeptic Podcast has focused a lot on using data to be skeptical, but not necessarily on being skeptical of data.

Data Provenance is the concept of knowing the full origin of your dataset. Where did it come from? Who collected it? How was it collected? Does it combine independent sources or come from a single source? What are the error bounds on the way it was measured? These are just some of the questions one should ask to understand a dataset. After all, if the antecedent of an argument is built on dubious grounds, the consequent of the argument is equally dubious.
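As a rough illustration of these questions, here is a minimal Python sketch of a provenance record kept alongside a dataset. The class name `ProvenanceRecord`, its fields, and the example values are all hypothetical, chosen only for illustration; they do not come from the episode or from any library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProvenanceRecord:
    """Hypothetical record answering basic provenance questions for one dataset."""
    source: str                 # Where did it come from?
    collected_by: str           # Who collected it?
    collection_method: str      # How was it collected?
    # Independent upstream sources that were combined (empty = single source).
    upstream_sources: List[str] = field(default_factory=list)
    # Known error bounds of the measurement, as free text; None = unknown.
    measurement_error: Optional[str] = None

    def open_questions(self) -> List[str]:
        """Return the names of provenance fields that are still unanswered."""
        missing = [
            name
            for name in ("source", "collected_by", "collection_method")
            if not getattr(self, name).strip()
        ]
        if self.measurement_error is None:
            missing.append("measurement_error")
        return missing


# Example with made-up values: the collector and the error bounds are unknown.
record = ProvenanceRecord(
    source="sensor network export",
    collected_by="",
    collection_method="hourly automated readings",
)
print(record.open_questions())  # ['collected_by', 'measurement_error']
```

A record like this does not make the data trustworthy by itself; it only makes explicit which of the questions above remain unanswered.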

For a more technical discussion than what we get into in this mini episode, I recommend "A Survey of Data Provenance Techniques" by Simmhan, Plale, and Gannon.

Episodes (588)

Long Term Time Series Forecasting

Alex Mallen, Computer Science student at the University of Washington, and Henning Lange, a Postdoctoral Scholar in Applied Math at the University of Washington, join us today to share their work "Deep Probabilistic Koopman: Long-term Time-Series Forecasting Under Periodic Uncertainties."

25 Oct 2021, 37 min

Fast and Frugal Time Series Forecasting

Fotios Petropoulos, Professor of Management Science at the University of Bath in the U.K., joins us today to talk about his work "Fast and Frugal Time Series Forecasting."

17 Oct 2021, 37 min

Causal Inference in Educational Systems

Manie Tadayon, a PhD graduate from the ECE department at University of California, Los Angeles, joins us today to talk about his work "Comparative Analysis of the Hidden Markov Model and LSTM: A Simulative Approach."

11 Oct 2021, 41 min

Boosted Embeddings for Time Series

Sankeerth Rao Karingula, ML Researcher at Palo Alto Networks, joins us today to talk about his work "Boosted Embeddings for Time Series Forecasting."

Works Mentioned
"Boosted Embeddings for Time Series Forecasting" by Sankeerth Rao Karingula, Nandini Ramanan, Rasool Tahmasbi, Mehrnaz Amjadi, Deokwoo Jung, Ricky Si, Charanraj Thimmisetty, Luisa Polania Cabrera, Marjorie Sayer, Claudionor Nunes Coelho Jr

https://www.linkedin.com/in/sankeerthrao/
https://twitter.com/sankeerthrao3
https://lod2021.icas.cc/

4 Oct 2021, 28 min

Change Point Detection in Continuous Integration Systems

David Daly, Performance Engineer at MongoDB, joins us today to discuss "The Use of Change Point Detection to Identify Software Performance Regressions in a Continuous Integration System."

Works Mentioned
"The Use of Change Point Detection to Identify Software Performance Regressions in a Continuous Integration System" by David Daly, William Brown, Henrik Ingo, Jim O'Leary, David Bradford

Social Media
David's Website
David's Twitter
MongoDB

27 Sep 2021, 33 min

Applying k-Nearest Neighbors to Time Series

Samya Tajmouati, a PhD student in Data Science at the University of Science of Kenitra, Morocco, joins us today to discuss her work "Applying K-Nearest Neighbors to Time Series Forecasting: Two New Approaches."

20 Sep 2021, 24 min

Ultra Long Time Series

Dr. Feng Li (@f3ngli) is an Associate Professor of Statistics in the School of Statistics and Mathematics at Central University of Finance and Economics in Beijing, China. He joins us today to discuss his work "Distributed ARIMA Models for Ultra-long Time Series."

13 Sep 2021, 28 min

MiniRocket

Angus Dempster, PhD Student at Monash University in Australia, comes on today to talk about "MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification." MINIROCKET reformulates ROCKET, gaining a 75x speedup on larger datasets with essentially the same performance. In this episode, we talk about the insights that made this speedup possible, as well as use cases.

6 Sep 2021, 25 min
