DataRec Library for Reproducible in Recommend Systems
Data Skeptic, 13 Nov

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Mario Mancino, a postdoc researcher from Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research—from version control issues to preprocessing inconsistencies—and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets like MovieLens, Last.fm, and Amazon reviews.

The conversation covers Alberto's research journey through knowledge graphs, graph-based recommenders, privacy considerations, and recommendation novelty. He explains why small modifications in datasets can significantly impact research outcomes, the importance of offline evaluation, and DataRec's vision as a lightweight library that integrates with existing frameworks rather than replacing them. Whether you're benchmarking new algorithms or exploring recommendation techniques, this episode offers practical insights into one of the most critical yet overlooked aspects of reproducible ML research.

Episodes (589)

[MINI] Selection Bias

A discussion about conducting US presidential election polls helps frame a conversation about selection bias.

3 Oct 2014, 14 min

[MINI] Confidence Intervals

Commute times and BBQ invites help frame a discussion about the statistical concept of confidence intervals.

26 Sep 2014, 11 min

[MINI] Value of Information

A discussion about getting ready in the morning, negotiating a used car purchase, and selecting the best Airbnb place to stay helps frame a conversation about the decision-theoretic principle known as the Value of Information equation.

19 Sep 2014, 14 min

Game Science Dice with Louis Zocchi

In this bonus episode, guest Louis Zocchi discusses his background in the gaming industry, specifically how he became a manufacturer of dice designed to produce statistically uniform outcomes.

During the show, Louis mentioned a two-part video listeners might enjoy: part 1 and part 2 can both be found on YouTube. Kyle mentioned a robot capable of unnoticeably cheating at Rock Paper Scissors / Ro Sham Bo. More details can be found here. Louis mentioned dice collector Kevin Cook, whose website is DiceCollector.com.

While we're on the subject of tabletop role-playing games, Kyle recommends these two related podcasts listeners might enjoy: The Conspiracy Skeptic podcast (on which host Kyle was recently a guest) had a great episode, "Dungeons and Dragons - The Devil's Game?", which explores claims of D&D's alleged ties to skepticism. Also, Kyle swears there's a great Monster Talk episode discussing claims of a satanic connection to Dungeons and Dragons, but despite mild efforts to locate it, he came up empty. Regardless, listeners of the Data Skeptic Podcast are encouraged to explore the back catalog to try to find the aforementioned episode of this great podcast.

Last but not least, as mentioned in the outro, awesomedice.com did some great independent empirical testing that confirms Game Science dice are much closer to the desired uniform distribution over possible outcomes when compared to one leading manufacturer.

17 Sep 2014, 47 min

Data Science at ZestFinance with Marick Sinay

Marick Sinay from ZestFinance is our guest this week. This episode explores how data science techniques are applied in the financial world, specifically in assessing creditworthiness.

12 Sep 2014, 31 min

[MINI] Decision Tree Learning

Linhda and Kyle talk about Decision Tree Learning in this mini-episode. Decision Tree Learning is the algorithmic process of trying to generate an optimal decision tree to properly classify or forecast some future unlabeled element by following each step in the tree.

5 Sep 2014, 13 min

Jackson Pollock Authentication Analysis with Kate Jones-Smith

Our guest this week is Hamilton physics professor Kate Jones-Smith, who joins us to discuss the evidence for the claim that drip paintings of Jackson Pollock contain fractal patterns. This hypothesis originates in a paper by Taylor, Micolich, and Jonas titled Fractal analysis of Pollock's drip paintings, which appeared in Nature.

Kate and co-author Harsh Mathur wrote a paper titled Revisiting Pollock's Drip Paintings, which also appeared in Nature. A full text PDF can be found here, but lacks the helpful figures which can be found here, although two images are blurred behind a paywall. Their paper was covered in the New York Times as well as in USA Today (albeit with a much more delightful headline: Never mind the Pollock's [sic]).

While discussing the intersection of science and art, the conversation also touched briefly on a few other interesting topics. For example, Penrose Tiles appearing in Islamic art (pre-dating Roger Penrose's investigation of the interesting properties of these tiling processes), Quasicrystal designs in art, automated brushstroke analysis of the works of Vincent van Gogh, and attempts to authenticate a possible work of Leonardo Da Vinci of uncertain provenance. Last but not least, the conversation touches on the particularly compelling Hockney-Falco Thesis, which is also covered in David Hockney's book Secret Knowledge.

For those interested in reading some of Kate's other publications, many Katherine Jones-Smith articles can be found at the given link, all of which have downloadable PDFs.

29 Aug 2014, 49 min

[MINI] Noise!!

Our topic for this week is "noise" as in signal vs. noise. This is not a signal processing discussion, but rather a brief introduction to how the word noise is used to describe how much information in a dataset is useless (as opposed to useful). Also, Kyle announces having recently had the pleasure of appearing as a guest on The Conspiracy Skeptic Podcast to discuss The Bible Code. Please check out this other fine program for this and its many other great episodes.

22 Aug 2014, 16 min
