DataRec Library for Reproducibility in Recommender Systems
Data Skeptic · 13 Nov

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Maria Mancino, a postdoctoral researcher at Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research, from version control issues to preprocessing inconsistencies, and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets such as MovieLens, Last.fm, and Amazon reviews.
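Checksum verification, mentioned above, can be illustrated with a minimal sketch using only Python's standard library. This is not DataRec's API: the function names `sha256_of_file` and `verify_dataset` are hypothetical, and the choice of SHA-256 is an assumption made for illustration. The idea is simply that a downloaded dataset file is accepted only if its digest matches a published reference value.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large dataset files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(path: Path, expected_sha256: str) -> bool:
    """Return True if the file's digest matches the reference (case-insensitive)."""
    return sha256_of_file(path) == expected_sha256.lower()
```

A mismatch means the local copy differs from the reference file, for example because the download was truncated or the dataset was silently updated upstream.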

The conversation covers Alberto's research journey through knowledge graphs, graph-based recommenders, privacy considerations, and recommendation novelty. He explains why small modifications in datasets can significantly impact research outcomes, the importance of offline evaluation, and DataRec's vision as a lightweight library that integrates with existing frameworks rather than replacing them. Whether you're benchmarking new algorithms or exploring recommendation techniques, this episode offers practical insights into one of the most critical yet overlooked aspects of reproducible ML research.
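To make the point about small dataset modifications concrete, here is a minimal sketch comparing two common ways of dropping sparse users and items. It is an illustration only, not DataRec's implementation: the function names, the toy data, and the threshold `k` are all assumptions. A single filtering pass and an iterative k-core filter use the same threshold yet yield different datasets.

```python
from collections import Counter

Interaction = tuple[str, str]  # (user_id, item_id)


def filter_once(interactions: list[Interaction], k: int) -> list[Interaction]:
    """Single pass: keep pairs whose user and item each have at least k interactions."""
    users = Counter(u for u, _ in interactions)
    items = Counter(i for _, i in interactions)
    return [(u, i) for u, i in interactions if users[u] >= k and items[i] >= k]


def k_core(interactions: list[Interaction], k: int) -> list[Interaction]:
    """Iterative variant: repeat the single pass until nothing more is removed."""
    current = list(interactions)
    while True:
        filtered = filter_once(current, k)
        if len(filtered) == len(current):
            return filtered
        current = filtered


# Hypothetical toy interaction log.
toy = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"),
       ("u3", "a"), ("u3", "c"), ("u4", "c")]

print(len(filter_once(toy, 2)))  # 6 interactions survive one pass
print(len(k_core(toy, 2)))       # 4 interactions survive the iterative filter
```

Two papers that both report "filtering with k = 2" could therefore be evaluating on different data, which is the kind of preprocessing inconsistency the episode describes.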

Episodes (589)

The Reliability of Mobile Phone Data

Our mobile phones generate an enormous amount of inbound and outbound data. In today's episode, Nishant Kishore, a PhD graduate of Harvard University in Infectious Disease Epidemiology, explains how mobility data from mobile phones can be captured and analysed to understand the spread of infectious diseases. Thanks to our sponsor, neptune.ai (https://neptune.ai/): log, store, query, display, organize, and compare all your model metadata in a single place.

13 Jun 2022 · 49 min

Haywire Algorithms

The pandemic changed how we lived, and this had a ripple effect on the performance of machine learning models. Ravi Parikh joins us today to discuss how the pandemic has affected the performance of machine learning models in clinical care, along with some actionable steps to fix it. Thanks to our sponsor: Astera Centerprise is a no-code data integration platform that allows users to build ETL/ELT pipelines for modern data warehousing and analytics.

6 Jun 2022 · 33 min

School Reopening Analysis

Carly Lupton-Smith joins us today to speak about her research, which investigated the consistency between household and county measures of school reopening. Carly is a doctoral researcher in Biostatistics at Johns Hopkins Bloomberg School of Public Health. Listen to learn about her findings. Thanks to our sponsors! ClearML is an open-source MLOps solution users love to customize, helping you easily Track, Orchestrate, and Automate ML workflows at scale. Astera Centerprise is a no-code data integration platform that allows users to build ETL/ELT pipelines for modern data warehousing and analytics.

30 May 2022 · 33 min

Modern Data Stacks

Today, we are joined by Alexander Thor, a Product Manager at Vizlib, makers of Astrato. Astrato is a data analytics and business intelligence tool built on the cloud and for the cloud. Alexander discusses the features and capabilities of Astrato for data professionals. Visit our website for additional show notes!

26 May 2022 · 34 min

Emoji as a Predictor

Emojis are arguably one of the most effective ways to express emotions when texting. In today's episode, Xuan Lu shares her research on the use of emojis by developers. She explains how the study of emojis can track the emotions of remote workers and predict future behavior. Listen to find out more!

23 May 2022 · 21 min

Polarizing Trends in the Gig Economy

On the show today, Fabian Braesemann, a research fellow at the University of Oxford, joins us to discuss his study analyzing the gig economy. He describes the trends he has observed since remote work became mainstream, the factors causing spatial polarization, and some downsides of the gig economy. Listen to learn what he found.

16 May 2022 · 46 min

Remote Learning in Applied Engineering

On the show today, we interview Mouhamed Abdulla, a professor of Electrical Engineering at Sheridan Institute of Technology. Mouhamed joins us to discuss his study on remote teaching and learning in applied engineering. He discusses how he embraced the new approach after the pandemic, the challenges he faced, and how he tackled them. Listen to find out more. Thanks to our sponsor, neptune.ai (https://neptune.ai/): log, store, query, display, organize, and compare all your model metadata in a single place.

12 May 2022 · 25 min

Remote Productivity

It is difficult to estimate the effect of remote working across the board. Darja Šmite, who speaks with us today, is a professor of Software Engineering at the Blekinge Institute of Technology. In her recently published paper, she analyzed data on several companies' activities before and after remote working became prevalent. She discusses the results, the reasons behind them, and some subtle drawbacks of remote working. Check it out!

9 May 2022 · 29 min
