DataRec Library for Reproducibility in Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Mario Mancino, a postdoctoral researcher at Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research, from version control issues to preprocessing inconsistencies, and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets like MovieLens, Last.fm, and Amazon reviews.

The conversation covers Alberto's research journey through knowledge graphs, graph-based recommenders, privacy considerations, and recommendation novelty. He explains why small modifications in datasets can significantly impact research outcomes, the importance of offline evaluation, and DataRec's vision as a lightweight library that integrates with existing frameworks rather than replacing them. Whether you're benchmarking new algorithms or exploring recommendation techniques, this episode offers practical insights into one of the most critical yet overlooked aspects of reproducible ML research.

Episodes (589)

Non-Response Bias

Today's show focuses on an essential problem in surveys: missing values, which are typically caused by low response rates or non-response from respondents. Yajuan Si is a Research Associate Professor at the Survey Research Center at the University of Michigan. She joins us to discuss dealing with bias from low survey response rates.

10 Apr 2023, 35 min

Measuring Trust in Robots with Likert Scales

We are joined by two guests today, Mariah, a Ph.D. student in the CORE Robotics Lab at Georgia Tech, and Matthew Gombolay, the Director of the CORE Robotics Lab. They both discuss practices for measuring a respondent's perception in a survey.

3 Apr 2023, 47 min

CAREER Prediction

Ever wondered what your next career would be? Today, Keyon Vafa, a computer science Ph.D. student at Columbia University, joins us to discuss his latest research on developing a machine-learning model for career prediction. Keyon speaks at length about how the model was developed and the possibilities it opens up.

27 Mar 2023, 40 min

The Panel Study of Income Dynamics

Noura Insolera, a Research Investigator with the Panel Study of Income Dynamics (PSID), joins us to share how PSID conducts longitudinal household surveys. She also shares some findings from their data exploration, particularly observations and trends in food insecurity.

21 Mar 2023, 34 min

Survey Design Working Session

Susan Gerbic joins Kyle to review some of the surveys Data Skeptic has launched, draft a new survey about podcast listening habits, and then review the results of that survey. You can see those results at https://survey.dataskeptic.com/survey/result/1675102237053. Watch the videos Susan mentioned on her YouTube page at https://www.youtube.com/playlist?list=PL7VAuaQDhPTVaLeI1IcpYph5lH19xA1u4.

14 Mar 2023, 1 h 1 min

Bot Detection and Dyadic Surveys

The use of social bots to fill out online surveys is becoming prevalent. Today, we speak with Sara Bybee, a postdoctoral research scholar at the University of Utah. Drawing on her research, Sara explains how she detected social bots, strategies to curb them, and how underrepresented groups can be better represented in surveys.

6 Mar 2023, 35 min

Reproducible ESP Testing

Our guest today is Zoltán Kekecs, who holds a Ph.D. in Behavioural Science. Zoltán highlights the problem of low replicability in journal papers and illustrates how researchers can better ensure that their research and findings can be fully replicated. He uses Bem's experiment as an example, discussing his methodology and results in detail.

20 Feb 2023, 47 min

A Survey of Data Science Methodologies

On the show, Iñigo Martinez, a Ph.D. student at the University of Navarra, shares the results of his survey investigating how data practitioners carry out data science projects. He describes the methodologies practitioners typically use and the success factors in data science projects.

13 Feb 2023, 24 min
