[MINI] Multi-armed Bandit Problems
Data Skeptic, 2 October 2015

The multi-armed bandit problem takes its name from slot machines ("one-armed bandits"). Given a pool of slot machines, each with an unknown payout frequency, how can you maximize your reward? If you knew in advance which machine was best, you would play that machine exclusively. Any strategy that falls short of this will, on average, earn less, and the shortfall is called the "regret".
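
One common way to write the regret down, as a sketch rather than the episode's own notation: assume K machines with unknown mean payouts \mu_1, ..., \mu_K, T plays in total, and a_t denoting the machine chosen on play t (all of these symbols are assumptions introduced here for illustration).

```latex
% Expected regret after T plays: payout of always playing the best
% machine, minus the expected payout of the machines actually chosen.
\[
  R_T \;=\; T\,\mu^{*} \;-\; \mathbb{E}\!\left[\sum_{t=1}^{T} \mu_{a_t}\right],
  \qquad \mu^{*} = \max_{1 \le k \le K} \mu_k .
\]
```

Playing the best machine on every turn gives R_T = 0; every play of a worse machine adds the gap between \mu^{*} and that machine's mean payout.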

You can try each slot machine to learn about it, which we refer to as exploration. When you've spent enough time to be convinced you've identified the best machine, you can then double down and exploit that knowledge. But how do you best balance exploration and exploitation to minimize the regret of your play?

This mini-episode explores a few examples, including restaurant selection and A/B testing, to illustrate the nature of this problem. At the end, we touch briefly on Thompson sampling as a solution.
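
To make the idea of Thompson sampling concrete, here is a minimal sketch, not taken from the episode. It assumes each machine pays out 0 or 1 with a fixed hidden probability (a Bernoulli bandit) and keeps a Beta posterior per machine, starting from a uniform Beta(1, 1) prior. The function name `thompson_sampling`, the `true_rates` values, and the round count are all hypothetical choices made for illustration.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit (illustrative sketch).

    true_rates: hidden payout probability of each machine (assumed values).
    Returns (pulls per machine, cumulative expected regret).
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    wins = [0] * n_arms    # observed payouts per machine
    losses = [0] * n_arms  # observed non-payouts per machine
    best_rate = max(true_rates)
    regret = 0.0

    for _ in range(n_rounds):
        # Draw one plausible payout rate per machine from its Beta posterior
        # (Beta(1, 1) prior updated with the wins and losses seen so far).
        samples = [
            rng.betavariate(wins[a] + 1, losses[a] + 1) for a in range(n_arms)
        ]
        # Play the machine whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate the payout and update that machine's counts.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1

        # Expected regret of this play: gap to the best machine's true rate.
        regret += best_rate - true_rates[arm]

    pulls = [wins[a] + losses[a] for a in range(n_arms)]
    return pulls, regret


if __name__ == "__main__":
    pulls, regret = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
    print("pulls per machine:", pulls)
    print("cumulative expected regret:", round(regret, 1))
```

Machines with few observations have wide posteriors, so they occasionally produce a high sample and get explored; as evidence accumulates, the samples concentrate and play shifts toward the machine that looks best. In this sketch the regret is computed from the hidden `true_rates`, which is only possible in a simulation.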
