Data Governance for Data Science with Adam Wood - #578

Today we’re joined by Adam Wood, Director of Data Governance and Data Quality at Mastercard. In our conversation with Adam, we explore the challenges that come along with data governance at a global scale, including dealing with regional regulations like GDPR and federating records at scale. We discuss the role of feature stores in keeping track of data lineage and how Adam and his team have dealt with the challenges of metadata management, how large organizations like Mastercard are dealing with enabling feature reuse, and the steps they take to alleviate bias, especially in scenarios like acquisitions. Finally, we explore data quality for data science and why Adam sees it as an encouraging area of growth within the company, as well as the investments they’ve made in tooling around data management, catalog, feature management, and more. The complete show notes for this episode can be found at twimlai.com/go/578

Episodes (777)

The Fallacy of "Ground Truth" with Shayan Mohanty - #576

Today we continue our Data-centric AI series joined by Shayan Mohanty, CEO at Watchful. In our conversation with Shayan, we focus on the data labeling aspect of the machine learning process, and ways that a data-centric approach could add value and reduce cost by multiple orders of magnitude. Shayan helps us define “data-centric”, while discussing the main challenges that organizations face when dealing with labeling, how these problems are currently being solved, and how techniques like active learning and weak supervision could be used to label more effectively. We also explore the idea of machine teaching, which focuses on using techniques that make the model training process more efficient, and what organizations need to be successful when trying to make the mindset shift to DCAI. The complete show notes for this episode can be found at twimlai.com/go/576

30 May 2022, 51 min

Principle-centric AI with Adrien Gaidon - #575

This week, we continue our conversations around the topic of Data-Centric AI joined by a friend of the show Adrien Gaidon, the head of ML research at the Toyota Research Institute (TRI). In our chat, Adrien expresses a fourth, somewhat contrarian, viewpoint to the three prominent schools of thought that organizations tend to fall into, as well as a great story about how the breakthrough came via an unlikely source. We explore his principle-centric approach to machine learning as well as the role of self-supervised machine learning and synthetic data in this and other research threads. Make sure you’re following along with the entire DCAI series at twimlai.com/go/dcai. The complete show notes for this episode can be found at twimlai.com/go/575

23 May 2022, 47 min

Data Debt in Machine Learning with D. Sculley - #574

Today we kick things off with a conversation with D. Sculley, a director on the Google Brain team. Many listeners of today’s show will know D. from his work on the paper, The Hidden Technical Debt in Machine Learning Systems, and of course, the infamous diagram. D. has recently translated the idea of technical debt into data debt, something we spend a bit of time on in the interview. We discuss his view of the concept of DCAI, where debt fits into the conversation of data quality, and what a shift towards data-centrism looks like in a world of increasingly large models, e.g. GPT-3 and the recent PaLM models. We also explore common sources of data debt, what the community can do and has done to mitigate these issues, the usefulness of causal inference graphs in this work, and much more! If you enjoyed this interview or want to hear more on this topic, check back on the DCAI series page weekly at https://twimlai.com/podcast/twimlai/series/data-centric-ai. The complete show notes for this episode can be found at twimlai.com/go/574

19 May 2022, 36 min

AI for Enterprise Decisioning at Scale with Rob Walker - #573

Today we’re joined by Rob Walker, VP of decisioning & analytics and GM of one-to-one customer engagement at Pegasystems. Rob, who you might know from his previous appearances on the podcast, joins us to discuss his work on AI and ML in the context of customer engagement and decisioning, and the various problems that need to be solved, including the “next best” problem. We explore the distinction between the idea of the next best action and determining it from a recommender system, how machine learning and heuristics currently co-exist in engagements, scaling model evaluation, and some of the challenges they’re facing when dealing with problems of responsible AI and how they’re managed. Finally, we spend a few minutes digging into the upcoming PegaWorld conference, and what attendees should anticipate at the event. The complete show notes for this episode can be found at twimlai.com/go/573

16 May 2022, 39 min

Data Rights, Quantification and Governance for Ethical AI with Margaret Mitchell - #572

Today we close out our coverage of the ICLR series joined by Meg Mitchell, chief ethics scientist and researcher at Hugging Face. In our conversation with Meg, we discuss her participation in the WikiM3L Workshop, as well as her transition into her new role at Hugging Face, which has afforded her the ability to prioritize coding in her work around AI ethics. We explore her thoughts on the work happening in the fields of data curation and data governance, her interest in the inclusive sharing of datasets and creation of models that don't disproportionately underperform or exploit subpopulations, and how data collection practices have changed over the years. We also touch on changes to data protection laws happening in some pretty uncertain places, the evolution of her work on Model Cards, and how she’s using this and recent Data Cards work to lower the barrier to entry to responsibly informed development of data and sharing of data. The complete show notes for this episode can be found at twimlai.com/go/572

12 May 2022, 41 min

Studying Machine Intelligence with Been Kim - #571

Today we continue our ICLR coverage joined by Been Kim, a staff research scientist at Google Brain, and an ICLR 2022 Invited Speaker. Been, whose research has historically been focused on interpretability in machine learning, delivered the keynote “Beyond interpretability: developing a language to shape our relationships with AI”, which explores the need to study AI machines as scientific objects, both in isolation and together with humans. Doing so will provide principles for tools and is also necessary to take our working relationship with AI to the next level. Before we dig into Been’s talk, she characterizes where we are as an industry and community with interpretability, and what the current state of the art is for interpretability techniques. We explore how the Gestalt principles appear in neural networks, Been’s choice to characterize communication with machines as a language as opposed to a set of principles or foundational understanding, and much, much more. The complete show notes for this episode can be found at twimlai.com/go/571

9 May 2022, 52 min

Advances in Neural Compression with Auke Wiggers - #570

Today we’re joined by Auke Wiggers, an AI research scientist at Qualcomm. In our conversation with Auke, we discuss his team’s recent research on data compression using generative models. We discuss the relationship between historical compression research and the current trend of neural compression, and the benefit of neural codecs, which learn to compress data from examples. We also explore the performance evaluation process and the recent developments that show that these models can operate in real-time on a mobile device. Finally, we discuss another ICLR paper, “Transformer-based transform coding”, that proposes a vision transformer-based architecture for image and video coding, and some of his team’s other accepted works at the conference. The complete show notes for this episode can be found at twimlai.com/go/570

2 May 2022, 37 min

Mixture-of-Experts and Trends in Large-Scale Language Modeling with Irwan Bello - #569

Today we’re joined by Irwan Bello, formerly a research scientist at Google Brain, and now on the founding team at a stealth AI startup. We begin our conversation with an exploration of Irwan’s recent paper, Designing Effective Sparse Expert Models, which acts as a design guide for building sparse large language model architectures. We discuss mixture of experts as a technique, the scalability of this method, its applicability beyond NLP tasks, and the data sets this experiment was benchmarked against. We also explore Irwan’s interest in the research areas of alignment and retrieval, talking through interesting lines of work for each area including instruction tuning and direct alignment. The complete show notes for this episode can be found at twimlai.com/go/569

25 April 2022, 46 min
