The Stack Overflow Podcast, 23 June 2021

From search trees to neural nets, a deep dive into natural language processing

We chatted with three guests:

Jette: founder of Rev’s AI department

Josh Dong: AI Engineering Manager

Jenny Drexler: Senior Speech Scientist

When Jette was studying mathematics in the early 2000s, his focus was on computational biology, and more specifically on phylogenetic trees and DNA sequences. He wanted to understand the evolution of certain traits and the forces that explain why our bones are a certain length or our brains a certain size. As it turned out, the algorithms and techniques he learned in this field mapped very well to the emerging discipline of automatic speech recognition, or ASR.
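
The show notes do not say which algorithms carried over, so the following is only an illustrative sketch, not a description of Jette's or Rev's actual methods. One technique that is common to both fields is dynamic-programming sequence alignment: the same edit-distance recurrence can compare two DNA strings or score a transcript against a reference as a word error rate. The function names `edit_distance` and `word_error_rate` are hypothetical and chosen for this example.

```python
from typing import Hashable, Sequence


def edit_distance(ref: Sequence[Hashable], hyp: Sequence[Hashable]) -> int:
    """Minimum number of substitutions, insertions, and deletions
    needed to turn `ref` into `hyp` (Levenshtein distance)."""
    # prev[j] is the distance between the items of ref processed so far
    # (initially none) and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i items of ref into an empty hyp takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # Same function, two domains: characters of a DNA string...
    print(edit_distance("GATTACA", "GATGACA"))  # 1
    # ...and words of a transcript.
    print(round(word_error_rate("the cat sat on the mat",
                                "the cat sat on a mat"), 3))  # 0.167
```

Because the function only relies on equality between items, it works unchanged on nucleotides, characters, or words, which is one concrete way in which tools from sequence analysis can transfer to speech recognition.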

During this period, Montreal was emerging as a hotbed for artificial intelligence, and Jette found himself working for Nuance, the company behind the original implementation of Siri. That experience led him to several positions in the world of speech recognition, and he eventually landed at Rev, where he founded the company’s AI department.

Jette describes Rev as an “Uber for Transcription.” Anyone can sign up for the platform and earn money by listening to audio submitted by clients and transcribing the speech into text. This means the company has a tremendous dataset of raw audio that has been annotated by human beings and, in many cases, assessed a second time by the client. For someone looking to build an AI system that mastered the domain of speech to text, this was a goldmine.

Jette built the earliest version of Rev’s AI, but it was up to our second guest, Josh Dong, to productize and scale that system. He helped the department transition from older technologies like Perl to more popular languages like Python. He also focused on practical concerns like modularity and reusable components. To combine machine learning and DevOps, Dong added Docker containers and a testing pipeline. If you’re interested in the nuts and bolts of keeping a system like Rev’s running at tremendous scale, you’ll want to check out this part of the show.

We also explore the future and promise of this technology in our time with Jenny Drexler. She explains how Rev is moving from a hybrid model (one that combines Jette’s older statistical techniques with Dong’s newer machine learning approach) to a new system that will be ML from end to end. This will open the door to powerful applications, like a single system that can convert speech to text across multiple languages in a single piece of audio.

“One of the things that's really cool about these end to end models is that basically, whatever data you have, it can learn to handle it. So a very similar architecture can do sequence to sequence learning with different kinds of sequences. The model architecture that you might use for speech recognition can actually look very similar to what you might use for translation. And you can use that same architecture, to say, feed in audio in lots of different languages and be able to do transcription for any of them within one model. It's much harder with the hybrid models to sort of put all the right pieces together to make that happen,” explains Drexler.

If you’re interested in learning more about the past, present, and future of artificial intelligence that can understand our spoken language and learn how to respond, check out the full episode. If you want to learn more about Rev or see the positions they have open, visit the company’s careers page.

