From search trees to neural nets, a deep dive into natural language processing


We chatted with three guests:

Miguel Jetté: Head of AI R&D

Josh Dong: AI Engineering Manager

Jenny Drexler: Senior Speech Scientist

When Jetté was studying mathematics in the early 2000s, his focus was on computational biology, and more specifically on phylogenetic trees and DNA sequences. He wanted to understand the evolution of certain traits and the forces that explain why our bones are a certain length or our brains a certain size. As it turned out, the algorithms and techniques he learned in this field mapped very well to the emerging discipline of automatic speech recognition, or ASR.

During this period, Montreal was emerging as a hotbed for artificial intelligence, and Jetté found himself working for Nuance, the company whose speech recognition technology was behind the original implementation of Siri. That experience led him to several positions in the world of speech recognition, and he eventually landed at Rev, where he founded the company’s AI department.

Jetté describes Rev as an “Uber for Transcription.” Anyone can sign up for the platform and earn money by listening to audio submitted by clients and transcribing the speech into text. This means the company has a tremendous dataset of raw audio that has been annotated by human beings and, in many cases, assessed a second time by the client. For someone looking to build an AI system that masters the domain of speech to text, this was a goldmine.

Jetté built the earliest version of Rev’s AI, but it was up to our second guest, Josh Dong, to productize and scale that system. He helped the department transition from older technologies like Perl to more popular languages like Python. He also focused on practical concerns like modularity and reusable components. To combine machine learning and DevOps, Dong added Docker containers and a testing pipeline. If you’re interested in the nuts and bolts of keeping a system like Rev’s running at tremendous scale, you’ll want to check out this part of the show.

In our time with Jenny Drexler, we also explore the future of this technology and the promise it holds. She explains how Rev is moving from a hybrid model, one that combines Jetté’s older statistical techniques with Dong’s newer machine learning approach, to a new system that will be ML from end to end. This will open the door to powerful applications, like a single system that can convert speech to text across multiple languages in a single piece of audio.

“One of the things that's really cool about these end to end models is that basically, whatever data you have, it can learn to handle it. So a very similar architecture can do sequence to sequence learning with different kinds of sequences. The model architecture that you might use for speech recognition can actually look very similar to what you might use for translation. And you can use that same architecture, to say, feed in audio in lots of different languages and be able to do transcription for any of them within one model. It's much harder with the hybrid models to sort of put all the right pieces together to make that happen,” explains Drexler.

If you’re interested in learning more about the past, present, and future of artificial intelligence that can understand our spoken language and learn how to respond, check out the full episode. If you want to learn more about Rev or check out some of the positions they have open, you can find their careers page here.

