The One With Shannon Brady and Operating Systems

In this episode of the Prodcast, guest Shannon Brady speaks with hosts Jordan Greenberg and Florian Rathgeber about managing Google's vast fleet of internal devices. Shannon explains how Google's Linux platform uses core SRE principles (specifically testing, canarying, and monitoring) for weekly staged rollouts of its Debian-based distribution. Configuration is efficiently managed using Puppet to ensure the right setup for a diverse user base. The conversation pivots to "the year of Linux everything," underscoring Linux's widespread adoption. Discussing AI, Shannon identifies its greatest utility for SREs in rapidly analyzing signals and generating complex queries to resolve outages. This episode reinforces that practicing SRE fundamentals is paramount, demonstrating that you can be an SRE at heart, regardless of your official title.

Episodes (51)

Profiling data with Pat Somaru and Narayan Desai

In this episode, guests Narayan Desai (Principal SRE, Google) and Pat Somaru (Senior Production Engineer, Meta) join hosts Steve McGhee and Florian Rathgeber to discuss the challenges of observability...

30 Oct 2024 · 42 min

Google Public DNS (8.8.8.8) with Wilmer van der Gaast and Andy Sykes

This episode features Google engineers Wilmer van der Gaast (Production on-call) and Andy Sykes (Senior Staff Systems Engineer, SRE), who join hosts Steve McGhee and Jordan Greenberg to discuss the de...

23 Oct 2024 · 32 min

SRE in the Retail and Gaming Worlds with Jordan Chernev & Scott Bowers

Guests Jordan Chernev (Senior Technology Executive) and Scott Bowers (SRE, Gearbox Software), who hail from the retail and gaming industries, respectively, join hosts Steve McGhee and Jordan Greenberg ...

16 Oct 2024 · 33 min

Incident Response with Sarah Butt and Vrai Stacey

Sarah Butt (Principal Engineer, Centralized Incident Response, Salesforce) and Vrai Stacey (Staff Software Engineer, Google) join hosts Steve McGhee and Jordan Greenberg to dive into incident response...

9 Oct 2024 · 43 min

Building Reliable Systems with Silvia Botros and Niall Murphy

Silvia Botros (SRE Architect, Twilio | Author of "High Performance MySQL, 4th edition") and Niall Murphy (Co-founder & CEO, Stanza) join hosts Steve McGhee and Jordan Greenberg to discuss cultural sh...

2 Oct 2024 · 42 min

Creating Systems that are Safe with Liz Fong-Jones

Liz Fong-Jones (former Google SRE and current Field CTO at honeycomb.io) joins hosts Steve McGhee and Jordan Greenberg for a lively discussion centered around observability, its evolution from monitor...

25 Sep 2024 · 28 min

Production Problems Are For All! with Ben Treynor Sloss

Ben Treynor Sloss (VP of Engineering, Google) joins hosts Steve McGhee and Dr. Jennifer Petoff (Director of Technical Infrastructure Education, Google) to share the evolution of SRE and its impact on ...

18 Sep 2024 · 31 min

There Remains a Huge Amount of Work to Do, with Healfdene Goguen

In this episode, Healfdene Goguen (Principal Engineer, Google) joins hosts Steve McGhee and Jordan Greenberg to discuss the vast amount of work to be done by SREs, and the fascinating challenges to ta...

11 Sep 2024 · 26 min