The One With Carla Geisser and Crisis Engineering

Join us for a discussion with Carla Geisser of Layer Aleph, a company focused on "crisis engineering". Carla distinguishes a crisis from a standard incident by noting that a crisis is novel and lacks a playbook. She outlines five criteria for a true crisis: fundamental surprise, broken critical functions, high visibility, a rigid deadline (unlike internal tech deadlines), and perception breakdown. Crises often arise in organizations that struggle to admit computers control core decisions, leading to complex, glued-together systems. Carla emphasizes that SRE-adjacent skills are essential for connecting the dots and exposing the full system. The key takeaway for SREs is to recognize when a true crisis is happening, as leadership will only be willing to "break rules" and enable substantive change once three of these criteria are met.

This episode was added to Podme through an open RSS feed and is not produced by Podme. It may therefore contain advertising.

Episodes (55)

Matt Zelesko and the Future of SRE

We sit down with Matt Zelesko, VP of SRE at Google, for a candid talk about how AI is changing SRE — and how it's not.

26 May, 23 min

Handling Burnout with Sam Anderson

Sam Anderson shares his experiences with burnout, and how to support yourself as a reliable system. Sam provides guidance on how to deal with burnout, and some suggestions on how to avoid burnout thr...

21 May, 10 min

The One with Crisis Engineering and Mikey Dickerson

Crisis Engineer Mikey Dickerson joins us to talk about what constitutes a crisis. Mikey draws on his broad experience across industry and the public sector, as well as on work with his team of systems...

15 May, 43 min

This is Fine! With Colette Alexander and Clint Byrum

What's happening in the world of SRE and resilience engineering? Join us as we catch up with fellow podcast hosts Colette Alexander and Clint Byrum of the This Is Fine! podcast at SREcon in Seattle.

12 May, 9 min

The One With Damion Yates and Building AI systems

How do you introduce Site Reliability Engineering to an AI research lab, bringing concepts of scale to engineers who are at the leading edge of AI systems? In the latest episode of The Prodcast, hosts...

26 February, 31 min

The One with Parker Barnes, Felipe Tiengo Ferreira, and AI

This episode of the Prodcast tackles the challenges of maintaining AI safety and alignment in production. Guests Felipe Tiengo Ferreira and Parker Barnes join hosts Matt Siegler and Steve McGhee to di...

5 February, 36 min

The One With Shannon Brady and Operating Systems

In this episode of the Prodcast, guest Shannon Brady speaks with hosts Jordan Greenberg and Florian Rathgeber about managing Google's vast fleet of internal devices. Shannon explains how Google's Linu...

28 January, 24 min