The One With Carla Geisser and Crisis Engineering

Join us for a discussion with Carla Geisser of Layer Aleph, a company focused on "crisis engineering". Carla distinguishes a crisis from a standard incident by noting that a crisis is novel and lacks a playbook. She outlines five criteria for a true crisis: fundamental surprise, broken critical functions, high visibility, a rigid deadline (unlike internal tech deadlines), and perception breakdown. Crises often arise in organizations that struggle to admit computers control core decisions, leading to complex, glued-together systems. Carla emphasizes that SRE-adjacent skills are essential for connecting the dots and exposing the full system. The key takeaway for SREs is to recognize when a true crisis is happening, as leadership will only be willing to "break rules" and enable substantive change once three of these criteria are met.

Episodes (51)

The One with Ben Good and Our Kubernetes Friends

In this special episode, hosts Steve McGhee from the Google SRE Prodcast and Kaslin Fields from the Google Kubernetes Podcast welcome Google Cloud Solutions Architect Ben Good to discuss platform engi...

30 July 2025, 32 min

The One With AI Agents, Ramón Llamas, and Swapnil Haria

Google Staff SRE Ramón Llamas and Google Software Engineer Swapnil Haria join our hosts to explore how AI agents are revolutionizing production management, from summarizing alerts and finding hidden e...

23 July 2025, 42 min

The One with Technical Program Managers and Karanveer Anand

This episode features Google Technical Program Manager (TPM) Karanveer Anand, who joins our hosts to discuss the unique role of TPMs in Site Reliability Engineering (SRE). The conversation highlights ...

16 July 2025, 27 min

The One with STPA, Jeffrey Snover, and Theo Klein

This episode discusses Systems Theoretic Process Analysis (STPA), a method for analyzing complex systems. Theo Klein, a Google SRE, and Jeffrey Snover, a Distinguished Engineer at Google, explain that...

2 July 2025, 37 min

The One with Startups and Adam Fletcher

In this episode, hosts Steve McGhee and Matt Siegler are joined by guest Adam Fletcher, CEO and Co-Founder of MarketStreet. They discuss the current state of web development with LLMs, managing techn...

25 June 2025, 41 min

The One with SLOs and Sal Furino

In this episode, Sal Furino, Customer Reliability Engineer at Bloomberg, discusses all things Service Level Objectives (SLOs) with hosts Steve McGhee and Matt Siegler. Together, they dig into what suc...

18 June 2025, 43 min

The One With the Future of SRE and Matt Zelesko

Matt Zelesko, the head of Site Reliability Engineering at Google, discusses the evolution of SRE, highlighting the shift from traditional operations to a model that balances velocity and reliability t...

11 June 2025, 26 min

The One with AI and Todd Underwood

In this Google Prodcast episode, Todd Underwood, a reliability expert from Anthropic with experience at Google and OpenAI, discusses the current state and future of AI in SRE. Todd and the hosts focus...

4 June 2025, 43 min
