SRE III with Steve McGhee and Yuri Grinshteyn

Our old pal Mark Mirchandani is back this week, joining Stephanie Wong and our guests Steve McGhee and Yuri Grinshteyn to talk about Site Reliability Engineering. SRE is Google's way of helping companies of all sizes create consistent, predictable, and functional projects. It helps clients approach operations from a software engineering standpoint so that growing systems can be managed efficiently.

We talk about the challenges of implementing SRE best practices and how companies can overcome them. Though the benefits of SRE are many, the approach can be difficult for clients to grasp. Steve and Yuri walk us through the process they follow with customers to help them set realistic goals and build reliable, scalable projects with little downtime. By starting small and taking wins early, Steve says, clients reap the rewards of SRE and are encouraged to push forward. Yuri's customer-centric approach encourages companies to prioritize alerts that affect the user experience, which limits inbox mayhem and keeps customers happy. Alerts based on symptoms, Steve says, help accomplish this goal.
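As a rough illustration of the symptom-based alerting idea mentioned above, here is a minimal sketch. It is not taken from the episode: the `Window` shape, the 99.9% SLO target, and the burn-rate threshold of 10 are all assumptions chosen for the example. The point is only that the alert keys off what users experience (failed requests) rather than an internal cause such as CPU load.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Request counts for one evaluation window (hypothetical shape)."""
    total_requests: int
    failed_requests: int


def error_ratio(window: Window) -> float:
    """Fraction of requests that failed from the user's point of view."""
    if window.total_requests == 0:
        return 0.0  # no traffic, so no user-visible failures
    return window.failed_requests / window.total_requests


def should_page(window: Window,
                slo_target: float = 0.999,      # assumed SLO, not from the episode
                burn_threshold: float = 10.0    # assumed paging threshold
                ) -> bool:
    """Page only when errors consume the error budget much faster than planned."""
    error_budget = 1.0 - slo_target
    burn_rate = error_ratio(window) / error_budget
    return burn_rate >= burn_threshold


# A small blip stays quiet; a user-visible spike pages someone.
quiet = should_page(Window(total_requests=10_000, failed_requests=5))
noisy = should_page(Window(total_requests=10_000, failed_requests=200))
```

With these assumed numbers, 5 failures in 10,000 requests burns the budget at about half the planned rate and stays quiet, while 200 failures burns it roughly 20 times faster and triggers a page.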

Later, Yuri and Steve describe the best ways for companies to get started with SRE. Realistic goals and specific, detailed plans can make the journey less bumpy for clients, and Google's SRE team can help.

Steve McGhee

Steve was an SRE at Google for about 10 years, then left to help a company build reliable systems on the Cloud. Now he's back at Google, helping more companies do that.

Yuri Grinshteyn

Yuri works with Google Cloud Platform customers to help them design, architect, build, and operate reliable applications and services. He also advocates for SRE principles and practices on YouTube and elsewhere.

Cool things of the week
  • Fresh updates: Google Cloud 2021 Summits blog
  • Why you need to explain machine learning models blog
    • GCP Podcast Episode 260: Responsible AI with Craig Wiley and Tracy Frey podcast
    • GCP Podcast Episode 249: ML Lifecycle with Dale Markowitz and Craig Wiley podcast
    • GCP Podcast Episode 214: AI in Healthcare with Dale Markowitz podcast
Interview
  • Site Reliability Engineering site
  • Reliability Architecture Framework site
  • Site Reliability Engineering: Measuring and Managing Reliability on Coursera site
  • Developing a Google SRE Culture on Coursera site
  • How Lowe's meets customer demand with Google SRE practices blog
  • GCP Podcast Episode 68: The Home Depot with William Bonnell podcast
  • GCP Podcast Episode 213: The Art of SLOs with Alex Bramley podcast
  • GCP Podcast Episode 127: SRE vs Devops with Liz Fong-Jones and Seth Vargo podcast
  • GCP Podcast Episode 72: Customer Reliability Engineering with Luke Stone podcast
  • GCP Podcast Episode 38: Site Reliability Engineering with Paul Newson podcast
  • GCP Podcast Episode 59: SRE II with Paul Newson podcast
What's something cool you're working on?

Yuri has been working on Engineering for Reliability.

Stephanie has been working on her new series What's New in Networking.

