Kubecon SC25 Debrief
cloud2030, 12 June

In this episode, we debrief several industry events I went to last year, including Supercomputing, KubeCon, Stack, the AI Infrastructure Show, and the Red Hat AI Infrastructure Summit. We dive deep into some observations from the shows and what they tell us about the gaps and fractures in how we are working to build AI infrastructure. We focus on how observability is being used for evaluation, tuning, performance issues, GPU dropouts, and cluster management, while anomaly detection and root cause analysis remain less common, and we note that networking is still underserved. We also get into the shift from building clusters to observing and fixing them after deployment, especially for agentic systems, and we end by highlighting the need for observability across application, identity, networking, and infrastructure layers. Transcript: https://otter.ai/u/y6FNvERJRe_8qnmAgVlmvd6kwb8?utm_source=copy_url

This episode was added to Podme through an open RSS feed and is not produced by Podme. The episode may therefore contain advertising.

Episodes (500)

AWS Outage

In this episode, we discuss the October 2025 Amazon outage. The conversation took place during the outage, and though it’s been a few months now, the insights and discussions are still very interestin...

5 June, 38 min

Back After a Break

In this episode, we discuss the rising cost of using AI and how usage-based pricing, model changes, and capacity limits are affecting daily work as AI moves from experimentation into operational use. ...

29 May, 28 min

MCP Agents and Context

In this episode, we continue our journey even deeper into agentic vibe coding and other AI-based automation. This time we focus on Model Context Protocol (MCP) and its application in our bare meta...

22 May, 1 h 2 min

Vibe Coding Mapping [TechOps]

In this episode, we continue our Vibe Coding experiment. Now that we’ve figured out how to interface with MaaS, this time we wrestle with mapping and how different systems interact with each other. We...

15 May, 43 min

Rob Weinhold: The Art of Crisis Leadership [Cloud 2030 Book Club]

In this episode, we talk about Rob Weinhold's book, "The Art of Crisis Leadership." We explore the vital principle of "owning your narrative" in crisis management, and we share some personal stories r...

8 May, 52 min

Vibe Coding for Ops [TechOps]

In this episode, we do some live vibe coding: using AI to write code. We share tips and tricks on having the best vibe coding experience and avoiding some common pitfalls. You'll get to hear what we d...

17 October 2025, 40 min

Infrastructure Summit Debrief

Rich Miller and I debrief our experiences at the AI Infrastructure and Edge Summit in Santa Clara. This show was interesting in the way it brought together many different pieces related to AI. We...

10 October 2025, 24 min