Real Life Chaos Monkeys And Other Infrastructure Challenges
cloud2030
5 Aug 2022

How do we use chaos monkeys in real life, and practically? In effect it happens all the time, whenever we have failures. The Rogers outage that took out internet and cell phone service in Canada last week was the start of our discussion. Predicting how things will fail is a common theme for chaos monkeys, and it comes back to how we test infrastructure. Should we put infrastructure under stress in planned ways, as Chaos Monkey does, to ensure that our increasingly internet- and power-dependent society is prepared for the inevitable outages? We have a fascinating discussion about what it would take to make this type of practice real, including alternatives that people can look at today.

Transcript: https://otter.ai/u/D0ZV5c3ikvAiinsK7ugf_duCjv8

Image: https://www.pexels.com/photo/monkey-sitting-on-a-fence-and-looking-at-its-hands-11999152/

Episodes (500)

Logging [TechOps Series]

We dive deep into logging, tracing, metrics, and observability, with a specific filter for automation, systems, and infrastructure. There's a real challenge here of how you capture information from a ...

1 Nov 2024, 51 min

Eve Of A Nuclear Renaissance?

Reference: https://en.wikipedia.org/wiki/Pebble-bed_reactor Do nuclear power and a potential renaissance in nuclear power, driven by the voracious power demands for data centers, have the potential o...

25 Oct 2024, 48 min

Reading Logs And Events [Techops]

This TechOps episode explores the challenges of processing events and logs in technical operations. The discussion covers the importance of understanding the intent and purpose of building systems d...

18 Oct 2024, 11 min

Training Small LLMs

In this episode, we dive deep into the emerging world of building and training small language models. We'll discuss the benefits, risks, and challenges companies face as they work to create more targe...

11 Oct 2024, 1 h

Process: Good, Bad And Ugly

This podcast episode explores the challenges of process improvement in IT operations, using examples from data centers, automotive, and cybersecurity. The discussion covers the slow evolution of sec...

5 Oct 2024, 56 min

Containers Manager [TechOps]

In this episode, we continue our TechOps series, diving deep into the topic of container management. As containers become increasingly mainstream, the need to effectively manage and orchestrate these ...

27 Sep 2024, 50 min

Supply Chain Security [Tech Ops]

In this episode, we dive deep into a recent and highly sophisticated SSH intrusion attack that was discovered in the Linux kernel. We'll discuss how the attackers were able to inject a backdoor into a...

20 Sep 2024, 17 min

Software Bill of Materials [TechOps 006]

A software bill of materials is the idea that we can define and document exactly what goes into a system. We look at governance today and SBOMs as we put it together, both from a software and an opera...

13 Sep 2024, 43 min