Autoscaling Stateful Workloads in Kubernetes (DoK Day EU 2022) // Mohammad Fahim Abrar & Md. Kamol Hasan

https://go.dok.community/slack
https://dok.community/
From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)

Managing stateful workloads in a containerized environment has always been a concern. As Kubernetes matured, the community worked hard to support stateful workloads and meet the needs of enterprise users. The result is StatefulSets, which have been stable since Kubernetes 1.9. Kubernetes users can now run stateful applications such as databases, AI workloads, and big data systems.

Kubernetes also lets us automate many administration tasks, including provisioning and scaling. Rather than allocating resources manually, we can build automated procedures that save time, respond faster to peaks in demand, and reduce costs by scaling down when resources are not required. Autoscaling stateful workloads is therefore important for fault tolerance, high availability, and cost management.

There are still a few challenges in autoscaling stateful workloads in Kubernetes. They relate to horizontal and vertical scaling and to automating the scaling process. In horizontal scaling, when scaling up we need to make sure that the new workloads join the existing ones in terms of collaboration, integration, load sharing, and so on. When scaling down, we need to make sure that no data is lost and that ongoing tasks are completed, transferred, or aborted. If the workloads use a primary-standby architecture, scale-up or scale-down should happen on standby workloads first so that failovers are minimized. When scaling down, the targeted workloads must also be excluded from voting to prevent quorum loss. Similarly, when scaling up, the new workloads must join the voting.
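The scale-down ordering described above can be illustrated with a small sketch. This is not the speakers' implementation or any operator's API: the `Member` type, the step names `exclude_from_voting` and `drain_and_stop`, and the choice to remove the last-listed standby first are all assumptions made for illustration. It only shows the idea of touching standbys before the primary and removing a member from the voting set before stopping it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Member:
    """Hypothetical view of one replica in a primary-standby group."""
    name: str
    role: str  # "primary" or "standby"


def majority(voters: int) -> int:
    """Smallest number of voters that forms a quorum."""
    return voters // 2 + 1


def plan_scale_down(members: List[Member], target_size: int) -> List[Tuple[str, str]]:
    """Return ordered (action, member) steps to shrink the group.

    Only standbys are removed, so no failover is triggered, and each
    member leaves the voting set before it is stopped.
    """
    if target_size < 1:
        raise ValueError("at least one member (the primary) must remain")
    surplus = len(members) - target_size
    if surplus <= 0:
        return []  # nothing to remove
    standbys = [m for m in members if m.role == "standby"]
    if surplus > len(standbys):
        raise ValueError("scale-down would have to remove the primary")
    steps: List[Tuple[str, str]] = []
    # Assumption: remove the last-listed standbys first.
    for member in reversed(standbys[-surplus:]):
        steps.append(("exclude_from_voting", member.name))
        steps.append(("drain_and_stop", member.name))
    return steps


group = [Member("db-0", "primary")] + [Member(f"db-{i}", "standby") for i in range(1, 5)]
plan = plan_scale_down(group, target_size=3)
```

Here a five-member group needs three votes for a quorum; after the plan runs, the three remaining members need two. A real operator would also have to check member health and wait for each membership change to be acknowledged before moving to the next step.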
When new resources are required, we have to make the tradeoff between vertical scaling and horizontal scaling. When it comes to automation, we have to determine how to generate resource (CPU/memory) recommendations for the workloads and when to trigger autoscaling.

A group of workloads may need to be autoscaled together. For example, in sharded databases, each shard is represented by one StatefulSet, but the database operator treats all shards alike. Each shard may have its own recommendations, so we have to find a way to scale them all with the same recommendations. We also need to determine what happens when an autoscaling operation fails, and what happens to future recommendations after the failure.

Some workloads may need a managed restart. For example, in a database, secondary nodes may need to be restarted before the primary. In this case, how do we do a managed restart while autoscaling? We also need to figure out what happens when the workloads are going through maintenance. We will try to answer some of those questions throughout our session.

-----

Fahim is a Software Engineer working at AppsCode Inc. He has been involved with the Kubernetes project since 2018 and is very enthusiastic about Kubernetes and open source in general.

-----

Md. Kamol Hasan is a professional software developer with expertise in Kubernetes and backend development in Go, and one of the lead engineers of the KubeDB and KubeVault projects. A competitive programmer, Kamol has participated in various national and international programming contests, including ACM ICPC and NCPC.

Episodes (243)

What's New in Kubernetes Storage (DoK Day EU 2022) // Xing Yang

https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) Kubernetes SIG Storage is responsible for ensuring storage is available for containers in...

28 May 2022, 9 min

What we've learned from running a PostgreSQL managed service on Kubernetes (DoK Day EU 2022) // Oleksii Kliukin

https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) Kubernetes is an emerging platform of choice for deploying and running PostgreSQL. Deplo...

28 May 2022, 11 min

Weathering The Cloud Storm- Modern Data Management Patterns for Reliability and Availability (DoK Day EU 2022) // Denis Magda

https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) “Zero downtime” and “always-on” are illusions. All systems fail sooner or later, whether ...

28 May 2022, 10 min

Using Kubernetes to deliver a “serverless” service (DoK Day EU 2022) // Jim Walker

https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) Serverless promises to change the way we consume software. It allows us to potentially pa...

28 May 2022, 20 min

The many uses of Kubernetes cross cluster migration of persistent data (DoK Day EU 2022) // Ryan Kaw

https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) Multiple clusters exist in most Kubernetes environments today, and number of clusters wil...

28 May 2022, 7 min

The future of data on Kubernetes with Adobe and CNCF (DoK Day EU 2022) // Joseph Sandoval, Xing Yang & Sylvain Kalache

https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) Some data-intensive workloads are easier to run in Kubernetes than others. Why? What need...

28 May 2022, 17 min

The Data on Kubernetes Landscape (DoK Day EU 2022) // Melissa Logan & Sylvain Kalache

https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) We know from the first Data on Kubernetes Report that 90% of respondents believe Kubernet...

27 May 2022, 10 min

Testing the Mettle- Evaluating data solutions for large-scale production to check who stacks up (DoK Day EU 2022) // Dinesh Majrekar

https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) The state of the CNCF Storage options has exploded in the past few years, but if you had ...

27 May 2022, 9 min