DoK Community #42 Spark on Kubernetes is Now Generally Available: Why & How to Migrate to It // Jean-Yves Stephan

Abstract of the talk…

Apache Spark has run natively on Kubernetes (instead of Hadoop YARN) since 2018, but only with Spark 3.1 (released in March 2021) did the integration become officially generally available and production-ready. What is the high-level architecture of Spark on Kubernetes, how does it compare to the alternatives, and what does the migration look like? These are some of the questions we will answer together. We will first introduce the core concepts, then go through the stories of customers who migrated, and then give you concrete technical tips to help you be successful with Spark (on Kubernetes). If time permits, I may do a risky live demo. This will be a technical talk with very fresh content, and I hope you will like it. I plan to keep it short enough to leave room for Q&A and improvisation based on your requests, so let me know if there's something specific you're interested in.

Bio…

I'm one of the co-founders of Data Mechanics (https://www.datamechanics.co), a Cloud-Native Spark Platform for Data Engineers. We're a Y Combinator-backed startup. We strive to finally make Apache Spark as developer-friendly and cost-effective as it should be, by automating the infrastructure management side (autoscaling, automated sizing of containers, autotuning of Spark configurations) and building intuitive dashboards to help monitor your data pipelines. Prior to Data Mechanics, I was a software engineer at Databricks, where I led their Spark infrastructure team.

Episodes (243)

Implementing Data & Databases on K8s within the Dutch Government | DoKC Town Hall

Implementing Data & Databases on K8s within the Dutch Government. Presented by Sebastiaan Mannem, Director at Mannem Solutions. A small walkthrough of projects within the Dutch government running databas...

13 February 2024, 44 min

Unsticking Ourselves from Glue: Migrating PayIt’s Data Pipelines to Argo Workflows and Hera | DoKC Town Hall

Unsticking Ourselves from Glue: Migrating PayIt’s Data Pipelines to Argo Workflows and Hera. Presented by Matt Menzenski, Senior Software Engineering Manager, Payitgov. At PayIt, we’ve been deploying app...

6 February 2024, 23 min

Repel Boarders! How to find a Kubernetes operator that really protects your data | DoKC Town Hall

Repel Boarders! How to find a Kubernetes operator that really protects your data. Presented by Robert Hodges, Altinity. Operators are a godsend for managing data in Kubernetes. But how about protecting it...

30 January 2024, 19 min

DoK + Apache Spark | DoKC Town Hall

DoK + Apache Spark. Presented by Holden Karau, Spark Committer and Open Source Engineer at Netflix. In this brief talk, Holden will cover some of the best practices from trying to deploy both small and la...

23 January 2024, 19 min

DoK @ Comcast - Deliver Business Outcomes & Improved DevX with Data Services on K8s | DoKC Town Hall

DoK @ Comcast: Delivering Business Outcomes & Improved DevX with Data Services Running on Kubernetes. Presented by Greg Otto, Executive Director, DevX Platforms & Charles Ju, Principal Engineer Transfor...

3 January 2024, 16 min

DoK Talks - What is Kafka? The rise of one of the world's most used streaming data technologies // Abbey Russell

Abbey Russell, PM at Cockroach Labs, shared the backstory on how and why Kafka was created. Along the way, you'll learn about: - Who Franz Kafka was - Kafka's earliest use at LinkedIn in 2010 -...

9 March 2023, 15 min

DoK Talks - (almost) Everything you need to know about stateful cloud native network applications // W Watson

https://go.dok.community/slack https://dok.community/ https://youtu.be/KjiK6eXYO34 DoK Talk with W Watson, Founder at Vulk Co-op

2 March 2023, 43 min

The Outer Nerd #001 - Dungeons & Dragons - Why should you care? // Abhi Vaidyanatha, Fabian Met & Chase Christensen

https://dokcommunity.slack.com/ https://dok.community/ ABSTRACT OF THE TALK Fabian, Chris and Abhi will discuss their passion for roleplaying games, and what they can teach us about the power of ...

13 December 2022, 58 min