DoK Community #42 Spark on Kubernetes is Now Generally Available: Why & How to Migrate to It // Jean-Yves Stephan

Abstract of the talk…

Apache Spark has run natively on Kubernetes (instead of Hadoop YARN) since 2018, but only with Spark 3.1 (released in March 2021) did the integration become officially generally available and production-ready. What is the high-level architecture of Spark on Kubernetes, how does it compare to the alternatives, and what does the migration look like? These are some of the questions we will answer together. We will first introduce the core concepts, then go through the stories of customers who migrated, and then give you concrete technical tips to help you succeed with Spark on Kubernetes. If time permits, I may do a risky live demo. This will be a technical talk with very fresh content, and I hope you will like it. I plan to keep it short enough to leave room for Q&A and improvisation based on your requests, so let me know if there's something specific you're interested in.

Bio…

I'm one of the co-founders of Data Mechanics (https://www.datamechanics.co), a cloud-native Spark platform for data engineers. We're a Y Combinator-backed startup. We strive to make Apache Spark as developer-friendly and cost-effective as it should be by automating infrastructure management (autoscaling, automated sizing of containers, autotuning of Spark configurations) and building intuitive dashboards to help you monitor your data pipelines. Prior to Data Mechanics, I was a software engineer at Databricks, where I led the Spark infrastructure team.

Episodes (243)

The Challenges of Data Processing On Kubernetes - A look at Spark, Flink, Dask, and Ray // Holden Karau (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT This talk will go through both the improvements that have been made in Kubernetes for batch analytic workloads as well as ...

31 Oct 2022, 20 min

Scaling our SaaS offering to thousands of clusters // Dax McDonald (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT Sourcegraph is a code intelligence platform that helps our customers to understand their code better. As we have scaled up...

29 Oct 2022, 21 min

Why we decided to migrate our Jaeger storage to ClickHouse on Kubernetes // Arul Jegadish Francis (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) Abstract We at OpsVerse provide a DevOps tools platform with fully-managed open source-based tools. One of our key offerings is a ...

28 Oct 2022, 13 min

Building a Digital Factory for the Sheet Metal Industry // Elie Assi (From the DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) Abstract We develop systems to digitize the sheet metal industry with the belief that they should cooperate with each other in an ...

27 Oct 2022, 20 min

How we built our Big Data Stack (almost) entirely on top of Kubernetes // Neylson Crepalde (From DoK Day NA 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) Abstract Working with Terabytes of data is a major challenge for organizations both in terms of architecture and cost. In recent ...

26 Oct 2022, 16 min

Dok Talks #153 - CRD Panel // Eyar Zilberman & Álvaro Hernández

https://go.dok.community/slack https://dok.community We are going to speak about CRDs and discuss treating them as higher-level entities than we normally consider them. CRDs normally are kind of a...

14 Oct 2022, 58 min

Dok #152 - Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG // Gabriele Bartolini

https://go.dok.community/slack https://dok.community With: Gabriele Bartolini - Vice President/CTO of Cloud Native and Kubernetes, EDB Bart Farrell - Head of Community, Data on Kubernetes Community...

28 Sep 2022, 1 h 3 min

Dok Talks #148 - Cost and Kubernetes // Chris Love

https://go.dok.community/slack https://dok.community With: Chris Love - Managing Partner, LionKube Bart Farrell - Head of Community, Data on Kubernetes Community ABSTRACT OF THE TALK Using Kuberne...

27 Sep 2022, 45 min
