DoK Talks #99 - ETL/ELT on Kubernetes with Airbyte: K8s Development Insights // Abhi Vaidyanatha

https://go.dok.community/slack
https://dok.community/

ABSTRACT OF THE TALK

ETL/ELT on Kubernetes is currently an unsolved problem. There are a lot of different approaches vying for a spot as the de facto method, but none are clear winners. Considering that the cloud-native landscape is built for deploying Dockerized, open-source software, many of the closed-source solutions fall flat and don't mesh with the trajectory of the community.

Airbyte is an open-source ETL/ELT tool that harmonizes well with the cloud-native landscape and lives to enable your stateful workloads on Kubernetes. Previously, I talked about a theoretical deployment on Kubernetes and the nuances of deploying an ETL/ELT pipeline in such an environment. Now, I'm looking to follow that up with how we actually implemented that strategy as we launched our K8s beta. Additionally, I'll dive into some of the nitty-gritty details that we needed to figure out in order to get this all working... stuff that isn't really found online!

Overall, this is a rare chance to do a retrospective on what we planned our architecture to look like, and to follow up with development insights from solidifying the final implementation.

KEY TAKE-AWAYS FROM THE TALK

- Quick overview of Airbyte and open-source ETL/ELT [5 minutes]
- Why run your ETL/ELT in K8s? [3 minutes]
- A quick recap on the previous talk (what we thought the architecture would look like) [5 minutes]
- Display the actual architecture and implementation [10 minutes]
-> Talk about how to communicate with K8s pods on STDOUT and STDIN pipes
-> Describe parent-child process termination strategy
-> Describe persistence layer/strategy and config storage
- Quick demo of an Airbyte deployment on K8s [10 minutes]
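
As a rough illustration of the STDOUT/STDIN and parent-child termination items above, here is a minimal sketch in Python. It is not Airbyte's implementation: two local subprocesses stand in for a source pod and a destination pod, and the names SOURCE_CODE, DEST_CODE, and run_pipeline are hypothetical. How the talk actually bridges pipes between pods is not shown here.

```python
import subprocess
import sys

# Hypothetical stand-ins for a source and a destination connector.
SOURCE_CODE = "for i in range(3): print(f'record-{i}')"
DEST_CODE = "import sys\nfor line in sys.stdin: print('stored ' + line.strip())"


def run_pipeline():
    """Stream the source's STDOUT into the destination's STDIN.

    Returns the lines the destination wrote to its own STDOUT.
    """
    source = subprocess.Popen(
        [sys.executable, "-c", SOURCE_CODE],
        stdout=subprocess.PIPE,
        text=True,
    )
    dest = subprocess.Popen(
        [sys.executable, "-c", DEST_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        # The parent relays each record from one child to the other.
        for line in source.stdout:
            dest.stdin.write(line)
        dest.stdin.close()  # signals end-of-input to the destination
        output = dest.stdout.read().splitlines()
        source.wait(timeout=10)
        dest.wait(timeout=10)
        return output
    finally:
        # Termination strategy for this sketch: the parent never leaves
        # a child running, even if relaying failed part-way through.
        for child in (source, dest):
            if child.poll() is None:
                child.kill()
                child.wait()
        source.stdout.close()
        dest.stdout.close()


if __name__ == "__main__":
    for stored in run_pipeline():
        print(stored)
```

The point of the sketch is that the parent owns both pipes and both lifetimes, so a failure on either side can be turned into a clean shutdown of the other. Direct relaying like this suits only small volumes of data, since a real pipeline would have to handle backpressure.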

BIO

Abhi is a confused economist who enjoys writing backend code for data management software. He now spends most of his time doing developer relations in the data integration space, where he looks to evangelize open source technologies. In his spare time he is a DJ, drummer, and competitive Super Smash Bros. player. He is a staunch advocate of proper semicolon usage, Oxford commas and overused grammar jokes.

Episodes (243)

DoK Talks #155 - Databases at the edge with K3s and ARM devices // Sergio Méndez

https://go.dok.community/slack https://dok.community/ https://youtu.be/KjiK6eXYO34 ABSTRACT OF THE TALK In this talk Sergio is going to present different ways to store data at the edge using diff...

29 November 2022, 49 min

DoK Talks #154 - StatefulSets in K8 // Srinivas Karnati

https://go.dok.community/slack https://dok.community/ Link: https://youtu.be/n_thXwyJNSU ABSTRACT OF THE TALK Deploying Stateless applications is easy but this is not the case for Stateful applica...

23 November 2022, 31 min

Data-driven Diversity, Equity, and Inclusion // Lisa-Marie Namphy, Melissa Logan, Tiffany Jachja, Audra Montenegro & Cortney Nickerson (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY)

2 November 2022, 19 min

Formula 1 telemetry processing using Apache Kafka on Kubernetes // Paolo Patierno (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) Video - https://youtu.be/4cPVRWOK-_E ABSTRACT Apache Kafka is the de facto data streaming platform used for ingesting vast amounts...

2 November 2022, 15 min

Choosing Kubernetes for Stateful Applications // Akshay Ram & Peter Schuurman (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) Video - https://youtu.be/Y4tdy9lctEI ABSTRACT Learn how customers are increasingly deploying stateful applications on Kubernetes t...

2 November 2022, 18 min

Kubernetes 360º - Data driven observability - from Secrets to logs // Ben Hirschberg (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) Video - https://youtu.be/A1ch4AhKoeQ ABSTRACT If there’s one thing that everyone can agree on - it’s that the sheer scale and comple...

2 November 2022, 17 min

Shifting Left Stateful Applications In Kubernetes // Viktor Farcic (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) Video - https://youtu.be/LymPjH6HA3E ABSTRACT Stateless apps are easy to manage. More often than not, a Kubernetes Deployment, with ...

2 November 2022, 15 min

Medical - Healthcare Data on Kubernetes // Olyvia Rakshit & Prasad Dorbala (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT Healthcare organizations are transforming their applications and embracing digital platforms for efficient patient care....

2 November 2022, 13 min