#37 DoK Community: Running Data Replication Pipelines on Kubernetes with Argo // Stephen Bailey

Abstract of the talk…

Hundreds of data teams have migrated to the ELT pattern in recent years, leveraging SaaS tools like Stitch or Fivetran to reliably load data into their infrastructure. These SaaS offerings are outstanding and can significantly shorten your time to production. However, many teams prefer to roll their own tools. One solution in these cases is to deploy singer.io taps and targets: Python scripts that perform data replication between arbitrary sources and destinations. The Singer specification is the foundation for the popular Stitch SaaS, and it is also used by a number of independent consultants and data projects. Singer pipelines are highly modular: you can pipe any tap to any target to build a data pipeline that fits your needs, which makes them a good fit for containerized workflows. This article walks through the workflow at a high level and provides some example code to get up and running with some shared templates. I also drill into the reasons for choosing the Argo approach over other orchestration tools like Airflow or Dagster, and the implications from a team perspective.
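
To make the "pipe any tap to any target" idea concrete, here is a minimal sketch of the message flow, not code from the talk. It assumes the Singer convention of one JSON message per line with SCHEMA, RECORD, and STATE message types; the `toy_tap`, `toy_target`, and `run_pipeline` functions and the `users` stream are hypothetical names invented for illustration, and an in-memory buffer stands in for the shell pipe between two processes.

```python
import io
import json


def toy_tap(rows, out):
    """Hypothetical tap: write Singer-style messages, one JSON object per line."""
    schema_message = {
        "type": "SCHEMA",
        "stream": "users",  # illustrative stream name
        "key_properties": ["id"],
        "schema": {
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
            }
        },
    }
    out.write(json.dumps(schema_message) + "\n")
    for row in rows:
        record_message = {"type": "RECORD", "stream": "users", "record": row}
        out.write(json.dumps(record_message) + "\n")
    # A STATE message carries a bookmark so a later run can resume.
    out.write(json.dumps({"type": "STATE", "value": {"users": len(rows)}}) + "\n")


def toy_target(inp):
    """Hypothetical target: collect records per stream and keep the last state."""
    records = {}
    state = None
    for line in inp:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message["type"] == "RECORD":
            records.setdefault(message["stream"], []).append(message["record"])
        elif message["type"] == "STATE":
            state = message["value"]
    return records, state


def run_pipeline(rows):
    """Connect tap to target through a buffer, standing in for `tap | target`."""
    buffer = io.StringIO()
    toy_tap(rows, buffer)
    buffer.seek(0)
    return toy_target(buffer)


if __name__ == "__main__":
    loaded, bookmark = run_pipeline([{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}])
    print(loaded, bookmark)
```

Because the tap and target only share a line-oriented text stream, each side could plausibly run as its own container step, which is the property the abstract points to when it calls Singer pipelines a good fit for containerized workflows.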

Bio…

Stephen Bailey is Director of Growth Analytics at Immuta, where he strives to implement privacy best practices while delivering business value from data. He loves to teach and learn, on just about any subject. He holds a PhD in educational cognitive neuroscience from Vanderbilt and enjoys reading philosophy.

Episodes (243)

Dok Talks #151 - Analytics with Apache Superset and ClickHouse // Vijay Anand Ramakrishnan

https://go.dok.community/slack https://dok.community With: Vijay Anand Ramakrishnan - Database Administrator, ChistaDATA Bart Farrell - Head of Community, Data on Kubernetes Community ABSTRACT OF TH...

23 Sep 2022, 33min

Dok Talks #150 - Building a Simple Postgres Async Streaming Cluster // Julian Fischer

https://go.dok.community/slack https://dok.community With: Julian Fischer - CEO, anynines GmbH Bart Farrell - Head of Community, Data on Kubernetes Community ABSTRACT OF THE TALK In this talk you wi...

23 Sep 2022, 1h 4min

DoK Talks #149 - Overcoming challenges with protecting and migrating data in multi-cloud K8s environments // Sebastian Glab & Martin Phan

https://go.dok.community/slack https://dok.community/ With: Sebastian Glab - Cloud Architect, CloudCasa by Catalogic Martin Phan - Field CTO – Americas, CloudCasa by Catalogic Bart Farrell - Head...

16 Sep 2022, 47min

DoK Talks #147 - Evaluating Cloud Native Storage Vendors // Dinesh Majrekar

https://go.dok.community/slack https://dok.community/ With: Dinesh Majrekar - CTO, Civo Bart Farrell - Head of Community, Data on Kubernetes Community ABSTRACT OF THE TALK In a continuation of ...

5 Sep 2022, 1h

Dok Talks #146 - OpenFeature - Making feature flags a commodity // Oleg Nenashev

https://go.dok.community/slack https://dok.community/ With: Oleg Nenashev - Community Builder and Developer Advocate, Dynatrace Bart Farrell - Head of Community, Data on Kubernetes Community AB...

26 Aug 2022, 1h 1min

DoK Talks #145 - Making Hard Things Easy is Hard // Kurt Rinehart

https://go.dok.community/slack https://dok.community/ https://youtu.be/6eSWOUzCb4w With: Kurt Rinehart - Director of Information Engineering, Section Bart Farrell - Head of Community, Data on Kube...

19 Aug 2022, 57min

DoK Talks #144 - We will Dok You! - The journey to adopt stateful workloads on k8s // Guy Menahem

https://go.dok.community/slack https://dok.community/ https://youtu.be/AjvwG53yLMY With: Guy Menahem - Solution Architect, Komodor Bart Farrell - Head of Community, Data on Kubernetes Community A...

18 Aug 2022, 1h 6min

DoK Talks #142 - Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your Stateful Workload // Peter Schuurman

https://go.dok.community/slack https://dok.community/ ABSTRACT OF THE TALK How do you make sure your Stateful Workloads remain available when your Kubernetes infrastructure updates? This talk wil...

18 Aug 2022, 58min