#31 DoK Community: The Data Lifecycle - Where Do We Go From Here // Benjamin Rogojan. (Presenter: Bart Farrell)

#31 DoK Community: The Data Lifecycle - Where Do We Go From Here // Benjamin Rogojan. (Presenter: Bart Farrell)

Abstract of the talk…

Going from raw data to machine learning models successfully in companies of all sizes requires more than just an understanding of programming. Teams need to manage their data products lifecycle, their software as well as the data. Data products like machine learning models aren’t created out of thin air. They are built on layers of best practices that ensure the models are using accurate data, they are outputting reliable numbers and they have some method to interact with the outside world. So how do we get there? The purpose of this talk is to discuss the current state of the data lifecycle as it pertains to creating data products. This could be machine learning models, dashboards and data APIs. We will outline the general architecture that helps take data from raw to some form of machine learning model. In addition, we will discuss some of the concepts that are being applied from DevOps as well as being created in MLOps to help better facilitate your data life cycle.

Bio…

Ben has spent his career focused on all forms of data. He has focused on developing algorithms to detect fraud, reduce patient readmission and redesign insurance provider policy to help reduce the overall cost of healthcare. He has also worked in various industries including transportation, Big Tech, start-ups, insurance, Saas and more. In all of these industries he has helped companies develop their data strategy. Often starting from scratch to develop an end-to-end data solution. Ben privately consults on data science and engineering problems both solo with Seattle Data Guy as well as with a company called Acheron Analytics. He has experience both working hands-on with technical problems as well as helping leadership teams develop strategies to maximize their data.

Key take-aways from the talk…

- Creating successful data products and models requires more than just programming skills - Best practices from DevOps can help improve data science and ML models maintenance and lifecycle

Jaksot(243)

Highly Available Postgres Clusters In Kubernetes // John Long & Jonathan Gonzalez (DoK Day North America 2022)

Highly Available Postgres Clusters In Kubernetes // John Long & Jonathan Gonzalez (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT A practical session about running Highly Available PostgreSQL in Kubernetes. The primary objective will be to demonstr...

2 Marras 202215min

Inter-Cluster PostreSQL on Kubernetes // Julian Fischer (DoK Day North America 2022)

Inter-Cluster PostreSQL on Kubernetes // Julian Fischer (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT In this talk you’ll explore how to run a PostgreSQL cluster across multiple Kubernetes clusters. Learn what challenges ar...

2 Marras 202217min

Open Source Databases on Kubernetes- Best Practices // Peter Zaitsev (DoK Day North America 2022)

Open Source Databases on Kubernetes- Best Practices // Peter Zaitsev (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT So you’re looking to run your Open Source Database on Kubernetes. What best practices should you follow and what pitfall...

2 Marras 202216min

The Kubernetes Native Database // Jeffrey Carpenter (DoK Day North America 2022)

The Kubernetes Native Database // Jeffrey Carpenter (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT In the software industry we’re fond of terms that define major trends, like “cloud native”, “Kubernetes native” and “server...

2 Marras 202216min

Databases on Kubernetes: Why are they important? // With Bhavin Shah, Xing Yang, Gabriele Bartolini & Patrick McFadin (DoK Day North America 2022)

Databases on Kubernetes: Why are they important? // With Bhavin Shah, Xing Yang, Gabriele Bartolini & Patrick McFadin (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT Kubernetes has crossed the chasm, but what about stateful applications and databases? Join us for this panel discussion and...

2 Marras 202234min

Data streaming on Kubernetes // Yaniv Ben Hemo (DoK Day North America 2022)

Data streaming on Kubernetes // Yaniv Ben Hemo (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT I will cover what is the current data streaming on k8s landscape, why it is important, use cases, and what are the challeng...

2 Marras 202213min

Architecting Your First Event Driven Serverless Streaming Applications on K8 // Timothy Spann (DoK Day North America 2022)

Architecting Your First Event Driven Serverless Streaming Applications on K8 // Timothy Spann (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT Once you have built a topic in Apache Pulsar, you will quickly see the need to build event-driven applications. This can r...

2 Marras 202213min

Fybrik - A Kubernetes based platform for governed data use // Flora Gilboa-Solomon, Alexey Roytman, Maryna Strelchuk & Barry Hijkoop (DoK Day North America 2022)

Fybrik - A Kubernetes based platform for governed data use // Flora Gilboa-Solomon, Alexey Roytman, Maryna Strelchuk & Barry Hijkoop (DoK Day North America 2022)

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT Data is the foundation for business value. However, in many enterprises, it is spread across different data stores, public...

1 Marras 202220min