#31 DoK Community: The Data Lifecycle - Where Do We Go From Here // Benjamin Rogojan. (Presenter: Bart Farrell)

Abstract of the talk…

Going from raw data to machine learning models successfully, in companies of all sizes, requires more than an understanding of programming. Teams need to manage the lifecycle of their data products: the software as well as the data. Data products like machine learning models aren’t created out of thin air. They are built on layers of best practices that ensure the models use accurate data, output reliable numbers, and have some way to interact with the outside world. So how do we get there? The purpose of this talk is to discuss the current state of the data lifecycle as it pertains to creating data products, such as machine learning models, dashboards, and data APIs. We will outline the general architecture that helps take data from raw to some form of machine learning model. In addition, we will discuss some of the concepts that are being applied from DevOps, as well as those being created in MLOps, to better facilitate your data lifecycle.

Bio…

Ben has spent his career focused on all forms of data. He has developed algorithms to detect fraud, reduce patient readmission, and redesign insurance provider policy to help reduce the overall cost of healthcare. He has also worked in various industries, including transportation, Big Tech, start-ups, insurance, SaaS, and more. In all of these industries he has helped companies develop their data strategy, often starting from scratch to build an end-to-end data solution. Ben privately consults on data science and engineering problems, both solo with Seattle Data Guy and with a company called Acheron Analytics. He has experience both working hands-on with technical problems and helping leadership teams develop strategies to maximize the value of their data.

Key take-aways from the talk…

- Creating successful data products and models requires more than just programming skills
- Best practices from DevOps can help improve the maintenance and lifecycle of data science and ML models
