Datashim - a framework for declarative management of datasets on Kubernetes (DoK Day EU 2022) // Srikumar Venugopal

https://go.dok.community/slack

https://dok.community/

From DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)


Many ML pipelines depend on shared filesystems for input, output and intermediate data storage. Standards such as CSI have made it possible for applications in Kubernetes to access a variety of data storage systems. Yet data scientists still have to deal with low-level details of data access in order to run their pipelines in Kubernetes. Datashim is a framework that manages the lifecycle of a Dataset object, a CustomResourceDefinition that represents a source of data. Datashim takes care of the details of data access, while Kubernetes pods access the data declaratively by referencing a Dataset in their specifications. This talk will describe Datashim and the Dataset object, discuss its use in ML pipelines, and demonstrate how its pluggable architecture supports the development of caching, scheduling and governance plugins. Datashim is an incubating project of the Linux Foundation Data and AI Foundation.
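To make the declarative idea concrete, here is a minimal sketch in Python that builds the two manifests involved: a Dataset custom resource and a pod that references it. The API group, the `spec.local` field names, the `dataset.0.*` label convention, and every concrete value (names, bucket, endpoint, secret, image) are assumptions for illustration and may differ between Datashim releases; check the project documentation before using them.

```python
import json


def dataset_manifest(name, bucket, endpoint):
    """Build a Dataset custom resource as a plain dict.

    The apiVersion and the spec.local field names are assumptions,
    not confirmed by the talk abstract.
    """
    return {
        "apiVersion": "datashim.io/v1alpha1",  # assumed API group/version
        "kind": "Dataset",
        "metadata": {"name": name},
        "spec": {
            "local": {
                "type": "COS",                         # assumed: S3-compatible object storage
                "endpoint": endpoint,
                "bucket": bucket,
                "secret-name": "example-credentials",  # hypothetical Secret holding the keys
            }
        },
    }


def pod_manifest(pod_name, dataset_name, image):
    """Build a Pod that refers to a Dataset by name only.

    The pod carries no storage details; the labels below (an assumed
    convention) are the declarative reference to the Dataset.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": pod_name,
            "labels": {
                "dataset.0.id": dataset_name,  # which Dataset to attach
                "dataset.0.useas": "mount",    # expose it as a mounted volume
            },
        },
        "spec": {"containers": [{"name": "main", "image": image}]},
    }


# Hypothetical values, for illustration only.
dataset = dataset_manifest("example-dataset", "example-bucket", "https://s3.example.com")
pod = pod_manifest("trainer", "example-dataset", "example.com/ml/trainer:latest")

# kubectl accepts JSON as well as YAML, so these could be written to files and applied.
print(json.dumps(dataset, indent=2))
print(json.dumps(pod, indent=2))
```

The point of the sketch is the separation of concerns: the endpoint, bucket and credentials live only in the Dataset, and the pod names the Dataset without knowing how the data is reached.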


Srikumar Venugopal is a Research Scientist at IBM Research Europe in Dublin, Ireland. His research interests are cloud computing and large-scale distributed systems, specifically middleware, resource management, and scalability. He is the co-founder and current lead of the Datashim project.

Episodes (243)

DoK Talks #151 - Analytics with Apache Superset and ClickHouse // Vijay Anand Ramakrishnan

https://go.dok.community/slack https://dok.community With: Vijay Anand Ramakrishnan - Database Administrator, ChistaDATA Bart Farrell - Head of Community, Data on Kubernetes Community ABSTRACT OF TH...

23 Sep 2022, 33 min

DoK Talks #150 - Building a Simple Postgres Async Streaming Cluster // Julian Fischer

https://go.dok.community/slack https://dok.community With: Julian Fischer - CEO, anynines GmbH Bart Farrell - Head of Community, Data on Kubernetes Community ABSTRACT OF THE TALK In this talk you wi...

23 Sep 2022, 1 h 4 min

DoK Talks #149 - Overcoming challenges with protecting and migrating data in multi-cloud K8s environments // Sebastian Glab & Martin Phan

https://go.dok.community/slack https://dok.community/ With: Sebastian Glab - Cloud Architect, CloudCasa by Catalogic Martin Phan - Field CTO – Americas, CloudCasa by Catalogic Bart Farrell - Head...

16 Sep 2022, 47 min

DoK Talks #147 - Evaluating Cloud Native Storage Vendors // Dinesh Majrekar

https://go.dok.community/slack https://dok.community/ With: Dinesh Majrekar - CTO, Civo Bart Farrell - Head of Community, Data on Kubernetes Community ABSTRACT OF THE TALK In a continuation of ...

5 Sep 2022, 1 h

DoK Talks #146 - OpenFeature - Making feature flags a commodity // Oleg Nenashev

https://go.dok.community/slack https://dok.community/ With: Oleg Nenashev - Community Builder and Developer Advocate, Dynatrace Bart Farrell - Head of Community, Data on Kubernetes Community AB...

26 Aug 2022, 1 h 1 min

DoK Talks #145 - Making Hard Things Easy is Hard // Kurt Rinehart

https://go.dok.community/slack https://dok.community/ https://youtu.be/6eSWOUzCb4w With: Kurt Rinehart - Director of Information Engineering, Section Bart Farrell - Head of Community, Data on Kube...

19 Aug 2022, 57 min

DoK Talks #144 - We will DoK You! - The journey to adopt stateful workloads on K8s // Guy Menahem

https://go.dok.community/slack https://dok.community/ https://youtu.be/AjvwG53yLMY With: Guy Menahem - Solution Architect, Komodor Bart Farrell - Head of Community, Data on Kubernetes Community A...

18 Aug 2022, 1 h 6 min

DoK Talks #142 - Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your Stateful Workload // Peter Schuurman

https://go.dok.community/slack https://dok.community/ ABSTRACT OF THE TALK How do you make sure your Stateful Workloads remain available when your Kubernetes infrastructure updates? This talk wil...

18 Aug 2022, 58 min
