Datashim - a framework for declarative management of datasets on Kubernetes (DoK Day EU 2022) // Srikumar Venugopal

https://go.dok.community/slack

https://dok.community/

From DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)


Many ML pipelines depend on shared filesystems for input, output and intermediate data storage. Standards such as CSI have made it possible for applications in Kubernetes to access a variety of data storage systems. Yet data scientists still have to deal with low-level details of data access in order to execute their pipelines in Kubernetes. Datashim is a framework that manages the lifecycle of a Dataset object, a CustomResourceDefinition that represents a source of data. Datashim takes care of the details of data access, so Kubernetes pods can access the data declaratively by referencing a Dataset in their specifications. This talk will describe Datashim and the Dataset object, discuss its use in ML pipelines, and demonstrate how its pluggable architecture is designed for the development of caching, scheduling and governance plugins. Datashim is an incubating project of the Linux Foundation Data and AI Foundation.


Srikumar Venugopal is a Research Scientist at IBM Research Europe in Dublin, Ireland. His research interests lie in cloud computing and large-scale distributed systems, specifically middleware, resource management, and scalability. He is the co-founder and current lead of the Datashim project.
