Datashim - a framework for declarative management of datasets on Kubernetes (DoK Day EU 2022) // Srikumar Venugopal

https://go.dok.community/slack

https://dok.community/

From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)


Many ML pipelines depend on shared filesystems for input, output and intermediate data storage. Standards such as CSI have made it possible for applications in Kubernetes to access a variety of data storage systems. Yet data scientists still have to deal with low-level details of data access in order to execute their pipelines in Kubernetes. Datashim is a framework that manages the lifecycle of a Dataset object, a CustomResourceDefinition that represents a source of data. Datashim takes care of the details of data access, while Kubernetes pods can declaratively access the data by referencing a Dataset in their specifications. This talk will describe Datashim and the Dataset object, discuss its use in ML pipelines, and demonstrate how its pluggable architecture is designed for the development of caching, scheduling and governance plugins. Datashim is an incubating project of the Linux Foundation Data and AI Foundation.


Srikumar Venugopal is a Research Scientist in IBM Research Europe in Dublin, Ireland. His research interests lie in the area of cloud computing and large-scale distributed systems, specifically in the topics of middleware, resource management, and scalability. He is the co-founder and current lead for the Datashim project.

Episodes (243)

#17 DoK community: Is k8s Even Ready For Data? Round II // Patrick McFadin & Jeffry Molanus

In our inaugural DOKC meet-up, Patrick McFadin, Developer Advocate at DataStax, emphasized the challenges of running Cassandra on Kubernetes, concluding at one point that “Kubernetes might not be ready ...

30 Nov 2020, 1h 20min

#16 DoK community: HyperStore-C: S3 object storage managed by Kubernetes // Gary Ogasawara

Cloudian’s HyperStore is S3-compatible object storage software focused on the enterprise market. In this talk, I'll discuss how and why we are working on Kubernetes-managed versions of HyperStore, in...

30 Nov 2020, 55min

DoK Season 1 Extras - #2 - El paso por Pivotal Cloud Foundry a Kubernetes // Alexander Herranz

In this episode, Alexander Herranz talks to us about where companies keep their data, through a comparison of OpenShift and Kubernetes. Some of the topics we cover: El paso por Pivota...

5 Nov 2020, 31min

#15 DoK community: Reaching limits in K8s: A case study with Ingress Controller // Laurent Rouquette

When talking about data, we usually think about big data and scale, and what do we do next. Such limits are sometimes a good problem to have. In this talk, we'll discuss our approach to this situation...

27 Oct 2020, 49min

#14 DoK community: Kubernetes Cost Control // Arie van den Bos

For our 14th installment of the Data on K8s community meetup, we talked with Cloud System Engineer / Architect Arie van den Bos. // Abstract: In this meetup, Arie discussed the following: The import...

26 Oct 2020, 51min

#13 DoK community: Distributed Workloads on Kubernetes Operators to the Rescue // Sebastien Guilloux

For our 13th installment of the Data on K8s meetup, we will be talking with Senior Software Engineer Sebastien Guilloux from Elastic about distributed workloads on K8s and how operators play a part i...

14 Oct 2020, 58min

#12 DoK community: PostgreSQL as a Service on K8s at Zalando // Alexander Kukushkin

// Abstract: PostgreSQL is a powerful, open-source object-relational database system with over 30 years of active development that has earned it a strong reputation for reliability, feature robustness...

8 Oct 2020, 57min

DoK Season 1 Extras - #1 - Is my data secure in K8s? // Asier Azaceta

Bart Farrell interviews Asier Azaceta, Cloud Security Architect in the IBM European Centre of Competence.

4 Oct 2020, 50min
