Datashim - a framework for declarative management of datasets on Kubernetes (DoK Day EU 2022) // Srikumar Venugopal

https://go.dok.community/slack

https://dok.community/

From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)


Many ML pipelines depend on shared filesystems for input, output, and intermediate data storage. Standards such as CSI have made it possible for applications in Kubernetes to access a variety of data storage systems. Yet data scientists still have to deal with low-level details of data access in order to run their pipelines in Kubernetes. Datashim is a framework that manages the lifecycle of a Dataset object, a CustomResourceDefinition that represents a source of data. Datashim takes care of the details of data access, while Kubernetes pods access the data declaratively by referencing a Dataset in their specifications. This talk will describe Datashim and the Dataset object, discuss its use in ML pipelines, and demonstrate how its pluggable architecture is designed for the development of caching, scheduling, and governance plugins. Datashim is an incubating project of the LF AI & Data Foundation.
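
To make the declarative idea concrete, here is a minimal sketch in Python that builds two manifests: a Dataset pointing at an S3-compatible bucket, and a pod that refers to it by name only. The API group (`datashim.io/v1alpha1`), the `spec.local` field names, and the `dataset.0.id` / `dataset.0.useas` pod labels are assumptions based on the project's public examples and may differ between Datashim releases; the bucket, endpoint, credentials, image, and helper function names are hypothetical placeholders.

```python
import json

# Assumed API group/version; verify against the installed Datashim release.
DATASET_API_VERSION = "datashim.io/v1alpha1"


def make_dataset(name, bucket, endpoint, access_key, secret_key):
    """Build a Dataset manifest for an S3-compatible bucket (assumed field names)."""
    return {
        "apiVersion": DATASET_API_VERSION,
        "kind": "Dataset",
        "metadata": {"name": name},
        "spec": {
            "local": {
                "type": "COS",
                "bucket": bucket,
                "endpoint": endpoint,
                "accessKeyID": access_key,
                "secretAccessKey": secret_key,
            }
        },
    }


def make_pod(name, image, dataset_name):
    """Build a Pod manifest that references the Dataset through labels only."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Assumed label convention: the pod names the Dataset and asks
            # for it to be mounted; it carries no storage details itself.
            "labels": {
                "dataset.0.id": dataset_name,
                "dataset.0.useas": "mount",
            },
        },
        "spec": {"containers": [{"name": name, "image": image}]},
    }


# Hypothetical placeholder values, for illustration only.
dataset = make_dataset(
    "example-dataset", "my-bucket", "https://s3.example.com",
    "ACCESS_KEY", "SECRET_KEY",
)
pod = make_pod("trainer", "python:3.11", "example-dataset")

# Print each manifest as JSON, one document per manifest.
for manifest in (dataset, pod):
    print(json.dumps(manifest, indent=2))
```

The point of the sketch is the separation of concerns: the endpoint and credentials live only in the Dataset, while the pod specification contains nothing but a reference to the Dataset's name.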


Srikumar Venugopal is a Research Scientist at IBM Research Europe in Dublin, Ireland. His research interests lie in cloud computing and large-scale distributed systems, specifically middleware, resource management, and scalability. He is the co-founder and current lead of the Datashim project.
