Datashim - a framework for declarative management of datasets on Kubernetes (DoK Day EU 2022) // Srikumar Venugopal

https://go.dok.community/slack

https://dok.community/

From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)


Many ML pipelines depend on shared filesystems for input, output and intermediate data storage. Standards such as CSI have made it possible for applications in Kubernetes to access a variety of data storage systems. Yet data scientists still have to deal with low-level details of data access in order to execute their pipelines in Kubernetes. Datashim is a framework that manages the lifecycle of a Dataset object, a CustomResourceDefinition that represents a source of data. Datashim takes care of the details of data access, while Kubernetes pods access the data declaratively by referencing a Dataset in their specifications. This talk will describe Datashim and the Dataset object, discuss its use in ML pipelines, and demonstrate how its pluggable architecture is designed for the development of caching, scheduling and governance plugins. Datashim is an incubating project of the LF AI & Data Foundation.
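
As a rough illustration of the declarative access described above, the sketch below builds a Dataset resource and a pod that references it by label, then prints both as JSON that could be passed to `kubectl apply -f -`. None of the field names come from the abstract itself: the `datashim.io/v1alpha1` API version, the `spec.local` fields and the `dataset.0.id` / `dataset.0.useas` labels are assumptions based on the project's public README and may differ between Datashim releases. The bucket, endpoint and credential values are hypothetical placeholders.

```python
import json


def dataset_manifest(name, bucket, endpoint, access_key, secret_key):
    """Build a Dataset custom resource as a plain dict.

    Field names are assumed from the Datashim README, not from the talk.
    """
    return {
        "apiVersion": "datashim.io/v1alpha1",  # assumed API group/version
        "kind": "Dataset",
        "metadata": {"name": name},
        "spec": {
            "local": {
                "type": "COS",  # assumed type name for S3-compatible object storage
                "accessKeyID": access_key,
                "secretAccessKey": secret_key,
                "endpoint": endpoint,
                "bucket": bucket,
            }
        },
    }


def pod_manifest(pod_name, image, dataset_name):
    """Build a pod that refers to a Dataset only through labels (assumed names)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": pod_name,
            "labels": {
                "dataset.0.id": dataset_name,  # which Dataset to attach
                "dataset.0.useas": "mount",    # expose it as a mounted volume
            },
        },
        "spec": {"containers": [{"name": pod_name, "image": image}]},
    }


# Hypothetical placeholder values, for illustration only.
dataset = dataset_manifest(
    name="example-dataset",
    bucket="example-bucket",
    endpoint="https://s3.example.invalid",
    access_key="ACCESS_KEY_PLACEHOLDER",
    secret_key="SECRET_KEY_PLACEHOLDER",
)
pod = pod_manifest("trainer", "python:3.11", "example-dataset")

print(json.dumps(dataset, indent=2))
print(json.dumps(pod, indent=2))
```

The point of the sketch is that the pod specification carries no bucket, endpoint or credential details. Under these assumptions it names only the Dataset, and the framework is expected to resolve the rest.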


Srikumar Venugopal is a Research Scientist at IBM Research Europe in Dublin, Ireland. His research interests lie in cloud computing and large-scale distributed systems, specifically middleware, resource management, and scalability. He is the co-founder and current lead of the Datashim project.
