Apache Beam with Kenneth Knowles and Pablo Estrada

On the podcast this week, your hosts Stephanie Wong and Mark Mirchandani talk about the data processing tool Apache Beam with guests Pablo Estrada and Kenneth Knowles.

Kenn starts us off with an overview of how Apache Beam began and how Cloud Dataflow was involved. Beam's unified batch and stream model and its emphasis on correctness garnered support from developers early on and continue to attract users. Pablo helps us understand why Beam is a better option for certain projects that need to process large amounts of data. Our guests describe how Beam may be a better fit than microservices that could become obsolete as company needs change.

Next, we step back and take a look at why batch and stream is the gold standard of data processing because of its balance between low latency and ease of "being done" with data collection. Beam's focus on the correctness of data and correctness in processing that data is a core component. With good data, processing becomes easier, more reliable, and cheaper. Kenn gives examples of how things can go wrong with bad data processing. Beam strives for the perfect combination of low latency, correct data, and affordability. Users can choose where to run Beam pipelines, from other Apache software offerings to Dataflow, which means excellent flexibility. Our guests talk about the pros and cons of some of these options and we hear examples of how companies are using Beam along with supporting software to solve data processing challenges.

To get started with Beam, check out Beam College or attend Beam Summit 2022.

Kenneth Knowles

Kenn Knowles is chair of the Apache Beam Project Management Committee. Kenn has been working on Google Cloud Dataflow—Google's Beam backend—since 2014. Kenn holds a PhD in programming languages from the University of California, Santa Cruz.

Pablo Estrada

Pablo is a Software Engineer at Google and a management committee member for Apache Beam. Pablo is passionate about open source work and has worked all across the Apache Beam stack.

Cool things of the week
  • Under the sea: Building the world's fiber optic internet video
  • Google Data Cloud Summit site
  • It's official—Google Distributed Cloud Edge is generally available blog
    • GCP Podcast Episode 228: Fastly with Tyler McMullen podcast
  • Save big by temporarily suspending unneeded Compute Engine VMs—now GA blog

Interview
  • Apache Beam site
  • Apache Beam Documentation site
  • Dataflow site
  • Apache Flink site
  • Apache Spark site
  • Apache Samza site
  • Apache Nemo site
  • Spanner site
  • BigQuery site
  • Beam College site
  • Beam College on GitHub site
  • Beam Developer Mailing List email
  • Beam User Mailing List email
  • Beam Summit site

What's something cool you're working on?

Mark is working on a new Apache Beam video series, Getting Started With Apache Beam.

Hosts

Stephanie Wong and Mark Mirchandani
