Apache Beam with Kenneth Knowles and Pablo Estrada

On the podcast this week, your hosts Stephanie Wong and Mark Mirchandani talk about the data processing tool Apache Beam with guests Pablo Estrada and Kenneth Knowles.

Kenn starts us off with an overview of how Apache Beam began and how Cloud Dataflow was involved. Beam's unified approach to batch and stream processing and its emphasis on correctness garnered support from developers early on and continue to attract users. Pablo helps us understand why Beam is a better option for certain projects that need to process large amounts of data. Our guests describe how Beam may be a better fit than microservices, which can become obsolete as company needs change.

Next, we step back and look at why combining batch and stream processing is considered the gold standard of data processing: it balances low latency with the ease of "being done" with data collection. A focus on correct data, and on processing that data correctly, is a core part of Beam. With good data, processing becomes easier, more reliable, and cheaper. Kenn gives examples of how things can go wrong when data is processed badly. Beam strives for the right combination of low latency, correct data, and affordability. Users can choose where to run Beam pipelines, from other Apache projects such as Flink and Spark to Dataflow, which gives them a lot of flexibility. Our guests talk about the pros and cons of some of these options, and we hear examples of how companies are using Beam along with supporting software to solve data processing challenges.

To get started with Beam, check out Beam College or attend Beam Summit 2022.

Kenneth Knowles

Kenn Knowles is chair of the Apache Beam Project Management Committee. Kenn has been working on Google Cloud Dataflow—Google's Beam backend—since 2014. Kenn holds a PhD in programming languages from the University of California, Santa Cruz.

Pablo Estrada

Pablo is a Software Engineer at Google and a member of the Apache Beam Project Management Committee. He enjoys working on open source and has worked across the entire Apache Beam stack.

Cool things of the week
  • Under the sea: Building the world's fiber optic internet (video)
  • Google Data Cloud Summit (site)
  • It's official—Google Distributed Cloud Edge is generally available (blog)
    • GCP Podcast Episode 228: Fastly with Tyler McMullen (podcast)
  • Save big by temporarily suspending unneeded Compute Engine VMs—now GA (blog)
Interview
  • Apache Beam (site)
  • Apache Beam Documentation (site)
  • Dataflow (site)
  • Apache Flink (site)
  • Apache Spark (site)
  • Apache Samza (site)
  • Apache Nemo (site)
  • Spanner (site)
  • BigQuery (site)
  • Beam College (site)
  • Beam College on GitHub (site)
  • Beam Developer Mailing List (email)
  • Beam User Mailing List (email)
  • Beam Summit (site)
What's something cool you're working on?

Mark is working on a new Apache Beam video series, Getting Started With Apache Beam.

Hosts

Stephanie Wong and Mark Mirchandani

