Rack-scale Networking

Bryan and Adam are joined by a number of members of the Oxide networking team to talk about the networking software that drives the Oxide rack. It turns out that rack-scale networking is hard... and has enormous benefits!


We've been hosting a live show weekly on Mondays at 5 p.m. for about an hour, and recording them all; here is the recording from February 27th, 2023.

In addition to Bryan Cantrill and Adam Leventhal, speakers included Ryan Goodfellow, Levon Tarver, Ben Naecker, and Arjen Roodselaar.

Links

Here's (much of) the live chat from the show:

  • ahl https://github.com/oxidecomputer/oxide-and-friends/blob/master/2021_11_29.md
  • ahl That's the Sidecar switch episode
  • bcantrill https://p4.org/
  • admchl What does "at line rate" mean?
  • Riking Line rate = As fast as the packets could possibly come. 1Gbit, 10Gbit, 100Gbit, etc
  • admchl Do you need ASICs to hit that speed? I assume x86_64 is not going to be fast enough for these specialised operations?
  • levon Yes, the Tofino 2 is the ASIC
  • bcantrill You need ASICs
  • bnaecker Yes, you really can't do these kinds of operations on a general purpose CPU.
  • rng_drizzt Yeah, you need specialized silicon here.
  • JustinAzoff Right, also often across all ports at the same time in both directions. a 48 port 10gbps switch will have a line rate of 960gbps (10 * 48 * 2)
  • duckman So the advantage is being able to offload compute to the switch?
  • bnaecker Yes, and specifically that you can separate the data plane (operations on the packets) from the control plane (decisions about what operations to allow or make).
  • tahnok What's TCAM?
  • levon Ternary Content Addressable Memory
  • bnaecker https://en.wikipedia.org/wiki/Content-addressable_memory#Ternary_CAMs
  • ryaeng Sure beats logging into a number of Cisco switches and making changes at the console.
  • admchl This is my favourite episode in a long time, this is all really fascinating.
  • rng_drizzt the first Sidecar episode was nearly 1.5 years ago 🤯 , right after we cut the first rev
  • levon That episode blew my mind
  • duckman This sounds like a big deal on the scale of ebpf
  • duckman Or bigger
  • bnaecker It is extremely useful for understanding the processing pipelines. As long as you only run single-packet integration tests 🙂
  • od0 just want to go out and find things to write P4 code for
  • JustinAzoff yeah one way to think about that sort of thing is that xdp can be used to run little programs on a nic, where p4 is kind of like that, but running on effectively a nic with 48+ ports
  • bcantrill https://github.com/oxidecomputer/p4
  • SyntheticGate sidecar is the "codename" of our switch box
  • SyntheticGate "gimlet" is our server sled
  • bcantrill https://github.com/oxidecomputer/propolis
  • wmf So you have P4 and OPTE in the hypervisor at the same time?
  • bnaecker OPTE is in the host kernel.
  • arjenroodselaar The P4 runtime Ry described only exists in the test bed, where it simulates the switches at a high level. OPTE is part of the production environment.
  • arjenroodselaar The rough difference between P4 and OPTE is that P4 works on individual packets without much concept of a session (so it can't reason about TCP streams, packet order, etc., so no firewall-like functionality), while OPTE aims to operate on streams of packets.
  • JustinAzoff So you can run 100 VMs on a test system and wire them up to your virtual switch compiled by x4c?
  • arjenroodselaar Correct.
  • bcantrill OPTE == Oxide Packet Transformation Engine
  • admchl Gimlet?
  • rng_drizzt Compute server
  • rng_drizzt The Sidecar switch is actually just a PCIe peripheral to a Gimlet.
  • bnaecker The Gimlet managing the Sidecar is often called a "Scrimlet" for "Sidecar attached Gimlet"
  • Riking and "how do i reconfigure this giant network without hosing my ability to reconfigure this giant network"
  • ShaunO can identify with that - we seriously struggle to keep our own products inter-operating, let alone anyone else's
  • levon It can feel like a Sisyphean task.
  • a172 Set up a much smaller/simpler network in parallel that is accessible from "not your network" that gets you to the management interface.
  • levon It's a whole new world when you can look at the actual table definitions in P4
  • rng_drizzt Owning all the layers here is immensely beneficial
  • levon Those DTrace probes have been very helpful
  • bnaecker Those probes turned out to be everywhere. They are in: SQL queries, HTTP queries, log messages, Propolis hypervisor state, virtual storage system, networking protocol messages, the P4 emulator, and probably more that I'm forgetting about.
  • levon For those unfamiliar with the DTrace tool, or the rationale behind leveraging DTrace over other tracing / debugging tools: https://www.cs.princeton.edu/courses/archive/fall05/cos518/papers/dtrace.pdf
  • bcantrill https://github.com/oxidecomputer/progenitor
  • ahl some notes on rust codegen: https://github.com/ahl/codegen-template
  • arjenroodselaar DDM! Bring us home!
  • a172 it astonishes me how many "cloud" type architectures are built on v4 only or v4 first.
  • a172 IPv6 is older than Wi-Fi
  • a172 It solves real problems. PLEASE use it.
  • nyanotech yessss fina...
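
The line-rate arithmetic from the chat above (a 48-port, 10 Gbps switch counted in both directions) can be sketched in a few lines of Python. This is only an illustration of that back-of-the-envelope calculation; the function name and its parameters are made up for this example and do not come from the show or from any Oxide software.

```python
def aggregate_line_rate_gbps(ports, port_speed_gbps, full_duplex=True):
    """Aggregate capacity if every port runs at line rate at the same time.

    Hypothetical helper for illustration only. With full_duplex=True,
    traffic is counted in both directions (transmit and receive).
    """
    directions = 2 if full_duplex else 1
    return ports * port_speed_gbps * directions


# The example from the chat: 48 ports at 10 Gbps, both directions.
print(aggregate_line_rate_gbps(48, 10))  # prints 960
```

Counting only one direction (`full_duplex=False`) gives 480 Gbps for the same switch, which is why quoted switch capacities depend on whether both directions are included.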


