Bringing up Cosmo

Oxide is bringing up its next-generation server. To discuss the (amazingly smooth) bringup process, Bryan and Adam were joined by members of the Oxide team. Tales of adversity, re-work, un-re-work, and triumph!

In addition to Bryan Cantrill and Adam Leventhal, we were joined by Oxide colleagues Nathanael Huffman, Ian Sobering, Matt Keeter, and Aaron Hartwig.

We mentioned quite a few terms! Here's a helpful guide:

  • Cosmo - Oxide’s next-generation sled (currently in development) with an AMD Turin CPU
  • Gimlet - Oxide’s current-generation sled with an AMD Milan CPU
  • Turin - AMD Epyc 9005 Series
  • Milan - AMD Epyc 7003 Series
  • Genoa - AMD Epyc 9004 Series (Oxide chose to skip this generation)
  • Sequencing - the precise control of when power rails are energized throughout a PCB
  • Sled - One of the (max 32) computers in an Oxide rack; a custom form-factor optimized for power and cooling efficiency
  • IBC - Intermediate Bus Converter (Our 54VDC -> 12VDC converter)
  • RoT - Root of Trust
  • SP - Service Processor, the small computer (running Hubris) that allows for low-level control
  • Ignition - An even lower-level control network for power management (including power of the SP)
  • Ruby - The AMD reference platform (Oxide has used this to prepare Cosmo software in advance of bringup)
  • DC-SCM - An OpenCompute standard form factor (https://www.opencompute.org/documents/ocp-dc-scm-spec-rev-1-0-pdf)
  • Grapefruit - OCP DC-SCM form-factor board with our SP, RoT, and FPGA on it, used to replace the OCP DC-SCM baseboard management controller in the Ruby platform.
  • Cadence - Software Oxide previously used for PCB design
  • Altium - Software Oxide now uses for PCB design
  • Hubris - Oxide’s embedded operating system, run on the SP and RoT
  • Humility - The Hubris debugger
  • PLM - Product Lifecycle Management – a class of software used for managing hardware BOMs
  • BOM - Bill of Materials – the components required to build a hardware product
  • RFK - Our colleague, Robert Keith (to distinguish him from our other colleague, Robert, and our former colleague, Keith)
  • FPGA - Field Programmable Gate Array – Also referred to as “soft logic” – effectively programmable hardware
  • ILA - Integrated Logic Analyzer
  • JTAG - A debugging interface for various processors
  • UART - A serial port or connection

For previous tales from the bringup lab:

If we got something wrong or missed something, please file a PR! Our next show will likely be on Monday at 5p Pacific Time on our Discord server; stay tuned to our Mastodon feeds for details, or subscribe to this calendar. We'd love to have you join us, as we always love to hear from new speakers!

