What's a bug? What's a debugger?
Oxide and Friends, 22 June 2021


Oxide and Friends Twitter Space: June 21, 2021


We’ve been holding a Twitter Space weekly on Mondays at 5 p.m. for about an hour. Even though it’s not (yet?) a feature of Twitter Spaces, we have been recording them all; here is the recording for our Twitter Space for June 21, 2021.

In addition to Bryan Cantrill and Adam Leventhal, speakers on June 21st included Dan Cross, Sean Klein, Aram Hăvărneanu, and the mononymous Nate. (Did we miss your name and/or get it wrong? Drop a PR!)

Some of the topics we hit on, in the order that we hit them:

  • Adam’s toddler (being chased by a rooster)
    > Don’t get me wrong, some of my best friends are three-year-olds.
  • [@3:12](https://youtu.be/UOucW3F7nCg?t=192) Sy Brand’s tutorial Writing a Debugger
  • [@4:34](https://youtu.be/UOucW3F7nCg?t=274) Bryan’s debuggers
    • MDB Modular Debugger
      > Adam: I think people are using cargo-cult debugging, rather than getting to the root cause of these things, or being satisfied until they get to the root cause.
      > Bryan: I think with software systems, it’s really hard to know what they’re actually doing.
    • Procedure Linkage Table aka “the plits”
    • “Runtime Performance Analysis of the M-to-N Scheduling Model” (pdf) 1996 undergrad thesis (Brown CS dept website)
    • [@6:29](https://youtu.be/UOucW3F7nCg?t=389) Threadmon website and 1997 paper (a retooling of the ’96 paper)
      > When I built that tooling, it revealed this thing is not doing at all what anyone thought it was doing.
    • TNF Trace Normal Form
      > Part of the problem with debuggers… debuggers are historically written by compiler folks, and not system folks. As a result, debuggers are designed to debug the problem that compiler folks have the most familiarity with, and that’s a compiler.
      > Debuggers are designed for reproducible problems, way too frequently.
  • I view in situ breakpoint debugging as one sliver of debugging that’s useful for one particular and somewhat unusual class of bugs. That’s actually not the kind of debugger I want to use most of the time.
  • Software breakpoints
  • [@11:59](https://youtu.be/UOucW3F7nCg?t=719)
    > libdis was my intern project in 2000. The idea was to take the program text, and interpret it in some structural form, and try to infer different things about the program.
  • [@14:59](https://youtu.be/UOucW3F7nCg?t=899) I meant this question earnestly, what is a debugger?
    • The first bug
      > The term is somewhat regrettable… It implies a problem, when there may not be a problem. It may just be I want to understand how the system is operating, independent of whether it’s doing it badly.
  • Wikipedia on Observability (control theory)
  • Oxide’s embedded OS and companion debugger: Hubris and Humility
  • [@19:01](https://youtu.be/UOucW3F7nCg?t=1141) Using DTrace to help customers understand their systems.
    > If you strings the DTrace binary, you’re not gonna find any mention of raincoats.
  • [@22:13](https://youtu.be/UOucW3F7nCg?t=1333) Cardinal rule of debuggers: Don’t kill the patient! (see also: Do No Harm)
    > Not killing the patient is really important, this was always an Ur principle for us.
  • The notion that the debugger has now become load bearing in the execution of the program, is a pretty grave responsibility.
  • [@26:54](https://youtu.be/UOucW3F7nCg?t=1614) Post-mortem debugging
    > It is a tragedy of our domain that we do not debug post-mortem, routinely.
  • Heisenbug (when the act of observing the problem, hides the problem)
  • [@31:11](https://youtu.be/UOucW3F7nCg?t=1871)
    > What’s going on in the system? It’s not crashing, there’s no core dump. But the system is behaving in a way I didn’t expect it to, and I want to know why.
  • [@33:51](https://youtu.be/UOucW3F7nCg?t=2031) Pre-production reliability techniques
    > All of our pre-production work has gotten way better than it was, and I think that’s compensation for the fact we can’t understand these systems when we deploy them.
  • [@37:58](https://youtu.be/UOucW3F7nCg?t=2278)
    > The move to testing has in fact obviated some of the need for what we consider traditional debuggers.
    > (Bryan audibly cringes)
  • [@39:08](https://youtu.be/UOucW3F7nCg?t=2348) Automated and Algorithmic Debugging conference AADEBUG 2003
    • HOPL History of Programming Languages
      > There was a test suite of excellence when it comes to automated program debugging. And it was some pile of C programs with known bugs, and you would throw your new paper at it, and it would find 84% of the bugs, and there would be a lot of slapping each other on the back on that. Really focused on the simplest of simple bugs.
    • [@43:15](https://youtu.be/UOucW3F7nCg?t=2595) Bryan’s Postmortem Object Type Identification paper
      > Who is my neighbor in memory? Because my neighbor just burned down my house basically.
    • mdb’s ::kgrep
      > I need to pause you there because it’s so crazy, and I want to emphasize that he means what he’s saying. We look for the 64 bit value, and see where we find it. This is a game of bingo across the entire address space.
  • We can follow the pointers and propagate types.
  • [@48:49](https://youtu.be/UOucW3F7nCg?t=2929) printf/println debugging – everyone’s doing it
    > I think it’s a mistake for people to denigrate printf debugging. If you’ve got a situation that you ca...

