Handling multiple breakpoints in Trap Redux

In a previous post I worked on getting Trap building on a new system and found that the unit tests were not passing. It looked like I had found a real bug:

I found:

The first call to breakpoint_resolve must be when the breakpoint in main is reached. And the second call is 3 bytes later. I think this is trying to resolve the stop in execution that happens when we are in state INFERIOR_SINGLE_STEPPING to step over the breakpoint. And in this case the instruction pointer will have moved past the byte where we inserted the breakpoint – past the entire original instruction. So breakpoint_resolve should fail as this address isn’t one of our breakpoints. That is, breakpoint_resolve assumes we executed a one byte instruction and subtracts 1 from the instruction pointer to find the target_address.

This is not correct and wasn’t correct when I originally wrote it. I don’t know how this worked before.

So, to make progress I’m going to go back to the commit I started with at the beginning of Handling multiple breakpoints in Trap to retrace my steps without introducing this bug. So I’m starting at commit fb7b3c6 . First, I need to cherry-pick my changes to fix the build and disable address space randomization.

modified   src/inferior_load.c
@@ -5,6 +5,8 @@
 #include <sys/wait.h>
 #include <assert.h>
 #include <errno.h>
+#include <sys/personality.h>
+#include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
 
@@ -12,6 +14,10 @@ static inferior_t g_inferior;
 
 static void setup_inferior(const char *path, char *const argv[])
 {
+  unsigned long old = personality(0xFFFFFFFF);
+  if (personality(old | ADDR_NO_RANDOMIZE) < 0) {
+    perror("Failed to set personality:");
+  }
   ptrace_util_traceme();
   execv(path, argv);
 }

Alright, it builds and the tests fail because I need to update the breakpoint addresses again. I’ll follow the style of the nice clean up that I used in rusty_trap that sets some constants for the two addresses I use (main and foo)

modified   test/test_breakpoint.c
@@ -2,6 +2,8 @@
 #include <trap.h>
 #include <unistd.h>
 
+const char *ADDRESS_OF_MAIN = (char*)0x0000555555555151;
+
 int g_breakpoint_count = 0;
 trap_inferior_t g_inferior = 0;
 trap_breakpoint_t g_bp = 0;
@@ -20,7 +22,7 @@ int main()
 
   trap_breakpoint_set_callback(breakpoint_callback);
   g_inferior = trap_inferior_exec("./inferiors/hello", argv);
-  g_bp = trap_inferior_set_breakpoint(g_inferior, (char *)0x000000000040079d);
+  g_bp = trap_inferior_set_breakpoint(g_inferior, ADDRESS_OF_MAIN);
   trap_inferior_continue(g_inferior);
 
   assert(g_breakpoint_count == 1);
modified   test/test_multi_breakpoint.c
@@ -2,6 +2,8 @@
 #include <trap.h>
 #include <unistd.h>
 
+const char *ADDRESS_OF_FOO = (char*)0x0000555555555131;
+
 int g_breakpoint_count = 0;
 trap_inferior_t g_inferior = 0;
 trap_breakpoint_t g_bp = 0;
@@ -20,7 +22,7 @@ int main()
 
   trap_breakpoint_set_callback(breakpoint_callback);
   g_inferior = trap_inferior_exec("./inferiors/loop", argv);
-  g_bp = trap_inferior_set_breakpoint(g_inferior, (char *)0x000000000040076d);
+  g_bp = trap_inferior_set_breakpoint(g_inferior, ADDRESS_OF_FOO);
   trap_inferior_continue(g_inferior);
 
   assert(g_breakpoint_count == 5);

There are two interesting things to note here.

The first is minor: I created const globals to store the address out of habit and now there is a warming that passing the constant to trap_inferior_set_breakpoint discards const. I should fix the signature. This can be const.

As for the second, why do I already have a test_multi_breakpoint?!?! I thought the point of Handling multiple breakpoints in Trap was to build that functionality? And strangely the test passes! Ok, let me review Handling multiple breakpoints in Trap again - the tagline does say I also worked on handling more than one breakpoint not just breakpoints triggered multiple times.

I asked Claude 3.7 Sonnet Thought to summarize the post for me. I like doing this because it can make sure I don’t miss anything. The one complaint I have is that no matter how much brevity I ask for it always gives me too much. Anyway, here’s the summary of the main steps:

AI Generated Summary

Initial Goals

Address warnings from previous state machine implementation
Write a test for multiple breakpoints
Make the test pass

Implementing Multiple Breakpoints Support

Created a new test (test_multiple_breakpoints.c) that sets two breakpoints
Identified the core issue: reliance on a single global g_breakpoint variable
Replaced global variable with an array of breakpoints (g_breakpoints[MAX_BREAKPOINTS])
Implemented breakpoint_allocate() to get breakpoints from the array

Resolving the Right Breakpoint

Implemented ptrace_util_get_instruction_pointer() to determine current position
Created breakpoint_resolve() to find which breakpoint was hit
Implemented find_breakpoint_with_target_address() for lookup

Handling Duplicate Breakpoints

Created a test that sets two breakpoints at the same address
Considered API design options (abort, return same handle, create distinct handles)
Decided on option to return the same handle for duplicate breakpoints

Removing Breakpoints

Added functionality to remove breakpoints (trap_inferior_remove_breakpoint)
Created tests for removing breakpoints (both before execution and during callbacks)
Discovered issues when removing breakpoints from within callbacks

Key Bug: Duplicate Breakpoint Reading

Identified that when setting a second breakpoint at the same address, the code was reading the modified memory (with INT 3 already inserted) rather than using the original instruction
Fixed by checking if a breakpoint already exists at an address and copying its original_breakpoint_word rather than reading memory

Deferred Removal Implementation

Added state tracking to breakpoints (BREAKPOINT_ACTIVE, BREAKPOINT_DEFERED_REMOVE)
Implemented deferred removal for breakpoints deleted during callbacks
Created do_deferred_removals() to process pending removals

Refactoring and Cleanup

Balanced structure between start_breakpoint() and finish_breakpoint()
Identified duplication in code that iterates over breakpoint collections

Final Issues

Noted need for more robust breakpoint container
Identified remaining warnings indicating incomplete support for multiple inferiors

Enjoying the post? Sign up for the newsletter or support the work on Buy Me a Coffee .

My Summary of the Summary

So Handling multiple breakpoints in Trap was really about setting multiple breakpoints and not at all about handling a single breakpoint multiple times. The latter works now. Handling more than one breakpoint introduced the function breakpoint_resolve which is used when the inferior stops to look up which breakpoint was hit using the instruction pointer as a key. Then we went on to work through the details of handling the breakpoints, fixing bugs, and cleaning up.

Most of what I see in the summary sounds pretty reasonable with the exception of breakpoint_resolve. As we saw before it doesn’t work because each breakpoint trigger causes the inferior to stop twice:

When the breakpoint is executed - breakpoint_resolve works in this case.
When single stepping over the original instruction - breakpoint_resolve does not work in this case.

Retracing Our Steps

Given that most of what was done in Handling multiple breakpoints in Trap seems like the right things to do let’s retrace our steps using the summary above as a map and the old post for more detail. When we get to breakpoint_resolve we can take what we know now and come up with better test coverage and drive a more robust solution.

So, let’s start with

Created a new test (test_multiple_breakpoints.c) that sets two breakpoints

Let’s actually just take the original version of the test. For trap implementation code I’m not sure I want to just take the old versions. But for the tests I think I do. By using the same test we can drive toward the same feature set and API while building a more robust solution. Here’s the test:

#include <assert.h>
#include <trap.h>
#include <unistd.h>

int g_foo_count = 0;
int g_main_count = 0;

trap_inferior_t g_inferior = 0;
trap_breakpoint_t g_bp_main = 0;
trap_breakpoint_t g_bp_foo  = 0;

void breakpoint_callback(trap_inferior_t inferior, trap_breakpoint_t bp)
{
  assert(inferior == g_inferior);

  if (bp == g_bp_main) {
    g_main_count++;
  } else if (bp == g_bp_foo) {
    g_foo_count++;
  }
}

int main()
{
  char *argv[1] = { 0 };

  trap_breakpoint_set_callback(breakpoint_callback);
  g_inferior = trap_inferior_exec("./inferiors/loop", argv);
  g_bp_main = trap_inferior_set_breakpoint(g_inferior, (char *)0x0000000000400778);
  g_bp_foo  = trap_inferior_set_breakpoint(g_inferior, (char *)0x000000000040076d);
  trap_inferior_continue(g_inferior);

  assert(g_main_count == 1);
  assert(g_foo_count == 5);

  return 0;
}

I will update the addresses though and extract them into constants.

modified   test/test_multiple_breakpoints.c
@@ -2,6 +2,9 @@
 #include <trap.h>
 #include <unistd.h>
 
+const char *ADDRESS_OF_MAIN = (char*)0x0000555555555140;
+const char *ADDRESS_OF_FOO  = (char*)0x0000555555555131;
+
 int g_foo_count = 0;
 int g_main_count = 0;
 
@@ -26,8 +29,8 @@ int main()
 
   trap_breakpoint_set_callback(breakpoint_callback);
   g_inferior = trap_inferior_exec("./inferiors/loop", argv);
-  g_bp_main = trap_inferior_set_breakpoint(g_inferior, (char *)0x0000000000400778);
-  g_bp_foo  = trap_inferior_set_breakpoint(g_inferior, (char *)0x000000000040076d);
+  g_bp_main = trap_inferior_set_breakpoint(g_inferior, ADDRESS_OF_MAIN);
+  g_bp_foo  = trap_inferior_set_breakpoint(g_inferior, ADDRESS_OF_FOO);
   trap_inferior_continue(g_inferior);
 
   assert(g_main_count == 1);

And as expected this test fails because we use single global breakpoint and can’t track both the breakpoints the test uses. Let’s work on fixing that. Last time we:

Replaced global variable with an array of breakpoints (g_breakpoints[MAX_BREAKPOINTS])

Implemented breakpoint_allocate() to get breakpoints from the array

While I don’t think a global array of breakpoints is the ideal form let’s go in this direction for now. But, eventually I think we’ll need to dynamically allocate breakpoints.

Implementing Multiple Breakpoints with Working Resolution

Now we can start working on a proper implementation by writing breakpoint_allocate. Oh, now that I open up breakpoint.c and start looking for a place to add breakpoint_allocate I find:

static trap_breakpoint_t breakpoint_resolve(trap_inferior_t inferior)
{
  return &g_breakpoint;
}

So breakpoint_resolve has been here for a while. This old version ‘works’ because there is only one breakpoint so we can simple resolve to that single breakpoint. It doesn’t matter what state the inferior is in as there is only one breakpoint to consider. This function will need to do more to support multiple breakpoints.

Ok, at least adding an array of breakpoints and a simple allocator is straightforward:

modified   src/breakpoint.c
@@ -12,7 +12,8 @@ struct breakpoint_t {
 };
 typedef struct breakpoint_t breakpoint_t;
 
-breakpoint_t g_breakpoint;
+#define MAX_BREAKPOINTS 100
+breakpoint_t g_breakpoints[MAX_BREAKPOINTS];
 
 static trap_breakpoint_callback_t g_callback;
 
@@ -34,18 +35,28 @@ static void breakpoint_set(trap_inferior_t inferior, breakpoint_t *bp)
   ptrace_util_poke_text(pid, bp->aligned_address, modified_word);
 }
 
+breakpoint_t *breakpoint_alloc() {
+  // linear search for unused breakpoint
+  for (breakpoint_t *bp = g_breakpoints; bp < g_breakpoints[MAX_BREAKPOINTS]; bp++) {
+    if (bp->target_address == 0) {
+      // This is free, take it.
+      return bp;
+    }
+  }
+  assert(!"Failed to find a free breakpoint!");
+}
+
 trap_breakpoint_t trap_inferior_set_breakpoint(trap_inferior_t inferior,
                                                char *location)
 {
   const uintptr_t target_address = (uintptr_t)location;
   uintptr_t aligned_address = target_address & ~(0x7UL);
-  breakpoint_t *bp = &g_breakpoint;
+  breakpoint_t *bp = breakpoint_alloc();
 
   bp->original_breakpoint_word = ptrace_util_peek_text(inferior,
 						       aligned_address);
   bp->target_address = target_address;
   bp->aligned_address = aligned_address;

   breakpoint_set(inferior, bp);
 
   return bp;

Next, we need to rethink breakpoint_resolve, it’s used like this:

static trap_breakpoint_t breakpoint_resolve(trap_inferior_t inferior)
{
  return &g_breakpoint;
}

// ...

static void step_over_breakpoint(trap_inferior_t inferior)
{
  struct user_regs_struct regs;
  pid_t pid = inferior;

  trap_breakpoint_t bp = breakpoint_resolve(inferior);      // <===== Here ====
  breakpoint_remove(inferior, bp);
  breakpoint_trigger_callback(inferior, bp);

  ptrace_util_get_regs(pid, &regs);
  ptrace_util_set_instruction_pointer(pid, regs.rip - 1);

  ptrace_util_single_step(pid);
}

static void finish_breakpoint(trap_inferior_t inferior)
{
  breakpoint_t *bp = breakpoint_resolve(inferior);          // <====== Here ====
  breakpoint_set(inferior, bp);
  ptrace_util_continue(inferior);
}

enum inferior_state_t breakpoint_handle(trap_inferior_t inferior, enum inferior_state_t state)
{
  switch(state) {
    case INFERIOR_RUNNING:
      step_over_breakpoint(inferior);
      return INFERIOR_SINGLE_STEPPING;

    case INFERIOR_SINGLE_STEPPING:
      finish_breakpoint(inferior);
      return INFERIOR_RUNNING;

    default:
      abort();
  }
}

In step_over_breakpoint breakpoint_resolve is fine. That is, the inferior was running normally and stopped at a normal breakpoint and want to step over it. Inferior’s instruction pointer points to the end of the breakpoint instruction (0xCC) so one byte after the breakpoint target address.

But, in finish_breakpoint we have just finished single stepping and the inferior’s instruction pointer points to the start of the next instruction.

I’m going to remove the breakpoint_resolve call from breakpoint_finish. But, what do I replace it with? You might think that we could subtract something other than 1 from the instruction pointer to find the corresponding target_address. There are two things that make this impossible:

x86 has variable sized instructions. We could figure out the size of the instruction if we were at the start of the instruction by decoding it. But, we are past that instruction so we have no chance of decoding it.
The instruction that was just executed could have been some sort of branching instruction (e.g jump or call) so that by the end the instruction pointer has branched far away from the start of the original instruction.

So, basically the inferior’s current instruction pointer really tells us nothing about the address of the instruction that we just finished stepping over. I think this means that we have to save the breakpoint across the single step. There are two ways we can do this:

Stash it in a global. This is easy but I’m worried that as we develop new features in trap that this will be unsafe. I can’t quite imagine the scenarios that will lead to problems but it doesn’t feel right.
Restructure the breakpoint handling. I wonder if breakpoint_handle shouldn’t return to trap_inferior_continue instead it could call waitpid itself. Then, it can just keep the breakpoint it found initially.

This isn't a matter of making breakpoint_resolve smarter; it's a structural problem requiring us to maintain state across operations.

So I’m proposing a structural change that would look something like this:

enum inferior_state_t breakpoint_handle(trap_inferior_t inferior, enum inferior_state_t state)
{
  int status;

  // This is the critical fix - maintaining breakpoint state
  // across the single-step operation
  trap_breakpoint_t bp = breakpoint_resolve(inferior);

  assert(state == INFERIOR_RUNNING);
  step_over_breakpoint(inferior, bp);

  waitpid(inf->pid, &status, 0);
  if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
    finish_breakpoint(inferior, bp);
  } else if (WIFEXITED(status)) {
    return;
  } else {
    fprintf(stderr, "Unexpected stop in trap_inferior_continue: 0x%x\n", status);
    abort();
  }
}

Ok, so to explain again: trap_inferior_continue will wake up when the breakpoint is triggered and call this new breakpoint_handle which will:

Look up (or resolve) the breakpoint.
Call step_over_breakpoint to remove the breakpoint and call PTRACE_SINGLESTEP.
Call waitpid to wait for the inferior to finish single stepping.
Call finish_breakpoint to restore the original breakpoint.

So this way breakpoint_handle doesn’t return back to trap_inferor_continue between single stepping and finishing. This means it can maintain state across the single step by simply keeping it on the stack. I think this is our solution.

Now we just have to fix breakpoint_finish to take the breakpoint, bp, as an argument (I’ll leave that as an exercise for the reader) and implement a new breakpoint_resolve to look the address up out of the g_breakpoints array.

static trap_breakpoint_t breakpoint_resolve(trap_inferior_t inferior)
{
  pid_t pid = inferior;
  struct user_regs_struct regs;
  uintptr_t target_address;

  ptrace_util_get_regs(pid, &regs);
  target_address = regs.rip - 1;  // Back up to the start of the breakpoint

  // linear search for target_address
  for (breakpoint_t *bp = g_breakpoints; bp < &g_breakpoints[MAX_BREAKPOINTS]; bp++) {
    if (bp->target_address == target_address) {
      return bp;
    }
  }
  assert(!"Failed to find existing breakpoint at address");
}

With a few more minor changes the tests pass!

The changes are to breakpoint_handle to:

Take over the responsibility of calling resolve_breakpoint. Pass bp to both step_over_breakpoint and finish_breakpoint.
Clean up some types
Make sure to return a value reflecting the current state in all successful branches.

enum inferior_state_t breakpoint_handle(trap_inferior_t inferior, enum inferior_state_t state)
{
  pid_t pid = inferior;

  // This is the critical fix - maintaining breakpoint state
  // across the single-step operation
  trap_breakpoint_t bp = breakpoint_resolve(inferior);
  int status;

  assert(state == INFERIOR_RUNNING);
  step_over_breakpoint(inferior, bp);
  waitpid(pid, &status, 0);
  if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
    finish_breakpoint(inferior, bp);
    return INFERIOR_RUNNING;
  } else if (WIFEXITED(status)) {
    // This is a lie, but it should cause trap_inferior_continue to wait again.
    return INFERIOR_RUNNING;
  } else {
    fprintf(stderr, "Unexpected stop in trap_inferior_continue: 0x%x\n", status);
    abort();
  }
}

Next Steps

We have fixed our bug by making a structural change to breakpoint_handle that merges what was two calls into a single call that can maintain state about the breakpoint throughout the entire handling process. The old test that we originally wrote now passes.

Enjoying the post? Sign up for the newsletter or support the work on Buy Me a Coffee .

While I have put together a PR for this change we really should consider two options now. We can continue retracing our steps through Handling multiple breakpoints in Trap or we can take what we learned and fix the bug directly in the master branch.

Assuming the rest of the changes in the original Handling multiple breakpoints in Trap were good then I think porting the fix forward to the master branch makes the most sense. That’s not to say that the idea to retrace our steps was a bad one. There a number of advantages in resetting back to an old commit to debug. For me, I was able to look at the change fresh to better see what was going wrong. And from a test first perspective, resetting insures that we start from a passing test and can add one failing test at a time.

And while resetting has these advantages we should also keep in mind that once we leverage these advantages to solve the problem we shouldn’t let ourselves become overly attached to the new branch. If taking our new understanding and porting our fix forward yields quicker results then we should definitely do that. Though we must insure that we validate the fix properly in the master branch.

And I think that’s the case here because there are a lot of steps left in the roadmap we have based on the summary of the old post. We have done things like:

Removing Breakpoints
Deferred Removal
Dealing with setting a duplicate breakpoint.
Clean up

By porting forward we can keep all of these features and their tests. I’ll take care of porting the fix forward behind the scenes and end this post. This should leave trap in working state on my new development system. And as this post ends we should take away the following lessons:

Some bugs require a structural change to resolve especially when managing state throughout a longer process.
The technique of resetting and retracing our steps is a powerful one that can help us better focus on the problem in front on us. And once that problem is solved we should reevaluate the need to continue retracing vs porting the fix forward.

This post is part of my Writing a Debugger series.
View all posts and upcoming topics →