Debugger state machine

Two posts ago I wrote about INT 3 and breakpoints. In this post we will continue on with the development of the Trap debugger by taking care of some of the TODO items we collected during the last phase of development. In doing so we will work toward formalizing the state machine that Trap follows when managing the inferior:

Recall our TODO list at the end of the last post:

Refactor
- trap_inferior_continue
- Other functions or duplication?
Pay attention to what the warnings are telling us
Write a test that sets a breakpoint in an inferior other than “hello”
Write a test that sets a breakpoint in a function other than “main”
Write a test that sets more than one breakpoint.
Write a test that hits the same breakpoint more than once.
Fix error checking
- Write tests that expect errors?

Refactor `trap_inferior_continue()`

Here’s the code to trap_inferior_countinue()

void trap_inferior_continue(trap_inferior_t inferior)
{
  pid_t pid = inferior;

  ptrace(PTRACE_CONT, pid, ignored_ptr, no_continue_signal);
  while(1) {
    int status;
    waitpid(pid, &status, 0);

    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
      struct user_regs_struct regs;

      trap_breakpoint_t bp = breakpoint_resolve(inferior);
      breakpoint_remove(inferior, bp);
      breakpoint_trigger_callback(inferior, bp);

      ptrace(PTRACE_GETREGS, pid, ignored_ptr, &regs);
      regs.rip -= 1;
      ptrace(PTRACE_SETREGS, pid, ignored_ptr, &regs);

      ptrace(PTRACE_CONT, pid, ignored_ptr, no_continue_signal);
    } else if (WIFEXITED(status)) {
      return;
    } else {
      fprintf(stderr, "Unexpected stop in trap_inferior_continue: 0x%x\n", status);
      abort();
    }
  }
}

In the last post, I mentioned that the function was getting too long and the longest part is in handling SIGTRAP. So let’s break this out into a separate function like this:

static void some_function(trap_inferior_t inferior)
{
  struct user_regs_struct regs;
  pid_t pid = inferior;

  trap_breakpoint_t bp = breakpoint_resolve(inferior);
  breakpoint_remove(inferior, bp);
  breakpoint_trigger_callback(inferior, bp);

  ptrace(PTRACE_GETREGS, pid, ignored_ptr, &regs);
  regs.rip -= 1;
  ptrace(PTRACE_SETREGS, pid, ignored_ptr, &regs);

  ptrace(PTRACE_CONT, pid, ignored_ptr, no_continue_signal);
}

I’ve named the function some_function() because I haven’t thought about what it does yet. Now that I can see it altogether I can think of some names. Based on the usage I would call this handle_sigtrap(). Or, based on what it does a good name would be handle_breakpoint().

Naming should be based on the behavior the function advertises rather than on usage or how the function does something.

So I would pick handle_breakpoint() over handle_sigtrap(). But, given the name handle_breakpoint() the function really belongs in the breakpoint module – within the breakpoint module the name would be breakpoint_handle(). So we have, now in breakpoint.c:

#include <sys/user.h>
static const void *ignored_ptr;
static const void *no_continue_signal = 0;

void breakpoint_handle(trap_inferior_t inferior)
{
  struct user_regs_struct regs;
  pid_t pid = inferior;

  trap_breakpoint_t bp = breakpoint_resolve(inferior);
  breakpoint_remove(inferior, bp);
  breakpoint_trigger_callback(inferior, bp);

  ptrace(PTRACE_GETREGS, pid, ignored_ptr, &regs);
  regs.rip -= 1;
  ptrace(PTRACE_SETREGS, pid, ignored_ptr, &regs);

  ptrace(PTRACE_CONT, pid, ignored_ptr, no_continue_signal);
}

And in inferior_load.c:

void trap_inferior_continue(trap_inferior_t inferior)
{
  pid_t pid = inferior;

  ptrace(PTRACE_CONT, pid, ignored_ptr, no_continue_signal);
  while(1) {
    int status;
    waitpid(pid, &status, 0);

    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
      breakpoint_handle(inferior);
    } else if (WIFEXITED(status)) {
      return;
    } else {
      fprintf(stderr, "Unexpected stop in trap_inferior_continue: 0x%x\n", stat\
us);
      abort();
    }
  }
}

I like this much better. We’ve improved the modularity by having the breakpoint module handle breakpoints. But there are a couple of things that can still be improved:

Duplication of ignored_ptr and no_continue_signal
I need a header file in which to declare breakpoint_handle. The current situation is now actually better than it was in the last post. I skipped over the fact that I had warnings about missing declarations for breakpoint_remove, breakpoint_resolve, and breakpoint_trigger_callback. These are no longer a problem because they are used internally in the breakpoint module.
breakpoint_remove, breakpoint_resolve, and breakpoint_trigger_callback can all be made static.

Let’s start with #2 and just add a simple breakpoint.h file

#if !defined(_TRAP_BREAKPOINT_H)
#define _TRAP_BREAKPOINT_H

void breakpoint_handle(trap_inferior_t inferior);

#endif /* _TRAP_BREAKPOINT_H */

For #3 I just added static to the functions.

For #1 I want to do something a little more than what is strictly necessary to remove the duplication of ignored_ptr and no_continue_signal. But what I want to do may help abstract the code a bit. The ptrace API is very flexible with its arguments but I’ve been essentially working around this or at least clarifying it by using these values like ignored_ptr and no_continue_signal. To make things even clearer I think that I should write specific functions for the various ptrace requests I need. This way I can encapsulate all of this documentation. So for example, I want:

void ptrace_continue(pid_t pid)
{
  static const void *ignored_addr = 0;
  static const void *no_continue_signal = 0;

  ptrace(PTRACE_CONT, pid, ignored_addr, no_continue_signal);
}

Then I can call this function in place of ptrace(PTRACE_CONT, ...) wherever I need to. I would imagine having a set of these ptrace utility functions. So I will create ptrace_util.c and ptrace_util.h with these functions and use them throught Trap. Here’s the contents of ptrace_util.h. You can imagine the implementations.

#if !defined(_TRAP_PTRACE_UTIL_H)
#define _TRAP_PTRACE_UTILT_H

#include <sys/types.h>
#include <sys/user.h>

void ptrace_util_traceme();
void ptrace_util_continue(pid_t pid);
long ptrace_util_peek_text(pid_t pid, unsigned long target_address);
void ptrace_util_poke_text(pid_t pid, unsigned long target_address,
                           unsigned long data);
void ptrace_util_get_regs(pid_t pid, struct user_regs_struct *regs);
void ptrace_util_set_regs(pid_t pid, struct user_regs_struct *regs);

#endif /* _TRAP_PTRACE_UTIL_H */

I’ve also updated the core Trap to use thse functions and things are much better. I’ve been wanting to make this clean up for some time.

The ideal form

If we look ahead on the roadmap we can see that these upcoming features

Add support for debugging multiple threads.
Use the debugger hook for dynamic library loading.
Add watch points.

All of these features involve new events (thread creation, thread destruction, library load, watch point hit) that Trap will need to respond to. All of these events, and breakpoints, cause execution of the inferior to stop so that Trap can respond. That means that all of these events will need to be processed by trap_inferior_continue(). This function will, in practice, become the main event loop. The current implementation of trap_inferior_continue() looks like this:

void trap_inferior_continue(trap_inferior_t inferior)
{
  pid_t pid = inferior;

  ptrace_util_continue(pid);
  while(1) {
    int status;
    waitpid(pid, &status, 0);

    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
      breakpoint_handle(inferior);
    } else if (WIFEXITED(status)) {
      return;
    } else {
      fprintf(stderr, "Unexpected stop in trap_inferior_continue: 0x%x\n", stat\
us);
      abort();
    }
  }
}

As we add events we can imagine abstracting the structure here into something like this

void trap_inferior_continue(trap_inferior_t inferior)
{
  pid_t pid = inferior;

  ptrace_util_continue(pid);
  while(1) {
    int status;
    waitpid(pid, &status, 0);

    if (is_breakpoint()) {
      breakpoint_handle(inferior);
    } else if (is_watchpoint()) {
      watchpoint_handle(inferior)
    } else if (is_thread_event()) {
      thread_handle(inferior);
    } else if (is_library_load()) {
      symbol_manager_handle(inferior);
    } else if (is_exit()) {
      return;
    } else {
      error_handle(inferior);
    }
  }
}

This structure is clearer, easier to follow and easier to extend. Responsibility is delegated to different modules through checkers (the is_whatever functions) and handlers. I find this structure ideal in that I think it is the right structure to aim for but that I also think it may be unattainable as we will find as we continue adding features.

We need to write some more tests to help us determine what real world structure will both meets our needs and resembles our ideal. To drive this structure I want to work through the following proposed test

Write a test that hits the same breakpoint more than once.

To do this we are going to have to hack a bit on the way specify locations of breakpoints.

Breakpoint locations

Right now we have

trap_breakpoint_t trap_inferior_set_breakpoint(trap_inferior_t inferior,
                                               char *location)
{
  const uintptr_t target_address = 0x000000000040079d;
  const uintptr_t int3_opcode = 0xCC;
  pid_t inferior_pid = inferior;
  uintptr_t modified_word;

  uintptr_t aligned_address = target_address & ~(0x7UL);
  uintptr_t target_offset = target_address - aligned_address;


  g_original_breakpoint_word = ptrace_util_peek_text(inferior_pid,
                                                     aligned_address);
  modified_word = g_original_breakpoint_word;
  modified_word &= ~(0xFFUL << (target_offset * 8));
  modified_word |= int3_opcode << (target_offset * 8);
  ptrace_util_poke_text(inferior_pid, aligned_address, modified_word);

  return 0;
}

And as I mentioned last time, the argument location is unused. Instead we use a hardcoded address of 0x000000000040079d. A proper solution should use symbols but that is some ways away for us. But, we could at least support different raw addresses. The caller could pass the address in through location.

First, let’s modify test_breakpoint to do this:

--- a/test/test_breakpoint.c
+++ b/test/test_breakpoint.c
@@ -20,7 +20,7 @@ int main()

   trap_breakpoint_set_callback(breakpoint_callback);
   g_inferior = trap_inferior_exec("./inferiors/hello", argv);
-  g_bp = trap_inferior_set_breakpoint(g_inferior, "main");
+  g_bp = trap_inferior_set_breakpoint(g_inferior, (char *)0x000000000040079d);
   trap_inferior_continue(g_inferior);

   assert(g_breakpoint_count == 1);

Then we can update trap_breakpoint_set_callback()

--- a/src/breakpoint.c
+++ b/src/breakpoint.c
@@ -14,7 +14,7 @@ void trap_breakpoint_set_callback(trap_breakpoint_callback_t callback)
 trap_breakpoint_t trap_inferior_set_breakpoint(trap_inferior_t inferior,
                                                char *location)
 {
-  const uintptr_t target_address = 0x000000000040079d;
+  const uintptr_t target_address = (uintptr_t)location;
   const uintptr_t int3_opcode = 0xCC;
   pid_t inferior_pid = inferior;
   uintptr_t modified_word;

And our tests still pass. This will allow us to write a few more intersting tests without having to add support for symbols today.

Write a test that hits the same breakpoint more than once.

To write this test first we will need a new inferior

int foo()
{
  return 5;
}

int main()
{
  for (int i = 0; i < 5; i++) {
    foo();
  }

  return 0;
}

I call this inferior loop, and with this inferior we can set a breakpoint in foo() and expect it to be hit 5 times.

To do this I will need to know the address of foo(). I find this by running

$ objdump -x _out/test/inferiors/loop   | grep foo
000000000040076d g     F .text	000000000000000b              foo

So I will set the breakpoint at 000000000040076d. Here’s a test called test_multi_breakpoint.c to do it:

#include <assert.h>
#include <trap.h>
#include <unistd.h>

int g_breakpoint_count = 0;
trap_inferior_t g_inferior = 0;
trap_breakpoint_t g_bp = 0;

void breakpoint_callback(trap_inferior_t inferior, trap_breakpoint_t bp)
{
  assert(inferior == g_inferior);
  assert(bp == g_bp);

  g_breakpoint_count++;
}

int main()
{
  char *argv[1] = { 0 };

  trap_breakpoint_set_callback(breakpoint_callback);
  g_inferior = trap_inferior_exec("./inferiors/loop", argv);
  g_bp = trap_inferior_set_breakpoint(g_inferior, (char *)0x000000000040076d);
  trap_inferior_continue(g_inferior);

  assert(g_breakpoint_count == 5);

  return 0;
}

The only differences between this test and test_breakpoints are

Run the loop inferior
Use the address of foo
Assert the breakpoint is hit 5 times instead of 1 time.

This test fails with SIGSEGV.

3/3 Test #3: test_multi_breakpoint ............***Exception: Other  0.46 sec
Unexpected stop in trap_inferior_continue: 0xb7f

The 0x7f in the least significant byte of the status means stop due to signal. And the 0xb is signal 11 or SIGSEGV.

This is the same problem we had last time when implementing the breakpoint handling initially. It was solved by restoring the orignal bytes after the breakpoint is encountered. The reason why this is failing for test_multi_breakpoint is that the location is also hardcoded in

static void breakpoint_remove(trap_inferior_t inferior,
                              trap_breakpoint_t handle)
{
  unsigned long target_address = 0x000000000040079d;
  pid_t inferior_pid = inferior;

  ptrace_util_poke_text(inferior_pid, target_address,
			g_original_breakpoint_word);
}

and we are using the address of main in the hello inferior. To fix this problem we have to do a better job of tracking our breakpoints.

Also, while I was looking over the code involved I found another bug in breakpoint_remove()

static void breakpoint_remove(trap_inferior_t inferior,
                              trap_breakpoint_t handle)
{
  unsigned long target_address = 0x000000000040079d;
  pid_t inferior_pid = inferior;

  ptrace_util_poke_text(inferior_pid, target_address,
			g_original_breakpoint_word);
}

We use the target_address directly, but in trap_inferior_set_breakpoint() we align the address to the next lowest multiple of 8. Because our breakpoint tracking will track the target address we will fix this bug as we add the tracking. To store data about a breakpoint I’ve written this struct:

struct breakpoint_t {
  uintptr_t target_address;
  uintptr_t aligned_address;
  uintptr_t original_breakpoint_word;
};

Then I allocate one of these structures and fill it in in trap_inferior_set_breakpoint(). I then return the pointer to the struct as the breakpoint handle instead of returning a hardcoded handle of 0. This means changing trap_breakpoint_t to be a void*. trap_inferior_set_breakpoint() now looks like this:

trap_breakpoint_t trap_inferior_set_breakpoint(trap_inferior_t inferior,
                                               char *location)
{
  const uintptr_t target_address = (uintptr_t)location;
  const uintptr_t int3_opcode = 0xCC;
  pid_t inferior_pid = inferior;
  uintptr_t modified_word;

  uintptr_t aligned_address = target_address & ~(0x7UL);
  uintptr_t target_offset = target_address - aligned_address;

  breakpoint_t *bp = malloc(sizeof(breakpoint_t));

  bp->original_breakpoint_word = ptrace_util_peek_text(inferior_pid,
                                                       aligned_address);
  bp->target_address = target_address;
  bp->aligned_address = aligned_address;

  modified_word = bp->original_breakpoint_word;
  modified_word &= ~(0xFFUL << (target_offset * 8));
  modified_word |= int3_opcode << (target_offset * 8);
  ptrace_util_poke_text(inferior_pid, aligned_address, modified_word);

  return bp;
}

Now the information about the breakpoint is tracked. We just need to use it in breakpoint_remove(). But there is one small issue - breakpoint_handle() uses breakpoint_resolve() to look up a breakpoint.

void breakpoint_handle(trap_inferior_t inferior)
{
  struct user_regs_struct regs;
  pid_t pid = inferior;

  trap_breakpoint_t bp = breakpoint_resolve(inferior);
  breakpoint_remove(inferior, bp);
  // ...
}

The simplest thing to do to pass this test is to have a global breakpoint_t that we use to store all this information. Then breakpoint_resolve() can return it to us.

static trap_breakpoint_t breakpoint_resolve(trap_inferior_t inferior)
{
  return &g_breakpoint;
}

static void breakpoint_remove(trap_inferior_t inferior,
                              trap_breakpoint_t handle)
{
  breakpoint_t *bp = (breakpoint_t *)handle;
  pid_t inferior_pid = inferior;

  ptrace_util_poke_text(inferior_pid, bp->aligned_address,
                        bp->original_breakpoint_word);
}

Also, I’ve removed the malloc() call from trap_inferior_set_breakpoint and replaced it with

breakpoint_t *bp = &g_breakpoint;

We could actually make this a little simpler but this form will be more extensible as we improve it going forward.

After working with this for some time I discovered another bug. The request PTRACE_SETREGS fails. I’m not convinced this call can ever succeed as some of the more exotic registers can’t be set in the kernel. Instead I can write just the instruction pointer like this:

void ptrace_util_set_instruction_pointer(pid_t pid, uintptr_t ip)
{
  uintptr_t offset = offsetof(struct user, regs.rip);
  int result = ptrace(PTRACE_POKEUSER, pid, offset, ip);
  if (result != 0) {
    perror("ptrace_util_set_instruction_pointer: ");
    abort();
  }
}

Also, its time to add error checking to all ptrace functions so I can save time debugging these kinds of problems in the future. The ptrace_util_set_instruction_pointer() function above checks for others and I’ve added this style of error checking to all the other functions in ptrace_util.c. This is a nice payoff that we see from abstracting ptrace calls into ptrace_util.c – we can consolidate all the ptrace() error checking here.

Goinb back to breakpoint handling, we find that test_multi_breakpoint still fails but it fails in the way I expected:

3/3 Test #3: test_multi_breakpoint ............***Exception: Other  0.24 sec
test_multi_breakpoint: /home/joseph/src/c/trap/test/test_multi_breakpoint.c:26: main: Assertion `g_breakpoint_count == 5' failed.

The breakpoint is hit only once rather than 5 times. The reason for this is that we removed the breakpoint after it was hit the first time in order to execute the original instruction. To make the breakpoint work each time we have to add it back when we are done. But, right now we don’t have a way to know when the inferior is done executing that one instruction. Fortunately, ptrace provides a PTRACE_SINGLESTEP request which we can use to step over that single instruction. The man page describes this request as:

PTRACE_SYSCALL, PTRACE_SINGLESTEP
      Restart the stopped tracee as for PTRACE_CONT, but arrange for the tracee to be stopped  at
      the  next  entry to or exit from a system call, or after execution of a single instruction,
      respectively.  (The tracee will also, as usual, be stopped upon receipt of a signal.)  From
      the  tracer's perspective, the tracee will appear to have been stopped by receipt of a SIG‐
      TRAP.  So, for PTRACE_SYSCALL, for example, the idea is to inspect  the  arguments  to  the
      system  call at the first stop, then do another PTRACE_SYSCALL and inspect the return value
      of the system call at the second stop.  The data argument is treated  as  for  PTRACE_CONT.
      (addr is ignored.)

The first step here is easy, we need to write ptrace_util_single_step:

void ptrace_util_single_step(pid_t pid)
{
  static const void *no_step_signal = 0;
  int result = ptrace(PTRACE_SINGLESTEP, pid, ignored_addr, no_step_signal);
  if (result != 0) {
    perror("PTRACE_SINGLESTEP: ");
    abort();
  }    
}

The next step is to use this function. Initially, it might seem like this would be enough to implement the single step:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
--- a/src/breakpoint.c
+++ b/src/breakpoint.c
@@ -70,14 +70,17 @@ static void breakpoint_remove(trap_inferior_t inferior,
 void breakpoint_handle(trap_inferior_t inferior)
 {
   struct user_regs_struct regs;
   pid_t pid = inferior;

   trap_breakpoint_t bp = breakpoint_resolve(inferior);
   breakpoint_remove(inferior, bp);
   breakpoint_trigger_callback(inferior, bp);

   ptrace_util_get_regs(pid, &regs);
   ptrace_util_set_instruction_pointer(pid, regs.rip - 1);

+  ptrace_util_single_step(pid);
+  // Reset the breakpoint
+
   ptrace_util_continue(pid);
 }

But the problem is that ptrace_util_single_step() steps the inferior and then raises a SIGTRAP. When ptrace_util_single_step() returns to breakpoint_handle we don’t know the current state of the inferior. We need to call wait() on the inferior to wait for it to receive SIGTRAP and stop. This is exactly what the trap_inferior_continue() knows how to do:

void trap_inferior_continue(trap_inferior_t inferior)
{
  pid_t pid = inferior;

  ptrace_util_continue(pid);
  while(1) {
    int status;
    waitpid(pid, &status, 0);

    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
      breakpoint_handle(inferior);
    } else if (WIFEXITED(status)) {
      return;
    } else {
      fprintf(stderr, "Unexpected stop in trap_inferior_continue: 0x%x\n", status);
      abort();
    }
  }

The only issue is that trap_inferior_continue() always calls breakpoint_handle() on SIGTRAP we need a way to distinguish an INT 3 from the end of a single step.

A state machine

The solution I choose to employ is a state machine. To manage this I’ll need to start storing more data about the inferior:

I had started putting together inferior.h before I needed it. Now we need it:

#if !defined(_TRAP_INFERIOR_H)
#define _TRAP_INFERIOR_H

#include <sys/types.h>
#include <unistd.h>

enum inferior_state_t {
  INFERIOR_RUNNING,
  INFERIOR_STOPPED,
};

struct Inferior_t {
  pid_t pid;
  enum inferior_state_t state;
};
typedef struct inferior_t inferior_t;

#endif /* _TRAP_INFERIOR_H */

This defines a simple struct to hold an inferior which consists of the ProcessID and the state. The state is a simple enum that can hold any states we define. Currently, we have states INFERIOR_RUNNING meaning that the inferior is running normally and INFERIOR_STOPPED that means the inferior is stopped being processed by Trap.

For the moment, let’s use a single, global inferior_t in inferior_load.c much the same way we use a single, global, breakpoint_t in breakpoint.c:

+static inferior_t g_inferior;
+
+static inferior_t *inferior_resolve(trap_inferior_t inferior)
+{
+  assert(inferior == g_inferior.pid);
+  return &g_inferior;
+}
+
 void trap_inferior_continue(trap_inferior_t inferior)
 {
-  pid_t pid = inferior;
+  inferior_t *inf = inferior_resolve(inferior);

-  ptrace_util_continue(pid);
+  ptrace_util_continue(inf->pid);
   while(1) {
     int status;
-    waitpid(pid, &status, 0);
+    waitpid(inf->pid, &status, 0);

     if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
       breakpoint_handle(inferior);

Next we manage the states. The simplest form of this looks like this:

 void trap_inferior_continue(trap_inferior_t inferior)
 {
   inferior_t *inf = inferior_resolve(inferior);

+  inf->state = INFERIOR_RUNNING;
   ptrace_util_continue(inf->pid);
   while(1) {
     int status;
     waitpid(inf->pid, &status, 0);

     if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
+      inf->state = INFERIOR_STOPPED;
       breakpoint_handle(inferior);
     } else if (WIFEXITED(status)) {
       return;
     } else {
       fprintf(stderr, "Unexpected stop in trap_inferior_continue: 0x%x\n", status);
       abort();
     }
   }
 }

This reflects what we were already able to manage without states. But we need to have a more precise definition of the state in order to manage the breakpoints. So first, I’ll propose this change:

     waitpid(inf->pid, &status, 0);

     if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
-      inf->state = INFERIOR_STOPPED;
-      breakpoint_handle(inferior);
+      inf->state = breakpoint_handle(inferior);
     } else if (WIFEXITED(status)) {
       return;
     } else {

That is, we will let the breakpoint module determine the states when handling breakpoints. Then we update the breakpoint code. First, I’ll add a new state to inferior_state_t called INFERIOR_SINGLE_STEPPING. The I’ll split breakpoint_handle into two cases. The first handles the initial breakpoint and the second handles single stepping. So we have:

void breakpoint_handle(trap_inferior_t inferior, inferior_state_t state)
{
  switch(state) {
    case INFERIOR_RUNNING:
      step_over_breakpoint(inferior);
      return INFERIOR_SINGLE_STEPPING;
    
    case INFERIOR_SINGLE_STEPPING:
      finish_breakpoint(inferior);
      return INFERIOR_RUNNING;
  }
}

With ths change, breakpoint_handle() becomes the point of state transition for breakpoints. To formalize this a bit, we can look at Trap as a state machine with the following state transition graph:

The nodes in this graph represent possible states, the edges represent possible transitions, and the labels on the edges describe the events that trigger those transitions. This state machine describes what we want (rather than what we currently have), in that we can handle the breakpoint (or eventually breakpoints) multiple times going back-and-forth between the RUNNING and SINGLE_STEPPING states.

Now that we have this nice abstraction, we have to write the two functions used by breakpoint_handle(). The function step_over_breakpoint() is effectively the old version of breakpoint_handle:

static void step_over_breakpoint(trap_inferior_t inferior)
{
  struct user_regs_struct regs;
  pid_t pid = inferior;

  trap_breakpoint_t bp = breakpoint_resolve(inferior);
  breakpoint_remove(inferior, bp);
  breakpoint_trigger_callback(inferior, bp);

  ptrace_util_get_regs(pid, &regs);
  ptrace_util_set_instruction_pointer(pid, regs.rip - 1);

  ptrace_util_single_step(pid);
}

And the new function finish_breakpoint() is:

static void finish_breakpoint(trap_inferior_t inferior)
{
  const uintptr_t int3_opcode = 0xCC;
  pid_t pid = inferior;

  breakpoint_t *bp = breakpoint_resolve(inferior);
  uintptr_t target_offset = bp->target_address - bp->aligned_address;
  
  uintptr_t modified_word;
  modified_word = bp->original_breakpoint_word;
  modified_word &= ~(0xFFUL << (target_offset * 8));
  modified_word |= int3_opcode << (target_offset * 8);
  ptrace_util_poke_text(pid, bp->aligned_address, modified_word);
  
  ptrace_util_continue(pid);
}

This is of course a bunch of cut-and-paste from trap_inferior_set_breakpoint() but the test passes!

    Start 3: test_multi_breakpoint
3/3 Test #3: test_multi_breakpoint ............   Passed    0.01 sec

Now that we have a working test, we can refactor to clean up our cut-and-paster. We can extract the code to set the INT 3 instruction and share it between the two functions.

I’ve written:

static void breakpoint_set(trap_inferior_t inferior, breakpoint_t *bp)
{
  const uintptr_t int3_opcode = 0xCC;
  pid_t pid = inferior;
  uintptr_t target_offset = bp->target_address - bp->aligned_address;

  uintptr_t modified_word;
  modified_word = bp->original_breakpoint_word;
  modified_word &= ~(0xFFUL << (target_offset * 8));
  modified_word |= int3_opcode << (target_offset * 8);
  ptrace_util_poke_text(pid, bp->aligned_address, modified_word);
}

I choose the interface to take a breakpoint_t * rather than a location. This works well for finish_breakpoint() because it has looked up a breakpoint_t * using breakpoint_resolve(). It will also work well for trap_inferior_set_breakpoint() because it needs to build a breakpoint_t * to reutrn as the breakpoint handle. It can build up the breakpoint_t * and then call breakpoint_set() like this:

trap_breakpoint_t trap_inferior_set_breakpoint(trap_inferior_t inferior,
                                               char *location)
{
  const uintptr_t target_address = (uintptr_t)location;
  uintptr_t aligned_address = target_address & ~(0x7UL);
  breakpoint_t *bp = &g_breakpoint;

  bp->original_breakpoint_word = ptrace_util_peek_text(inferior,
						       aligned_address);
  bp->target_address = target_address;
  bp->aligned_address = aligned_address;
  
  breakpoint_set(inferior, bp);

  return bp;
}

Wrapping Up

In this post we added a new feature to the Trap debugger so that it can process a breakpoint multiple times. And along the way we fixed some bugs. Our test coverage is giving a us a bit more confidence in our tool though we still have a long way to go.

We also discussed the ideal form for trap_inferior_continue() which I proposed should look something like this:

void trap_inferior_continue(trap_inferior_t inferior)
{
  pid_t pid = inferior;

  ptrace_util_continue(pid);
  while(1) {
    int status;
    waitpid(pid, &status, 0);

    if (is_breakpoint()) {
      breakpoint_handle(inferior);
    } else if (is_watchpoint()) {
      watchpoint_handle(inferior)
    } else if (is_thread_event()) {
      thread_handle(inferior);
    } else if (is_library_load()) {
      symbol_manager_handle(inferior);
    } else if (is_exit()) {
      return;
    } else {
      error_handle(inferior);
    }
  }
}

In implementing support for multiple hits of a breakpoint we added states to this ideal so it now follows a form more like:

void trap_inferior_continue(trap_inferior_t inferior)
{
  inferior_t *inf = inferior_resolve(inferior);

  inf->state = INFERIOR_RUNNING;  
  ptrace_util_continue(inf->pid);
  while(1) {
    int status;
    waitpid(pid, &status, 0);

    if (is_breakpoint()) {
      inf->state = breakpoint_handle(inferior, inf->state);
    } else if (is_watchpoint()) {
      inf->state = watchpoint_handle(inferior, inf->state)
    } else if (is_thread_event()) {
      inf->state = thread_handle(inferior, inf->state);
    } else if (is_library_load()) {
      inf->state = symbol_manager_handle(inferior, inf->state);
    } else if (is_exit()) {
      return;
    } else {
      error_handle(inferior);
    }
  }
}

That is, all of our handlers will be state transitions. By managing the state in the hadlers we can keep the logic localized to the individual modules and we can keep the complexity in trap_inferior_continue() to a minimum.

We started this post with a TODO list. We’ve managed to finish a few of the items but these still remain on the list

Pay attention to what the warnings are telling us
Write a test that sets more than one breakpoint.

In the next post we will handle more than one breakpoint. This should also fix some of the warnings we’ve left in the code.

Refactor trap_inferior_continue()