How I read new code

Ring Theory

When I was in college I studied computer science, but I also took a lot of math classes, including abstract math. My intro to abstract algebra went well and I enjoyed the class, so I decided to take the next course, which was on ring theory, if I remember correctly. The beginning of this class came as something of a shock. Yes, it got more abstract and harder to understand, but our professor also had an interesting approach. He assigned reading: we were to go read the textbook and then do some exercises. He didn’t really want to teach us the material via lecture; he wanted to have a discussion about the material in class. OK, this seemed more like what I would expect in a philosophy or literature class, but in math? I was really conditioned to learning math from lectures.

My friend Rob was in the class as well and we would study together. He and I tried to read the text. Our professor had told us to read until the point where we lost understanding and then come to office hours and ask him for help. The problem was that we didn’t really have a single point where we lost understanding. It was more a gradual decay of understanding over time, until at some point we realized we hadn’t understood a thing for several paragraphs.

As our frustration with our lack of progress grew, Rob and I decided to try something different. We flipped to the end of the chapter and started reading the exercises we had been assigned. We thought we should try to get through them any way we could. This ended up working extremely well. I can’t remember any of the exercises we worked on that day, but here’s a ring theory exercise I found online that can serve as an example:

6.1
1. Suppose that R satisfies all ring axioms for a ring with identity, except that for all
 x, y in R, x + y = y + x. Show that this axiom is implied by the rest.

So we read the question and then asked ourselves, “what does this even mean?”

And we went back through the text to read and to find answers to the questions:

What is a ring?
What are the axioms for a ring with identity?

Then we at least understood the question. From there we could work with the other axioms to try to prove that for all x, y in R, x + y = y + x.
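For the curious, here is one standard way the argument can go. It is a sketch, not necessarily the solution the textbook had in mind. It assumes the remaining axioms give us both distributive laws, a multiplicative identity 1, and a group under addition (so we can cancel), and it expands (1 + 1)(x + y) in two different ways:

```latex
\begin{align*}
(1+1)(x+y) &= (1+1)x + (1+1)y = x + x + y + y, \\
(1+1)(x+y) &= 1(x+y) + 1(x+y) = x + y + x + y.
\end{align*}
Hence $x + x + y + y = x + y + x + y$. Cancelling $x$ on the left and
$y$ on the right leaves $x + y = y + x$.
```

Notice that every step only makes sense once you know what the axioms are, which is exactly what the question sent us back to the text to find out.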

Reading Code

OK, so what does this story about my ring theory class have to do with reading code? Well, everything. I think the takeaway from that story is that:

The key to our understanding was the context created by the question in the exercise.

And this is maybe one of the most impactful things I’ve ever learned in my life.

So, when I read code that’s new to me I like to have a question to provide context. Without a question I just read function after function and my understanding of the system as a whole just starts to degrade over time.

When I’m starting out with a new system I get a small bug or task to work on and then read the code in the context of that bug or task. This helps me focus on a few details at a time and how they relate to the bug I’m working on, and in turn to each other. A small bug to fix or task to work on is something that a company might give you as a new hire, but I always treat it as a way to frame learning the system rather than as just an assignment.


Top down / Bottom up

Reading functions is what I would call building a bottom up understanding of a system. Functions are low-level details within the system, and conceivably if you read and understood all of them you could build a complete understanding of the system. However, what I find hard to understand when just reading functions is how they fit together. Sure, you can get an idea of how a few of them fit together if you read them in the right order. But calls between functions can be far apart in the code base, making it harder to build this understanding. Creating a context around a bug builds what feels to me like a container that ties a few functions together and makes it easier to think about the relationships. Pushing this to the extreme is what I think of as top down thinking. That is, putting the relationships first: what are the overarching activities or data flows in a system? These activities and data flows are implemented using the functions in the system, but the functions themselves don’t always reveal what the overarching flows are.

I once started a new job at a company that was building a distributed system. I was hired to do performance work and so I started looking at one of the nodes, let’s call it node A, in the system. Others had told me to take a look at node A because they thought it had high latency and would be a good place to start. I dug into node A with the context of: where does this node do most of its work? I did some profiling and made a few minor improvements. Great, I got something accomplished and I learned a little more about node A.

From there I looked at another node and realized a couple of things:

The individual nodes were relatively easy to read.
There was a common structure, a base class, that all the nodes were built on that gave them a consistent structure and life cycle while providing pub/sub capability.

Understanding this structure helped me to fit other nodes into a mental model so that I could dig into them and understand them more quickly.
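To make the idea concrete, here is a minimal sketch of what such a shared structure might look like. It is not the code from that system. The names `Bus`, `Node`, `setup`, `on_message`, and `Doubler` are all hypothetical, and the real system was distributed rather than in-process.

```python
from collections import defaultdict


class Bus:
    """Hypothetical in-process message bus: topic -> list of handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, message):
        # Copy the list so handlers may subscribe while we iterate.
        for handler in list(self._handlers[topic]):
            handler(message)


class Node:
    """Hypothetical base class: one life cycle and pub/sub for every node."""

    inputs = ()  # topics this node listens to; subclasses override

    def __init__(self, bus):
        self.bus = bus

    def start(self):
        # Same life cycle for all nodes: set up, then subscribe to inputs.
        self.setup()
        for topic in self.inputs:
            self.bus.subscribe(topic, self.on_message)

    def setup(self):
        """Life-cycle hook; subclasses override when they need it."""

    def on_message(self, message):
        raise NotImplementedError

    def publish(self, topic, message):
        self.bus.publish(topic, message)


class Doubler(Node):
    """A concrete node only has to say what it reads and what it does."""

    inputs = ("numbers",)

    def on_message(self, message):
        self.publish("doubled", message * 2)
```

Once you have seen one node built this way, reading the next one mostly comes down to two questions: which topics does it subscribe to, and what does it publish?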

I realized that to really look at the latency of the system as a whole I needed to better understand the system as a whole. Looking at nodes randomly or based on rumor was only going to get me so far.

This is what I consider top down vs. bottom up thinking. Looking at individual nodes in the distributed system is bottom up. It’s great for understanding the details of a system. And in fact, for something like performance, a lot of improvements will come from making changes in the details. That’s where work gets done, in the details.

That said, a top down approach is also very useful and can lead to even more significant improvements in performance. Before I describe how, let me first expand on what I mean by top down. I think of the top down understanding of a system like this as understanding how the parts fit together and how information or work flows through the system. In the case of this distributed system, each node performs a different kind of task and they collaborate to solve a larger problem. How they are put together is critical.

The classical performance strategy for a system like this is to understand the critical path through the graph of nodes. To do this you need to understand the system. First of all, where are the inputs? Where does data come into the system from the external world? This is the “start” of your critical path. Where do the outputs come from (and how do they get delivered to the external world)? This is the end of the critical path. Then the critical path through the nodes in between can be identified and work can be done to improve it.
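As a rough illustration, finding the critical path can be treated as a longest-path search over the node graph. The sketch below assumes the graph is acyclic and that you already have a per-node latency measurement. The node names and numbers are made up, and `critical_path` is a hypothetical helper, not something from the system I worked on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(latency_ms, upstream):
    """Return (total_latency, path) for the slowest input-to-output path.

    latency_ms: node name -> time spent inside that node
    upstream:   node name -> set of nodes whose output it consumes
    """
    best = {}  # node -> (latency of slowest path ending here, predecessor)
    for node in TopologicalSorter(upstream).static_order():
        preds = upstream.get(node, ())
        # Pick the slowest upstream path; input nodes have none.
        prev = max(preds, key=lambda p: best[p][0], default=None)
        start = best[prev][0] if prev is not None else 0
        best[node] = (start + latency_ms[node], prev)

    # The slowest path overall ends at the node with the largest total.
    end = max(best, key=lambda n: best[n][0])
    total = best[end][0]
    path = []
    while end is not None:
        path.append(end)
        end = best[end][1]
    return total, path[::-1]


# Made-up example graph: one input, one output, two routes between them.
example_latency_ms = {"input": 2, "filter": 5, "planner": 20,
                      "logger": 1, "output": 3}
example_upstream = {
    "filter": {"input"},
    "planner": {"filter"},
    "logger": {"input"},
    "output": {"planner", "logger"},
}
total, path = critical_path(example_latency_ms, example_upstream)
# total is 30 and path is ["input", "filter", "planner", "output"]
```

In this toy graph, speeding up `logger` would not change end-to-end latency at all, because it is not on the critical path. That is the kind of conclusion you can only reach with the whole graph in view.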

Another example from the system I worked on was that a lot of time was spent “between” the nodes, or in nodes deciding when to act. Some nodes took input from several other nodes and might have to decide whether to wait or time out when data wasn’t available from a specific input. Acting sooner reduces latency but risks missing newer inputs. We have to ask: which inputs are critical? Are there certain inputs that should be acted on immediately (e.g. for safety reasons)? Are there certain inputs that can’t be dropped (also for safety reasons)? If so, not waiting for other inputs can be the right solution here and can save time. But to decide this we need to understand the system holistically, to know what’s important and what’s critical to safety.
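One way to picture that decision is as a small policy function a node runs each time something changes. This is only a sketch under simplifying assumptions: `should_act` and the input names are hypothetical, and it models just two classes of input, "critical" ones that trigger immediate action and everything else, which is waited for up to a timeout.

```python
def should_act(expected, arrived, critical, waited_ms, timeout_ms):
    """Decide whether a node should act now or keep waiting.

    expected: set of all input names the node normally waits for
    arrived:  set of inputs that have delivered data so far
    critical: inputs that must be acted on immediately (e.g. safety)
    """
    if arrived & critical:
        return True  # a critical input is here: don't wait for the rest
    if arrived >= expected:
        return True  # everything has arrived: nothing left to wait for
    # Otherwise keep waiting, but only until the timeout expires.
    return waited_ms >= timeout_ms


expected_inputs = {"primary", "secondary", "alarm"}
critical_inputs = {"alarm"}
```

With these made-up inputs, an `alarm` message makes the node act right away, while a lone `primary` message makes it wait until the timeout. Deciding which inputs belong in `critical_inputs` is not something the code can tell you; it comes from understanding the system as a whole.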

Earlier, I promised to explain how top down thinking can be used to make significant improvements in performance. Well, in my mind bottom up thinking is great at identifying where time is spent and then allowing you to make those activities faster, thereby improving performance. But working top down is the best way to identify work that doesn’t have to be done at all. By understanding the connections between the nodes we can identify redundant or unimportant work and eliminate or deprioritize it so the rest of the system can focus on what’s important.

While both top down and bottom up understandings are valuable I tend to have a preference for top down understanding. Bottom up understanding is more concerned with details, like individual functions. I find that:

The functions aren’t that important on their own. It’s how they work together that really shows what’s going on.
In a system, there are a lot more details than there are architectures or overarching structures, so keeping the top down view in mind is a lot easier.

The problem I find with focusing on the details is that they generally change more rapidly than the overall system architecture. So I tend to understand functions on demand: a bug or task takes me to a function or set of functions, and I read it, understand it, work with it, and move on. And I often found when working in fast-paced companies that the next time I came back to the function it had changed, or been removed, or was used in a completely new way. So getting intimately familiar with that function didn’t pay off. But understanding larger aspects of the system helped, so that when I had a bug or task to work on I knew which part of the system to look at. The big blocks of functionality don’t change that frequently, so I could come back to them with a broad understanding and come up to speed on the details quickly.

Wrapping Up

Don’t get me wrong, the details are important too.

The details are where the work happens. And the top down architecture or structure is where the decisions about what work to do are made.

I guess what I’m saying is, when diving into a new system: don’t sweat the small stuff.