Leveraging Static Typing to Manage Object State


There has been a lot of talk in the software engineering world about the pains associated with OOP, and specifically with mutable state. Side-effect-free functional programming is often touted as a solution. While the critics make some salient points about mutability, the real issues arise from naive use of implicitly mutable objects. Done properly, mutable state can be a useful tool for reasoning about code, and can guide future developers naturally toward correct code changes. In this post we’ll share some of the headaches that led us to a principle of class design: using types to indicate object state.

Stateful Collaborators

Let’s start with an example. At Rescale we use a REST API to keep track of metadata about a job. Our worker nodes query the API to get information about what virtual hardware to spin up, and our cluster nodes query it to get information about what analysis to run. Both of them authenticate by passing an encrypted token. Our first client interface looked something like this:
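A minimal sketch of an interface in this style, assuming hypothetical names (`Credentials`, `MetadataClient`, `getHardwareSpec`, `getAnalysisSpec`) and simplified `String` return types rather than Rescale's actual code:

```java
/** Hypothetical value object wrapping a job's encrypted token. */
final class Credentials {
    private final String encryptedToken;

    Credentials(String encryptedToken) {
        this.encryptedToken = encryptedToken;
    }

    String encryptedToken() {
        return encryptedToken;
    }
}

/** Every query takes credentials explicitly, so no request can be made without them. */
interface MetadataClient {
    // Used by worker nodes to learn what virtual hardware to spin up (hypothetical method).
    String getHardwareSpec(String jobId, Credentials credentials);

    // Used by cluster nodes to learn what analysis to run (hypothetical method).
    String getAnalysisSpec(String jobId, Credentials credentials);
}
```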

This interface was good for ensuring that we always passed the correct credentials with each request, but it was a little cumbersome. As we refactored old code to use the new API client, we had to create credentials for each method call. Each worker takes a task for a job and then will make many requests using that job’s authentication token, so we constantly had to create identical credentials objects or pass one around. That made it a hassle to rely on the API as heavily as we wanted.

Our next thought was that we could have a worker set credentials on its API client once, when it pulls a task for a job, and then let all of the helper objects use the client with the assumption that it had already gotten its credentials. The interface was then something like:
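A sketch of what such a stateful client might look like. The names, the `MetadataTransport` stand-in for the HTTP layer, and the URL paths are all hypothetical. The post does not say how the real client behaved when credentials were missing, so failing fast with an exception is an assumption:

```java
/** Hypothetical value object wrapping a job's encrypted token. */
final class Credentials {
    private final String encryptedToken;

    Credentials(String encryptedToken) {
        this.encryptedToken = encryptedToken;
    }

    String encryptedToken() {
        return encryptedToken;
    }
}

/** Stand-in for the HTTP layer (hypothetical): takes a path and a token, returns the response body. */
interface MetadataTransport {
    String get(String path, String token);
}

final class StatefulMetadataClient {
    private final MetadataTransport transport;
    private Credentials credentials; // stays null until setCredentials is called

    StatefulMetadataClient(MetadataTransport transport) {
        this.transport = transport;
    }

    /** The implicit state change: nothing in the type records whether this has happened. */
    void setCredentials(Credentials credentials) {
        this.credentials = credentials;
    }

    String getHardwareSpec(String jobId) {
        return transport.get("/jobs/" + jobId + "/hardware", requireToken());
    }

    String getAnalysisSpec(String jobId) {
        return transport.get("/jobs/" + jobId + "/analysis", requireToken());
    }

    private String requireToken() {
        if (credentials == null) {
            // Assumed behaviour: fail at call time, which is the earliest this design can notice.
            throw new IllegalStateException("setCredentials was never called");
        }
        return credentials.encryptedToken();
    }
}
```

The query methods compile and look callable on any instance; whether they actually work depends on something that happened, or did not happen, elsewhere in the program.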

Most calling code could now just call the query methods without worrying about credentials. This accomplished the goal of making the API client methods much easier to use, but whenever we introduced the client into a new area of the codebase we would forget to set credentials. It also led us to write code that made implicit assumptions about the context in which it ran, and that was hence less reusable and more fragile. Changes to one part of the system had the potential to break other parts, which is one of the principal things to avoid in software systems.

The first implementation is what I’ll call a single-state collaborator. It is just a bag of methods, and developers can easily reason about its behavior when they have an instance. The second does not make things so apparent because it has an implicit state change. If a developer gets a hold of an instance, they see that they can call query methods and those methods will probably work. Understanding the authentication state requires more knowledge of the system.

There’s a better design that leverages static typing to make the authentication state immediately apparent to future developers. We can go back to our first client interface and require credentials on every method call, but also provide a wrapper class whose type indicates its state:
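A sketch of that wrapper, under the same caveats: `AuthenticatedMetadataClient` is the name used in this post, while `Credentials`, `MetadataClient`, and the method names are hypothetical placeholders for the real interface:

```java
import java.util.Objects;

/** Hypothetical value object wrapping a job's encrypted token. */
final class Credentials {
    private final String encryptedToken;

    Credentials(String encryptedToken) {
        this.encryptedToken = encryptedToken;
    }

    String encryptedToken() {
        return encryptedToken;
    }
}

/** The single-state client: credentials are required on every call. */
interface MetadataClient {
    String getHardwareSpec(String jobId, Credentials credentials);

    String getAnalysisSpec(String jobId, Credentials credentials);
}

/** Holding an instance of this type is proof that credentials were supplied. */
final class AuthenticatedMetadataClient {
    private final MetadataClient client;
    private final Credentials credentials; // final: authentication cannot be lost or swapped

    AuthenticatedMetadataClient(MetadataClient client, Credentials credentials) {
        this.client = Objects.requireNonNull(client, "client");
        this.credentials = Objects.requireNonNull(credentials, "credentials");
    }

    String getHardwareSpec(String jobId) {
        return client.getHardwareSpec(jobId, credentials);
    }

    String getAnalysisSpec(String jobId) {
        return client.getAnalysisSpec(jobId, credentials);
    }
}
```

Because both fields are final and there is no setter, the wrapper has exactly one state, and the compiler enforces what the stateful client could only check at runtime.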

Now if a developer works on a class that has an instance of AuthenticatedMetadataClient as a collaborator, they know for sure that it has authentication and that it will not lose it. If we write new classes that take an instance of AuthenticatedMetadataClient in their constructors, those classes can only be used when authentication has already been provided. Future developers will see from the class what they can do with the client objects, and their IDEs will suggest appropriate methods. They won’t need to keep as much information about the whole system in their heads in order to reason about parts of it. Those are powerful tools for working in the codebase.

Accumulating State in Memory

That was fine and good for an API client, but that class didn’t really need to change state because the real state is held in the API. What about when we want to accumulate state changes in memory before persisting them? Let’s take another example from Rescale’s codebase. We use optimization software that runs an analysis with varying values for initial parameters and selects an “optimal” result. We represent that workflow with a class, say, CaseWorkflow, that will hold the values of the initial parameters for the optimal run once it has been determined. We want to persist those values once everything is completed.

So we initially had some very imperative-looking code that performed all the initialization and cleanup actions in a single method:

We decided to refactor this using lifecycle listeners to separate responsibilities and make the code easier to understand and unit test. We wrote an interface like this:
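A sketch of a listener interface of this kind. The names `WorkflowLifecycleListener`, `onStart`, and `onComplete` are assumptions, as is the choice of parameterless hooks (each listener is taken to be bound to its workflow when it is created). `RecordingListener` is an illustrative implementation, not part of the original code:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical lifecycle hooks for one workflow run. */
interface WorkflowLifecycleListener {
    /** Called once before the optimization run begins. */
    void onStart();

    /** Called once after the run has finished. */
    void onComplete();
}

/** Example listener that only records which hooks fired, which is handy in unit tests. */
final class RecordingListener implements WorkflowLifecycleListener {
    private final List<String> events = new ArrayList<>();

    @Override
    public void onStart() {
        events.add("start");
    }

    @Override
    public void onComplete() {
        events.add("complete");
    }

    List<String> events() {
        return events;
    }
}
```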

And refactored the original method to use these listeners:
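A sketch of how the refactored method might look. `WorkflowRunner`, `ListenerFactory`, and the `Runnable` standing in for the optimization itself are hypothetical, and `CaseWorkflow` is reduced to an empty placeholder:

```java
import java.util.List;

/** Placeholder for the workflow class; its real members are not shown in this post. */
final class CaseWorkflow {
}

/** Hypothetical lifecycle hooks for one workflow run. */
interface WorkflowLifecycleListener {
    void onStart();

    void onComplete();
}

/** Hypothetical factory: chooses the listeners appropriate to a given workflow. */
interface ListenerFactory {
    List<WorkflowLifecycleListener> createListeners(CaseWorkflow workflow);
}

final class WorkflowRunner {
    private final ListenerFactory listenerFactory;

    WorkflowRunner(ListenerFactory listenerFactory) {
        this.listenerFactory = listenerFactory;
    }

    void run(CaseWorkflow workflow, Runnable optimization) {
        // Listeners are created here, in scope, before the optimization has produced any results.
        List<WorkflowLifecycleListener> listeners = listenerFactory.createListeners(workflow);
        for (WorkflowLifecycleListener listener : listeners) {
            listener.onStart();
        }
        optimization.run();
        for (WorkflowLifecycleListener listener : listeners) {
            listener.onComplete();
        }
    }
}
```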

We used a factory object because we wanted a different set of listeners for different types of workflows, but that’s not relevant here. What is relevant is that each listener object was created in scope, and tied to a single workflow. That’s why the following listener made sense at the time:
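A hedged reconstruction of a listener in this style. Only `CaseWorkflow` and `getOptimalParameters` come from this post; `PersistOptimalParametersListener`, `ParameterStore`, the setter, and the `Map<String, Double>` representation of the parameters are assumptions:

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified workflow with the usual getter and setter pair. */
final class CaseWorkflow {
    private Map<String, Double> optimalParameters = new HashMap<>(); // empty until the run completes

    Map<String, Double> getOptimalParameters() {
        return optimalParameters;
    }

    void setOptimalParameters(Map<String, Double> optimalParameters) {
        this.optimalParameters = optimalParameters;
    }
}

/** Hypothetical lifecycle hooks for one workflow run. */
interface WorkflowLifecycleListener {
    void onStart();

    void onComplete();
}

/** Hypothetical persistence collaborator. */
interface ParameterStore {
    void save(Map<String, Double> parameters);
}

final class PersistOptimalParametersListener implements WorkflowLifecycleListener {
    private final Map<String, Double> optimalParameters;
    private final ParameterStore store;

    PersistOptimalParametersListener(CaseWorkflow workflow, ParameterStore store) {
        // The parameters are read when the listener is constructed.
        this.optimalParameters = workflow.getOptimalParameters();
        this.store = store;
    }

    @Override
    public void onStart() {
        // Nothing to do before the run.
    }

    @Override
    public void onComplete() {
        // Persists whatever was captured in the constructor.
        store.save(optimalParameters);
    }
}
```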

After all this explanation, the mistake seems obvious: the optimal parameters are not yet set on the workflow when this listener is instantiated. At that point they are just an empty collection, but in the midst of refactoring that is easy to forget. Keeping it in mind requires a lot of context about the entire optimization system. We wondered: could we use finer-grained types to prevent this mistake and communicate workflow state to future developers? Yes.

A key issue here was the use of getter and setter methods that is common in Java. A class that has a getOptimalParameters method doesn’t tell the developer when that method can be appropriately called. That class uses implicit state changes like our API client that allowed credentials to be set on itself. Instead, we should write the objects so that they don’t have those methods at all if they’re not appropriate to call:
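One way this could look, as a sketch. `CompletedWorkflow` and `getOptimalParameters` are names from this post; `RunningWorkflow`, the `complete` method, the `id` field, and the `Map<String, Double>` parameter type are assumptions:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** A workflow that is still running: it deliberately has no getOptimalParameters(). */
final class RunningWorkflow {
    private final String id;

    RunningWorkflow(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }

    /** The state transition is explicit: supplying the parameters yields a new type. */
    CompletedWorkflow complete(Map<String, Double> optimalParameters) {
        return new CompletedWorkflow(id, optimalParameters);
    }
}

/** A workflow whose optimal run is known; the getter exists only on this type. */
final class CompletedWorkflow {
    private final String id;
    private final Map<String, Double> optimalParameters;

    CompletedWorkflow(String id, Map<String, Double> optimalParameters) {
        this.id = id;
        // Defensive copy wrapped read-only, so a completed workflow cannot change afterwards.
        this.optimalParameters = Collections.unmodifiableMap(new LinkedHashMap<>(optimalParameters));
    }

    String getId() {
        return id;
    }

    Map<String, Double> getOptimalParameters() {
        return optimalParameters;
    }
}
```

A listener that persists parameters would then take a `CompletedWorkflow`, and trying to construct it before the run has finished becomes a compile error rather than a silently empty collection.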

Like the AuthenticatedMetadataClient in the first example, we can now write methods that operate on a CompletedWorkflow and be sure about its state. We don’t have to remember all the ins and outs of what gets set when, because the methods available on the class tell us.

Summary

The common factor in these examples was leveraging Java’s type system as a tool for documenting the possible states of objects. With the help of IDE method suggestion, reasoning about objects with informative types is natural and smooth. The types also reduce the context required to correctly understand object behaviour, which is a boon for productivity.

This article was written by Alex Kudlick.