When software engineers look at a piece of code, the first question they ask themselves is “what is it doing?” The next question is “why is it doing it?” In fact, there is a deep connection between these two – why becomes what at the next level of abstraction. That is, what and why are mirrors of each other across an abstraction boundary. Understanding this can help engineers write more maintainable, readable software.

Sandi Metz recently wrote an article proclaiming that duplication is cheaper than the wrong abstraction. This article raises valuable points about the costs of speculative generalization, but it’s part of a long line of articles detailing and railing against those costs. By now it should be old hat to hear someone criticize abstraction, and yet the meme persists.

The aspect of Sandi Metz’s article that I’d like to respond to in this post is the mindset it promotes, or at least the mindset of those who have responded to it most enthusiastically. You see it constantly in comment threads: just get the task done, nothing more. Sometimes that is the right approach, but the problem is that the costs of abstraction, especially abstraction gotten wrong, are obvious, while the costs of the “simplicity first” mindset are not. I won’t talk about the specific costs of duplicated code, as those are already well known. I will talk about the opportunity costs – the missed learning opportunities.

Good developers should be constantly learning, constantly honing their skills. There’s always room to improve. The skill that’s most important for developers to practice is recognizing profitable abstractions, because doing so correctly relies on honed intuition. It takes seeing costs manifest over the long term, and it takes making mistakes. Developers should be constantly evaluating their past decisions and taking risks on new ones.

Opportunity cost is an often overlooked aspect of technical debt. The reason accumulating technical debt is the cheaper choice in the moment is that it takes a path for which the solution is already known. There’s nothing to learn, just implement the hack. That’s fine in small doses, but it forgoes the opportunity to learn things about the codebase, to discover missing abstractions and create conceptual tools that can help solve the problem.

So what the developers in Sandi Metz’s example should have done is notice that this particular abstraction was costing them more than it was benefiting them. That’s a good thing to notice – it’s a valuable learning experience. What specific aspects of the abstraction were slowing down development? Which parts confused new developers and led them to make it worse? These are questions the developers should have asked themselves in order to learn from the experience.

Our development team has a weekly practice that we call “Tech Talks,” in which a developer talks about something they learned that week, some part of the codebase that was thornier than it should have been, and so on. This practice is invaluable for promoting a growth mindset, and the situation from Ms. Metz’s article would have been a perfect example to bring up.

Developers shouldn’t focus on just cranking out code. Those who limit their attention in such a way aren’t growing and will soon be surpassed by better tools. Instead, we should recognize that the job of a developer is to understand which abstractions will prove valuable for the codebase. The only way to learn that is through experience.

I grew up playing Magic: the Gathering.  As a kid I noticed something interesting about the card names – there were no generic names.  There were no cards named “Zombie” or “Elf” or “Wizard”.  There were cards named “Fugitive Wizard”, “Llanowar Elves”, “Gravebane Zombie”, and even “Storm Crow” but no “Crow”.  Modern card names are even more specific and evocative; witness “Crow of Dark Tidings” and “Flameheart Werewolf”.  Why?  Because the designers need to leave space open for new cards.  If there were a card named “Zombie”, that’s it.  That card shows what a zombie is.  If you want to make another zombie card, it will live in the shadow of the original “Zombie”.

This has applications in software engineering.  The names we choose for classes frame how we’ll think about them, and what sort of responsibilities we’ll assign to them.  If you have a class named User, then it makes sense to put things related to the concept of “a user” on that class.  That’s a problem, though: it makes sense to put anything related to that concept on the User class.  You’ll end up with login information, billing preferences, email settings, and permissions.  It has long been known that large classes are a problem.  They’re harder to read because more logic lives in one place, harder to change because that logic is more likely to be entangled, and both of those make them more likely to harbor bugs.
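
To make that concrete, here is a hypothetical sketch in Python; the fields and methods are invented for illustration, not taken from any particular codebase:

```python
# A deliberately bloated User class. Every field and method "makes sense"
# on a class named User, which is exactly the problem.
class User:
    def __init__(self, email, password_hash, plan, card_token,
                 wants_newsletter, is_admin):
        self.email = email                        # login information
        self.password_hash = password_hash        # login information
        self.plan = plan                          # billing preferences
        self.card_token = card_token              # billing preferences
        self.wants_newsletter = wants_newsletter  # email settings
        self.is_admin = is_admin                  # permissions

    def check_password(self, candidate): ...
    def charge_monthly_fee(self): ...
    def send_newsletter(self): ...
    def can_edit(self, resource): ...
```

Each cluster of fields is really a class that was never given a name of its own.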

Class names set the stage for the logic the class develops over time.  We have to remember that we don’t just write code once and then it’s done.  Code is continually evolving, continually being changed to meet new needs.  At each stage, developers will ask themselves, “where does this logic make sense?”   As Stephen Wolfram noted, “the names of functions … directly determine how people will think about a function”.  Developers will look to class names as one sign for where logic belongs.  They’ll look to the concept embodied by the class as another.  If the name and concept are broad, developers will put lots of pieces of logic in the class.

Another point to make about generic class names is that they aren’t descriptive.  If you open up a class named User, you don’t have an immediate idea of the data it might contain.  There are a lot of things it might contain.  You’ll have to read over it to find out, and remember for next time.  That imposes a lot of cognitive burden on working developers: they have to keep the details of large classes in mind, or else reread them to be sure.  If the name were something like LoginCredentials, then it’s pretty obvious what it will contain.  The name guides the reader by bounding the role of the class.

We should want our code to provide a rich set of clues to make itself understood.  Names are an important piece of the puzzle.  Taking a cue from Magic: the Gathering, if we try to give User a more evocative name, we’ll quickly realize we need to break it up.  We’ll probably end up with smaller, more focused classes, which is also a boon.
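
As a rough sketch of how that renaming exercise might end up (the class names here are hypothetical):

```python
# One possible breakup of the bloated User class above. Each name now
# bounds what a reader should expect to find inside the class.
class LoginCredentials:
    def __init__(self, email, password_hash):
        self.email = email
        self.password_hash = password_hash

    def check_password(self, candidate): ...


class BillingPreferences:
    def __init__(self, plan, card_token):
        self.plan = plan
        self.card_token = card_token


class EmailSettings:
    def __init__(self, wants_newsletter=True):
        self.wants_newsletter = wants_newsletter


class Permissions:
    def __init__(self, is_admin=False):
        self.is_admin = is_admin
```

Opening LoginCredentials, a reader already knows roughly what they’ll find; the name has done most of the explaining.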

Every mature MVC application has that one model that’s grown out of control.  The table has 20 columns; there are user preferences stored in it alongside system information; it sends email notifications and writes other models to the database.  The model encompasses so much application logic that any new feature is likely to have to go through it.  It’s the class that makes developers groan whenever they open it up.

How did we get here?

Web applications usually start out with a single purpose, to display a single type of data – maybe it’s Article for an online journal or ClothingItem for a fashion retailer.  It’s common for MVC practitioners to take concepts from their product whiteboarding sessions and directly translate them into database models.  So we start out with a database model that represents a central concept in the application, and as more business requirements emerge, the cheapest way to accommodate them is to add columns to the existing model.  Carry this out over time and you end up with a God Object.
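
A hypothetical sketch of that accretion, with invented fields standing in for the business requirements that got bolted on over time:

```python
from dataclasses import dataclass

# A model that began as "just an article". Each comment marks a later
# business requirement that was cheapest to satisfy with one more column.
@dataclass
class Article:
    title: str
    body: str
    author_name: str                  # added for bylines
    is_sponsored: bool = False        # added for the ads team
    paywall_tier: int = 0             # added for subscriptions
    seo_keywords: str = ""            # added for marketing
    notify_subscribers: bool = True   # added for email notifications
    layout_variant: str = "default"   # added for an A/B test
```

None of these additions is unreasonable on its own; the God Object is the sum of locally cheap decisions.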

The problem from the start was taking the casual, loose concepts sketched out at the product level and putting them in the codebase.  Software engineers should be well aware that the concepts people use in everyday life and thinking are terribly imprecise and loaded with implicit assumptions.  Highlighting implicit assumptions is often a software engineer’s key contribution, so it’s a wonder we take these concepts from the product level and embed them in our code.  Doing so all but guarantees that hidden edge cases will surface and that the clarifying logic will get shoved into the existing class.

The concept of Folk Psychology is illuminating here.  Folk Psychology refers to the innate, loosely specified theories that people have about how other humans operate that they use to infer motivations and predict behavior.  These “folk” theories work well enough in the context of everyday human life, but are not scientifically rigorous and contain blindspots.  Similarly, people make use of “folk object models” in software businesses.  These are the informal concepts people construct to discuss software with other humans – the words product managers use with software engineers, the boxes drawn on the whiteboard.  They work well enough when discussing concepts with other humans, who can be generous in their interpretations, but can fall apart when formalized as code.  These concepts are a useful starting point to frame the product features, but from an OO perspective are too broad to be used as classes.  They tend to accumulate logic since they implicitly encompass so much of the problem domain.

Much as the first obstacle people have to overcome when learning to code is to take their thoughts and explicitly formulate them as steps in an algorithm, experienced software engineers need to take folk object models and break them down into explicit components that can be used as classes.  In the product domain, we may start with a broad “User” concept, but as we dig deeper we’ll discover different pieces of logic that would be better served as separate classes – a billing preference, a current status, or notification settings.  Each of those will require its own logic to meet product requirements, and if we don’t separate them out to make space for that logic, we’ll incur bloat.
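
A minimal sketch of that decomposition (the components and their methods are invented for illustration), with each piece owning the logic for its own requirement:

```python
import datetime

# Hypothetical decomposition of a folk "User" concept into explicit parts.
# Each component has room to grow its own logic without bloating the rest.
class BillingPreference:
    def __init__(self, plan, billing_day):
        self.plan = plan
        self.billing_day = billing_day

    def is_due(self, today: datetime.date) -> bool:
        return today.day == self.billing_day


class AccountStatus:
    def __init__(self, active=True, suspended_reason=None):
        self.active = active
        self.suspended_reason = suspended_reason

    def suspend(self, reason):
        self.active = False
        self.suspended_reason = reason


class NotificationSettings:
    def __init__(self, email_enabled=True, digest_frequency="weekly"):
        self.email_enabled = email_enabled
        self.digest_frequency = digest_frequency

    def should_email(self) -> bool:
        return self.email_enabled
```

Each class now has a natural home for the next requirement in its area, instead of everything landing on one model.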

People often think that data modeling is about encoding the business concepts in software, but really it’s about using model classes as tools to construct a system.  Often codebases are better served when large models are broken into components that each address a specific piece of domain logic.

This article was written by Alex Kudlick.