
Good Code Depends on Good Names

 5 years ago
source link: https://www.tuicool.com/articles/hit/zaEzmuF

I remember the first time I read the saying, “There are only two hard things in Computer Science: cache invalidation and naming things.” It really got me thinking about cache invalidation. For some reason, the “naming things” part didn’t really click until many years later.

Let me explain.

One of my coworkers once asked me to help him fix a bug with paging. There was a settings screen showing what appeared to be a random number of items per page. The first page showed ten, the second page may have shown three, the third page may have shown seven, and so on and so forth. We decided to find the code responsible for this travesty.

After fifteen minutes of searching through code and stepping through debug breakpoints, we finally found the offending snippet. It looked something like this:

for (zipCode = 0; zipCode < total; zipCode++)
{
  // ...
}

You did not misread that. Somehow the paging logic involved a zip code. I don’t remember the exact details of what went wrong, but I do remember my brain shattering into a million pieces and dispersing throughout the galaxy. Setting aside the problem of storing zip codes in an integer variable (leading zeros matter in zip codes), the naming had obviously gone wrong, and this code made that painfully clear to me.

While this example is perhaps extreme, it does illustrate the point: names matter. In this case, a variable name misrepresented its use. Of the many ways naming can go wrong, this is probably uncommon, but it does happen.
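As a rough sketch of what honest names could look like here, assume the loop was meant to walk the items shown on one page (the original code’s real purpose is not recorded above). The names getPageItems, pageIndex, itemsPerPage, and itemIndex are all hypothetical:

```typescript
// Hypothetical reconstruction: the purpose of the original loop is assumed, not known.
function getPageItems<T>(allItems: T[], pageIndex: number, itemsPerPage: number): T[] {
  const firstItemIndex = pageIndex * itemsPerPage;
  const pageItems: T[] = [];

  // The counter is named for what it indexes, not for an unrelated concept.
  for (
    let itemIndex = firstItemIndex;
    itemIndex < allItems.length && pageItems.length < itemsPerPage;
    itemIndex++
  ) {
    pageItems.push(allItems[itemIndex]);
  }

  return pageItems;
}

// Usage: seven settings, three per page, zero-based page index.
const settings = ['a', 'b', 'c', 'd', 'e', 'f', 'g'];
const secondPage = getPageItems(settings, 1, 3);
```

With names like these, a reader who lands in the middle of the loop can tell what is being counted without stepping through a debugger.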

When I first began seriously studying software design, I spent a lot of time learning design patterns. This eventually led to learning the SOLID design principles, and then it led into various aspects of functional design. That continued for many years. It wasn’t until much later that I took a step back and realized a fundamental oversight I had made: code exists for people to read.

It was easy and comfortable to think about why a certain design made a program more extensible or less extensible. When that conversation turned to the more subjective area of whether or not a developer could actually understand what was going on, it became much less appealing. After all, that’s subjective. If the developer doesn’t understand this brilliant design, then they should go study design books – that’s not my problem.

News flash: yes, it is your problem.

In all honesty, I think most developers care quite a bit about readability. I’m trying to make the point that readability is more important than design. In most cases, those go hand in hand, because good design tends to produce readable code. When a design doesn’t produce readable code, I’d wager it’s a bad design.

If other developers can’t read your code, then they can’t extend it, and they’ll probably circumvent your design. When they do this, your design no longer has a tangible benefit, and it has quite a few tangible downsides. The onus is on you to make your design as clear as possible. You can’t do that without thinking deeply about names.

Be Specific

We’ll start with the obvious: be specific when choosing names. Naming well often comes down to specificity. It’s easy to pick names that encompass anything: item, object, number, list. Unless you’re writing functions that operate on generic objects, these names most likely do not contain enough specificity to help a developer understand what’s happening.

Let’s look at an example where a variable name lacks specificity:

const items = [ Person.create('David'), Person.create('Michael'), Person.create('Laura') ];

// ...

for (const item of items) {
  item.occupation = selectRandom(OCCUPATIONS);
}

It’s pretty clear from the first part of this example that we’re creating a list of people. Now let’s fast forward to the second part of the example. If you haven’t seen the definition of items, which may have happened in another file, you will likely have no idea what’s in this list. Each element is clearly an item, but that doesn’t tell us much. A developer who encounters this code must search through the context to find what this list contains.

In a sense, developers always infer definition from context, so that’s not inherently bad. The problem here is that unless the code gives the reader a hint, the size of the context the reader must understand is the entire project. It’s the developer’s responsibility to limit the context required to understand a given variable or section of code. Using a name like “item” doesn’t do this at all, since the name implies almost nothing about what it holds. This name lacks specificity.

It’s not as bad as this:

const x = [ Item.create('David'), Item.create('Michael'), Item.create('Laura') ];

But it’s still not great.
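For comparison, here is a minimal sketch of the same loop with specific names. The Person type, createPerson, selectRandom, and the contents of OCCUPATIONS are hypothetical stand-ins, since the original helpers are not shown:

```typescript
// Hypothetical stand-ins for the helpers used in the snippets above.
type Person = { name: string; occupation?: string };

const OCCUPATIONS = ['carpenter', 'nurse', 'pilot'] as const;

function createPerson(name: string): Person {
  return { name };
}

// Returns one element of a non-empty list, chosen uniformly at random.
function selectRandom<T>(choices: readonly T[]): T {
  return choices[Math.floor(Math.random() * choices.length)];
}

// The collection is named for what it holds...
const people: Person[] = [
  createPerson('David'),
  createPerson('Michael'),
  createPerson('Laura'),
];

// ...so the loop reads correctly even if the definition lives in another file.
for (const person of people) {
  person.occupation = selectRandom(OCCUPATIONS);
}
```

A reader who sees only the loop now knows that each element is a person, which shrinks the context they need from the whole project to a single line.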

Don’t Lie: Lists are Not Numbers

While we’re still on the topic of lists, take a look at this one:

const rabbits = animals.filter(isRabbit).length;

for (let i = 0; i < rabbits; i++) {
  // ...
}

I’ve seen this quite a few times, and it’s extremely deceptive. If a developer sees this variable used later in a function, they might assume it contains a list of rabbits. There’s a good chance they wouldn’t even check if they see this variable passed into a function. This is a personal pet peeve of mine.
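One way to keep the name honest is to give the list and the count separate names. This is only a sketch: the Animal type, the isRabbit predicate, and the sample data are assumptions, because the original does not define them:

```typescript
// Hypothetical shape and predicate; the original leaves both undefined.
type Animal = { type: string };

const isRabbit = (animal: Animal): boolean => animal.type === 'rabbit';

const animals: Animal[] = [{ type: 'rabbit' }, { type: 'cat' }, { type: 'rabbit' }];

// A plural noun holds a list...
const rabbits = animals.filter(isRabbit);
// ...and a "count" suffix holds a number.
const rabbitCount = rabbits.length;

for (let i = 0; i < rabbitCount; i++) {
  console.log(`rabbit ${i + 1} of ${rabbitCount}`);
}
```

Passing rabbits or rabbitCount into another function now tells the reader which kind of value is being handed over.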

Don’t Lie: Objects are Not Booleans

Objects are not booleans.

const cat = animal.type === 'cat';

if (cat) {
  console.log('meow');
}

Again, the word “cat” indicates that the object contains a cat. It does not contain a cat. It contains the answer to the question, “is this a cat?” Given the actual value contained by this variable, it should perhaps have a more accurate name like:

const isCat = animal.type === 'cat';

Many people suggest prefixing all booleans with has, is, should, can, and so on. It’s a good rule.

Don’t Lie: Do What You Say You’re Going to Do

So far we’ve discussed names that miss the bullseye. Now let’s talk about names that mislead:

class Person {
  string name;

  string getName() {
    this.name = this.name.toLower();

    return this.name;
  }
}

The function defined on this class claims to get the name of a person, but it actually sets the name and then gets it! Holy side effects, Batman. Never do this. When you name a function, the function should do no more and no less than the function name indicates. Anything else is straight up lying, and creates an impossibly difficult problem for developers reading the code. Once you lose trust in the vocabulary, it becomes incredibly slow to decipher what’s going on.

Instead, the function should have been named something like:

string lowercaseNameAndThenGetIt() {
  // ...
}

While that name brings sadness into my life, it’s technically better than its predecessor.
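A less painful option may be to split the work so that each name describes exactly one thing. The following is a sketch, not the original class: getLowercaseName and lowercaseName are hypothetical method names chosen for illustration:

```typescript
class Person {
  private name: string;

  constructor(name: string) {
    this.name = name;
  }

  // Reads the stored name and changes nothing.
  getName(): string {
    return this.name;
  }

  // Returns a lowercased copy; the stored name is left untouched.
  getLowercaseName(): string {
    return this.name.toLowerCase();
  }

  // Mutates the stored name, and the verb in the method name says so.
  lowercaseName(): void {
    this.name = this.name.toLowerCase();
  }
}

const david = new Person('David');
console.log(david.getLowercaseName()); // a lowercased copy
console.log(david.getName());          // the stored name is unchanged
```

Callers who only need the lowercase text never trigger a mutation, and callers who do want to change the stored name have to say so explicitly.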

This may remind you of the single responsibility principle. I think of that principle as a side effect of this one. If you say what you’re going to do, and you do nothing more and nothing less, then most likely you’re going to write functions and classes that have single responsibilities. Honesty and brevity make it easy to read your code, so the single responsibility principle naturally follows. You’ll end up following it without knowing if you focus on proper naming.

Care about Semantics

When you speak to other people, you attempt to communicate new concepts by relying on the shared experiences associated with words. For instance, if you say something like, “I have a pet zebra,” you assume that the person listening understands the concept of self, of what it means to have something, of what it means to live with an animal, and of what a zebra looks and acts like. Most likely you have specific memories of each of these words, and most likely the person you’re speaking with has different memories. You assume the listener’s experiences overlap with yours, and that the sequence of and relationships between these memories communicate what you intend.

When you write code, you frequently rely on a word’s shared understanding while simultaneously redefining it. The phrase “I have a pet zebra” might mean the following in code:

type Animal = 'Zebra' | 'Dog' | 'Cat';
type Pet = {
  name: string,
  animal: Animal,
};
type Person = {
  name: string,
  pets: Array<Pet>,
};

// ...

const me: Person = {
  name: 'Stephen',
  pets: [],
};
const zebra: Animal = 'Zebra';

me.pets.push({
  name: 'Harold',
  animal: zebra,
});

When a new developer reads this, they already know what a person is, and they already know what a zebra is. They don’t yet know your specific definitions of zebras and people, but they suspect those definitions contain subsets of the attributes they know zebras and people have. Meeting those expectations is hard. It requires substantial consideration.

Let’s look at a very simple example of where this might go wrong:

class Notebook {
  public string name;
  public List<Page> pages;
}

// ...

Notebook myNotebook = new Notebook();

When a developer writes the first part of this code sample, they define the word Notebook to mean: an object with a name and a list of pages. This is somewhat at odds with Merriam-Webster, which defines a notebook as “a book for notes or memoranda,” or “a particularly small or light laptop.” We have introduced a slight disparity in our definitions.

When another developer encounters the second part of this code sample without having seen the first, they most likely associate the word Notebook closer to the dictionary definition than to the written definition. Now most likely the developer will make sure of this by inspecting the class definition, but that’s not always the case when scanning through code. If they don’t do this, they might miss the name property, which might be a crucial piece of information in other parts of the code. This simple conceptual mismatch illustrates one of the insidious problems introduced by naming.

Perhaps the developer ought to have called the class NamedNotebook. Maybe not. Given that some people may name their notebooks, perhaps the original name was adequate. Either way, the discrepancy between the English definition and the code definition is worth considering.

Think Outside the Code

Proper naming doesn’t just help developers understand what’s going on: it also helps developers speak with nontechnical stakeholders. When the names used in code share semantic meaning with the actual definitions, it allows developers and nontechnical stakeholders to use them in conversation. This helps considerably during requirements gathering, domain modeling, and design.

It’s almost impossible for names chosen by developers to live only in the code. They often seep out into product documentation, onto whiteboards, and God forbid into marketing materials. When the definitions used in code make sense, this isn’t a bad thing: it’s actually helpful. It limits misunderstanding and miscommunication.

Eric Evans discusses this exact concept in much more detail in his book, Domain-Driven Design. I highly recommend it. It’s one of the most influential books pertaining to software development that I’ve read. This is just one bit of it, and the rest is even better.

I’ve found this concept so important that I’ve incorporated it into all discovery and requirements gathering meetings with clients. From the first day, we create a glossary, and we review each term to make sure the client agrees on its meaning. Every written task uses these definitions, and every class in the code tries to mimic its respective definition as closely as possible. This process considerably reduces miscommunication and helps with the design process.

Outro

It’s really difficult to communicate the mindset needed to name properly, and it’s really easy to list a bunch of common mistakes. The mindset is more important. When writing code, you have to take yourself out of the moment, out of the nitty gritty details, and try to look at your code with the eyes of a developer who has never seen it before. Try to do this with every function, every variable, every line. Do it before you code, during your coding, and after you code. That’s the best way I’ve found to improve names.

English teachers often advise students to write to a specific audience. The same principle applies to code, and the specific audience is developers who haven’t read the code yet. Adopting this mentality will improve your naming and design over time.

Let me know what you think, and happy coding!

