Another Side of Good Naming

By | August 21, 2018

In previous posts, I’ve talked about how good naming makes life easier for whoever maintains the code. One thing I’ve left out is the importance of not misusing standard names.

Naming Fail

In a previous position, I was manipulating a collection of data when I hit a bug. After running some new code, the collection was a couple of items shorter than it had been. After quite a bit of troubleshooting, I tracked the problem to a method on the collection object named sort. I was somewhat surprised to find that this method removed items from the container while sorting. In fact, the result was so surprising that I didn’t even check for it during most of the troubleshooting. What I discovered was that sort didn’t just sort the collection; it also removed duplicates.

Unfortunately, this violates the Principle of Least Surprise. Every (other) sort algorithm I have ever seen has taken a list or array and produced a result with the following properties:

  • Contains only elements from the original
  • Contains all elements from the original, duplicates included
  • Is ordered based on the supplied comparison
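These properties can be written down as a small check. The following Python is only a sketch: the helper name is_valid_sort is made up for illustration, and it assumes the elements are hashable and can be compared with <=.

```python
from collections import Counter


def is_valid_sort(original, result, key=lambda x: x):
    """Return True if `result` is a plain sort of `original`.

    Hypothetical helper for illustration; assumes hashable elements.
    """
    # Same elements with the same counts: nothing added, nothing dropped.
    same_elements = Counter(original) == Counter(result)
    # Every adjacent pair is in non-decreasing order under `key`.
    ordered = all(key(a) <= key(b) for a, b in zip(result, result[1:]))
    return same_elements and ordered


data = [3, 1, 2, 3, 1]
plain = sorted(data)         # keeps both 1s and both 3s
deduped = sorted(set(data))  # ordered, but two items shorter

print(is_valid_sort(data, plain))    # True
print(is_valid_sort(data, deduped))  # False
```

A sort that also removes duplicates passes the ordering check and still fails the second property, which is the kind of bug described above.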

Since generating a unique sorted list is useful, many libraries have some form of uniq method. In a few cases, you might find a combined method or an option that can be supplied to the sort function. (An example would be the -u option for the Unix sort utility.)

Better Name

In the example, the method did exactly what it was designed to do. The original programmer even argued that we were never going to want the container to have duplicates in this system, so a sort that didn’t remove duplicates would be useless. (I argued in turn that if there should never be duplicates, we should have avoided inserting them into the container in the first place.) Although he had a point, the name is still wrong. The method should have been called sort_and_uniq() or unique_sort() or even sort_u(). Any of these names would have served as a warning to any maintainer that the method does more than just sort.
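As a rough illustration of the difference, here is a Python sketch. The ItemCollection class is hypothetical and is not the container from the original system; it assumes hashable, comparable items and only shows how the two behaviors might sit side by side under honest names.

```python
class ItemCollection:
    """Hypothetical container, used only to illustrate the naming point."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def items(self):
        # Return a copy so callers cannot change the internal list.
        return list(self._items)

    def sort(self):
        """Order the items in place. Every item is kept, duplicates included."""
        self._items.sort()

    def unique_sort(self):
        """Order the items in place and drop duplicates, as the name warns."""
        self._items = sorted(set(self._items))


collection = ItemCollection([3, 1, 3, 2])

collection.sort()
print(collection.items())  # [1, 2, 3, 3]

collection.unique_sort()
print(collection.items())  # [1, 2, 3]
```

With this split, a maintainer who calls sort() gets the standard behavior, and anyone who sees unique_sort() in the code knows the collection may come back shorter.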

Calling that method sort was a bad choice, because the name violates the expectations of the maintenance programmers.
