How to Shatter Magic Numbers in Software Testing


Bread is made by mixing ingredients in more or less fixed proportions. There is a tolerance around these proportions, defined by quality standards. This tolerance ensures that the quality of the mix stays the same regardless of the environmental conditions. Add too much flour to the mix, and customers will curse the brand. Add too little flour, and the bread will come out of the oven as a piece of carbon on the outside, destined for the trash can.

In a mill, some software engineers had to automate the administration and dosage of ingredients. The engineers faced natural resistance, as employees of the mill kept sabotaging requirements engineering. How did the software engineering team overcome this hurdle?

Automated Testing

In my software testing series, you can learn how to set up your software testing environment using Mocha, Chai, and SinonJs. Although more testing frameworks have emerged since I started writing about these topics, Mocha and Chai are still the leaders in automated unit and integration testing in JavaScript.

In fact, if I had written about these topics in 2017, my only major change would have been to use ES6 to make the code more compact and easier to understand.

Read this series if you are interested in automated testing. It gives you all the information you need to set up an environment for automated unit and integration testing.

The only thing you don’t learn there is how to write automated tests properly. Although the series mentions what you should avoid, it only scratches the surface of topics like magic numbers.

What are magic numbers?

Magic numbers are among the most frequent mistakes in software testing. They are seemingly random values included in your test assertions.

Example:

Where does the value 12473 come from? I have seen this magic number problem quite often, and most of the time, software developers have no idea why they selected their values.

What is even worse, these values are often selected randomly, without any thought or logic. This turns automated testing into a luck-based approach: we may catch some bugs by chance, but we never properly verify the underlying behavior.

Needless to say, software engineering and luck should not have much in common.

As a consequence, we should get rid of magic numbers at all costs.

Defining the example

Suppose that we have to test the calculateInterest method of a bank account. The specification is as follows:

  • For a negative balance, the interest rate is 25%. Don’t forget that the interest amount is negative.
  • For a non-negative balance below $1000, the interest is zero.
  • Between $1000 and $9999.99, the interest rate is 3%.
  • At or above $10000, the interest rate is 4%.
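Translated into code, the specification might look like the following sketch. The function name and the dollar-based balance representation are assumptions for illustration; we will revisit the representation in the section on floating points.

```javascript
// Sketch of the specified interest rules; balances are in dollars.
function calculateInterest(balance) {
  if (balance < 0) return balance * 0.25;     // negative interest amount
  if (balance < 1000) return 0;               // $0.00 ... $999.99
  if (balance < 10000) return balance * 0.03; // $1000.00 ... $9999.99
  return balance * 0.04;                      // $10000.00 and above
}

console.log(calculateInterest(-100)); // → -25
console.log(calculateInterest(500));  // → 0
```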

What meaning does $12473 hold from the perspective of the specification? Not much. We have four cases in total, and this amount falls into just one of them.

When I review test suites, I often see another test case calculating the interest on $15399, or no test cases at all for amounts between $1000 and $9999.99. Both are major mistakes.

Creating a lookup table using equivalence classes

Our job as software engineers is to make automated testing a systematic, repeatable process. We don’t want to leave testing to chance. This is where equivalence partitioning comes in handy.

From the perspective of testing our code, $0, $129, $739, $999, and $999.99 all fall into the same category, or equivalence partition: according to the specification, we are supposed to receive zero interest on each of these values.

All the account balance values can be placed in disjoint partitions. Each value belongs to exactly one partition.

Selecting two or more random values from the same partition does not make much sense. One idea to do better is to select exactly one random value from each partition. This would mean verifying that the correct interest is applied to the values -$369.69, $169, $1069, and $10069.

This strategy already makes slightly more sense than randomly coming up with values. But wait a minute! Why did the author of these tests select these numbers, and not something more logical? If you try to reverse-engineer the reasoning behind these four values, and the lookup table is not documented well enough, the only pattern you will recognize is the test author’s obsession with the number sixty-nine. This may divert your attention from the logical grouping of the tests to the dirty mind of the test author.

What is even worse, if the underlying rules change, values in different equivalence classes may fall into the same class later. We can do better than that.

Boundary values bust magic numbers

Each of our equivalence classes has boundary values. If the values in a partition are sortable, the boundary values are the smallest and largest values in that equivalence class.

Based on common sense and the experience of multiple software engineers including the author, boundary values tend to cause more errors in our code than randomly selected values.

Think about it: if you accidentally write a > symbol instead of >=, chances are that you create an error at the boundaries. Mistakes like this are made on a regular basis.

This is the exact reason why we tend to include the boundary values in our tests.

When formulating the tests, we will make two assumptions. First, we assume that we have to apply a precision of two decimal digits. Second, we assume that we don’t have to deal with precision errors arising from floating point arithmetic. (If you write 0.1 + 0.2 in your console, for instance, the result will not be 0.3.)

For instance, when testing the equivalence partition between 0 and 999.99, we can test the following values:

  • 0.00 and 999.99 as the smallest and largest values in the equivalence class.
  • In the two adjacent equivalence classes, we will test the values -0.01 and 1000.00.

If you are very thorough, you may consider including 0.01 and 999.98. These two values are not strictly boundary values; they are systematically chosen values that verify our algorithm also works for non-boundary values. You gain more information from these tests when a boundary value test fails. For instance, if the 0.00 test fails and the 0.01 test succeeds, the likelihood that you have encountered a boundary error is significantly larger than if both tests had failed. If the tests for 0.00 and 0.01 both fail, we can rightfully suspect a generic error in our application, not one related to boundary values.

In practice, we often omit non-boundary values to reduce the size of the test suite.

What happens if our equivalence partition is infinitely long?

This is a great question to ask. Most of the time, we can trace infinite equivalence partitions back to deficiencies in specification.

Do you really want to test what happens with bank accounts at or above Number.MAX_VALUE? This is not realistic. In the case of bank accounts, there should always be a technical limit, and our code should work for very big accounts too.

Therefore, if you lack information on what happens to an account with a balance of $1,000,000,000,000,000,000 or larger, ask your project manager or requirements engineer about upper bounds to limit potential bugs arising from very big or very small numbers.

Software systems are supposed to be finite, as the number representation is finite.

Including tests around Number.MAX_VALUE is meaningless, as Number.MAX_VALUE is not a value originating from the specification.

What about floating points?

We have already covered the famous 0.1 + 0.2 example.

Another strange value is 10000000000000000 + 1, which yields 10000000000000000.

In computer programming, we work with floating point arithmetic. There is no way around it.

When working with financial data with a precision of two decimal digits, best practice suggests storing the amounts as integer cents to avoid problems with floating point arithmetic. Obviously, there is a limit even to integers, as we cannot handle arbitrarily large numbers.

When calculating interest, for instance, we have to round the result according to the rules of mathematics, or to the specification.

Matter does not disappear, except in mills

Beyond boundary value analysis, tests have to guard against rounding errors, a frequent source of mistakes in finance.

Recall the story of a huge fraud, in which some software developers made millions by accumulating the rounding errors of transactions and wiring them to their own bank accounts. The rounding errors were not missing from any books in the system; therefore, the fraudsters got away with their act for a while. Obviously, in the end, all fraud schemes collapse. The fraudsters got caught, but the damage was already done.

Getting back to the problem in the mill: the engineers faced natural resistance, because the experienced employees knew exactly when they could, for instance, take a couple of kilograms of flour while staying inside the tolerance levels.

Needless to say, the engineers faced serious resistance from the employees when implementing the system, as the employees were stakeholders in the process.

As a result, in order to ensure the success of the project, the specification had to be altered to allow some level of manual override of the dosage of the ingredients, and the automated tests had to support the altered specification. Whether you agree with this solution or not does not really matter from the point of view of creating a well-engineered system. In some cases, requirements force you to add tests beyond boundary value analysis. Such a loophole in the system has to be documented and guided by tests; otherwise, the maintainers of the system won’t be able to trace back the intended behavior.

We as test implementers can take one important guideline away from these two stories: matter is neither created from thin air nor destroyed. It just gets transformed, and this transformation has to be guided by tests.

Transformation errors occur at boundaries most of the time. In some cases, we also have to create tests for values that are not boundary values. My suggestion is to document the intention behind choosing your values. This documentation eliminates magic numbers from your code, because every number becomes an intentionally chosen value.