Category Archives: fundamentals of unit testing

The fundamentals of unit testing : draw attention to ‘interesting’ values

3 is the magic number? No. No, it’s not. Magic numbers/strings/other values in all walks of programming are a readability and maintenance liability. It’s quite common to see all kinds of literals being littered through test code. It’s quite easy to simply switch off and treat test code as a second class citizen.

To illustrate this point, please tell me what this does

[Test]
public void TestInvalidFuelAmount()
{
    // Arrange
    var car = CreateLittleCar();

    // Act & Assert
    var exception = Assert.Throws<ArgumentException>(() =>  car.AddFuel(500));
}

It’s not exactly crystal clear, is it? Although the naming of this test can and should be improved, bear in mind that it’s an extremely simple case (one method call with one argument being passed). Despite this, the intent is still obscured because of a literal.


Replace literals with named consts

Let’s change 500 to a named constant. Let’s also change the test name while we’re at it.

[Test]
public void when_the_car_fuel_tank_capacity_is_too_low_and_fuel_is_added_an_exception_is_thrown()
{
    // Arrange
    var car = CreateLittleCar();
    const int LitresOfFuelThatIsTooLargeForTankCapacity = 500;

    // Act & Assert
    var exception = Assert.Throws<ArgumentException>(() => 
            car.AddFuel(LitresOfFuelThatIsTooMuchForTankCapacity));
}

Now that’s much clearer. We can see what the value represents, the units it’s measured in and its place in the test. Some people shit the bed when they see really long story-like variable names in tests, but they’re generally a help, not a hindrance.

While the previous example is simple, this tip makes much more of a difference in situations where a function/method/constructor has numerous parameters, but the test is largely focused on only one of them. How do you know which one to pay attention to?

[Test]
public void when_account_balance_is_low_money_transfers_fail_when_would_be_overdrawn()
{
    // Arrange
    var currentAccount = GetCurrentAccountWithBalanceInUSD(10);

    // Act    
    bool transferSucceeded = currentAccount.TryTransfer(1000, 42, 602402);

    // Assert
    Assert.That(transferSucceeded, Is.False);
}

Fine, this is a crufty-looking API, but in certain circumstances, it’s quite common to see functions/constructors that take multiple arguments of a similar type. At a glance, how can you tell which of the arguments is the one of interest to the test? Is it 1000, 42 or 602402? What is the significance of any of them? Which one is causing the transfer to fail?

Let’s use a named const to draw attention to the interesting argument.

[Test]
public void when_account_balance_is_low_money_transfers_fail_when_would_be_overdrawn()
{
    // Arrange
    var currentAccount = GetCurrentAccountWithBalanceInUSD(10);
    const int AmountInUSDThatWillForceAccountIntoOverdraft = 1000;    

    // Act    
    bool transferSucceeded = currentAccount.TryTransfer(AmountInUSDThatWillForceAccountIntoOverdraft , 42, 602402);

    // Assert
    Assert.That(transferSucceeded, Is.False);
}

Turns out the 42 is the number of cents in the transfer, and that 602402 is just a dummy customer number we’re passing through.

We can go further than this to clean things up still further by doing things such as:

  • Extract a wrapper test method that calls TryTransfer, but only has one parameter: the amount of USD to withdraw (the other parameters can be satisfied with constants). This draws attention to the important part of the test (varying the value of the USD transfer amount) and also helps insulate the tests from changes to the TryTransfer method.
  • In the production code, using types that are more appropriate to reduce ambiguity (e.g. in the case of a customer account number, an AccountCredentials class would be more suitable than an int!)
  • Finally, if lots of things are changed in the tests, but in slightly different ways each time, consider the use of a Test Data Builder. Test data builders are very useful, as the provided defaults are sensible for most cases. As a result, when arranging your test, the only things that are changed in the builder are the things that actually matter. Test data builders can also insulate the tests from breaking changes.

The fundamentals of unit testing : Arrange, Act, Assert

Arrange, Act, Assert” (aka “AAA”) is a very simple way to structure your tests – I thoroughly recommend it. It is especially helpful when learning the ropes.

For this post, I’ll use the following example test scenario: When a stack is empty, pushing an item increments the count to one (this example was previously discussed in Narrow & Focused, if you want more context)

[Test]
public void pushing_an_item_onto_an_empty_stack_increments_count()
{
    // Arrange
    var stack = new Stack<bool>();

    // Act
    stack.Push(false);

    // Assert
    Assert.That(stack.Count, Is.EqualTo(1));   
}

Each ‘phase’ of the test happens in order and occurs precisely once. Arrange precedes Act, Act precedes Assert. I’ll go into a little more detail later as to why this is a useful property.

A brief description of each “A” follows.

Read more »

The fundamentals of unit testing : Tests shouldn’t ape the production code

Aping the logic of the class under test is an bad idea when writing tests, but it’s an easy mistake to make.

Aping… Bueller?

"Aping" simply means imitating/repeating the logic used in the class under test. In its simplest form, the problem can be illustrated like so:

Production code:

public class Calculator
{
        public int Add(int a, int b)
        {
                return a + b; // actual logic of the class under test
        }
}

A test for the above class:

[Test]
public void adding_two_numbers_returns_the_sum()
{
        // Arrange
        var calculator = CreateCalculator();

        // Act
        int result = calculator.Add(1, 2);

        // Assert
        Assert.That(result, Is.EqualTo(1 + 2)); // ape alert, ooh ooh ooh! >:O
}

Although this doesn’t look like the end of the world, we can see that the test method line "Assert.That(result, Is.EqualTo(1 + 2))" is directly replicating the logic in the Add method.  This is “aping”.

Why is this a bad thing?

This is bad for a few reasons. Firstly, if the class under test is bugged and the test apes its logic to get an expected answer, we’ve just reproduced the bug in our test code and erroneously passed the test. So, instead of having working production code, we have two bits of bugged code and a false sense of security.

Secondly, when writing tests, if state can be tested, it should be done using as declarative a style as possible — we don’t want to say how we get to the answer, just what it should be.

I don’t want to see the test code literally add 1 and 2 to get 3. Instead, I want to see 3 in the test.  Here’s a fixed-up version of the test that does not ape the production code:

[Test]
public void adding_two_numbers_returns_the_sum()
{
        // Arrange
        var calculator = CreateCalculator();

        // Act
        int result = calculator.Add(1, 2);

        // Assert
        Assert.That(result, Is.EqualTo(3));   
}

You should always scan your test code looking for this anti-pattern.  The example I’ve cited is very simple, but I’ve seen a lot of real world code that has succumbed to aping the production code resulting in bugs going undiscovered.

Validating your answers

You may be thinking, “OK genius, so you skipped adding 1 and 2 in the test code, but you must’ve known to follow the addition algorithm in your head to be able to come up with the answer of 3.  How do you know your brain didn’t get it wrong and the test is bugged?” 

It’s a good question and the truth is, it’s often the case that it’s not clear-cut.  However, there are steps you can take to validate your answers (or at least be a little more sure of them).

  • Don’t just assume you’ve got it right first time.  Double check it for a few minutes more.
  • Write out some notes on paper (I sometimes do the testing version of truth tables).
  • Maybe you have a specification for this particular part.  Use it to check the answer  (stop laughing at the back, there! – it occasionally does happen; e.g. conventions for a maths library).
  • Use an alternative method to come up with the same answer (it’s often the case that the same value can be calculated via multiple algorithms / schemes).
  • Use an existing implementation to corroborate your findings.  E.g. if you’re writing a test to make sure that JSON is validated correctly, then check that JSONLint.com agrees with your test results.
  • If you’re not sure how to corroborate an exact answer, then check any invariants you’re aware of as a smoke test.  If any of the invariants don’t hold, your answer is probably wrong.

The fundamentals of unit testing : Use factory methods!

Tests are just like normal code in the sense that repeating yourself can cause problems.  Practically every single call to new in a test method is a maintenance risk.

A Simple Illustration of the Problem

Say we have 30 tests, each directly invoking constructors like the following example:

[Test]
public void when_adding_a_valid_order_it_is_appended_to_audit_trail()
{
        var orderService = new OrderService(); // hello Mr. Constructor

        … // rest of test
}

It doesn’t look so bad but, if in 3 months, OrderService is changed to use constructor injection and its interface is changed to include a new “Start” method that must be called prior to certain methods being usable, we have a problem. Now we have 30 broken tests that must be fixed individually.  I once had to clean up hundreds of tests broken by a single constructor call.  It wasn’t fun.  Or productive.

A Solution

I say “a solution” as there are other ways to tackle this problem, such as using test framework setup methods / test data builders etc.  However, this is the easiest way to get started and is the approach I tend to reach for in the majority of cases.  I’ll blog about why I tend to eschew framework setup methods later.

We can insulate our tests from breaking changes by moving object creation to simple private factory methods. It is then often a case of updating the factory method once.

[Test]
public void when_adding_a_valid_order_it_is_appended_to_audit_trail()
{
        // Arrange
        var orderService = CreateOrderService();

        … // rest of test
}

private OrderService CreateOrderService()
{
        var orderService = new OrderService(new StubLogger(), new MessageGateway());
        orderService.Start();

        return orderService;
}

Benefits

This has tangible benefits. Firstly, it only takes maybe 5 seconds of work (extract method using ReSharper) to create the factory method, but it can save literally hours of work down the line.

Secondly, the tests and the creation of objects are separated to improve maintainability.

Thirdly, it often makes the test code easier to read. If the factory method were inlined in the test code, it adds a load of noise. All the test needs is a default configuration of the order service. We only worry about the details if it’s relevant. Most test fixtures only need a handful of factory methods; for those that require more varied configurations, check out the Test Data Builder and Object Mother patterns (though be wary of over-using the Object Mother pattern as not only does it have a stupid name, but it’s inflexible and has other problems) . 

An alternative to this approach is to use a parameterised factory method and/or optional parameters.

In terms of cost/benefit trade-offs, creating one or more factory methods is a simple, easy technique.  Now when constructors change, you don’t spend half an hour fixing up constructor calls.

The fundamentals of unit testing : KISS

KISS, (Keep it Simple, Stupid) is a guiding principle in software engineering and it’s of vital importance when writing tests.  Tests should be straightforward and contain as little logic as possible. 

I know from painful experience that trying to do something overly ‘clever’ in a test is a recipe for disaster.

Control Flow / Extra Logic in Tests

The moment you start changing the flow of a test based on some condition, you are setting yourself up for a fall.  Think about it:  we test a particular unit of production code because its logic may be flawed.  If the test that validates the production code tends towards the same level of complexity, there is a distinct and real risk that the test code will also contain errors.

Here’s a real example of a bad test that contains error-prone logic (please ignore the potential floating point comparison issues… I’ll touch on that later):

[Test]
public void IdentityTest()
{
        var m = Mtx4f.Identity;

        for(int r = 0; r < 4; ++r) // potential error; looping
        {
                for(int c = 0; c < 4; ++c) // potential error; looping
                {
                        if(r != c) // potential error; logical test / branching
                            Assert.AreEqual(m[r, c], 0.0f);
                        else  // potential error; logical test / branching
                        {
                                Assert.AreEqual(m[r, c], 1.0f);
                        }
                }
        }
}

Read more »

The fundamentals of unit testing : Repeatable

The main purpose of automated testing is to highlight problems with correctness in production code.  When a test fails, it means, “something is wrong here; we have a bug in our production code!”   I’m sorry about your canary.  It died a valiant death.

Tests Should be Repeatable

… so what do I mean when I say that tests should be repeatable?  By repeatable, I mean each test can be executed an arbitrary number of times without observing a change in the test results.  If a developer can run a unit test 100 times in a row and get two or more distinct results without altering any production/test code, the test is not repeatable. 

Why Should I Care?

If a unit test cannot yield the same result with every single test run, then it is non-deterministic and untrustworthy.

An unreliable, flaky test is analogous to a car alarm that has been going off for two days. There might be something wrong, but nobody will pay any attention to it. 

Most people who’ve worked with large test suites will have experienced a particular situation from time to time – the case where a handful of the tests intermittently fail.  Developers quickly recognise the problematic test name(s) and exclaim, “these tests sometimes fail for no reason”.  The next step is to click the “rebuild” button on the continuous integration server.  Without changing a thing, the tests pass.  Hooray! If you joined in with the cheering, then please punch yourself in the face (it was a ruse).

Read more »

The fundamentals of unit testing : Atomic

Every single unit test you write should be atomic. 

A definition

Atomic tests are order-independent, relying on and causing no side effects.  They neither rely on nor alter shared state; they do not mutate their environment. If a test run involves 2, 10, 100 or 1,000,000 tests, the tests should be able to be executed in any order.  If one legitimately fails because a bug has been introduced in the production code, it shouldn’t have a domino effect.

For example, if we have 3 tests named A, B and C, we should be able to run these tests in any order.  For a tiny set of 3 tests, there are already 6 ways of arranging the order of execution: ABC, ACB, BAC, BCA, CAB, CBA.  If the order ABC is fine, but the order BAC causes test C to fail, then one of the tests is non-atomic. 

Put short, a test should be built up and torn down cleanly, leaving no trace of the fact that it has been executed. 

Why is this important?

A good automated test has a descriptive name, is implemented using straightforward code and leads us to the cause of the failure without much hassle.  When dealing with well written atomic tests, at the very worst, diagnosing the cause of a failure should be a simple debugging job.

However, if non-atomic tests create knock-on effects it introduces a significant maintenance risk, as to diagnose failures, we must understand the whole picture.  Instead of just inspecting the local state that exists prior to executing the failing test, we have to start delving into environmental differences (machine configuration, file systems) and AppDomain issues (typically global / static state).

Read more »

The fundamentals of unit testing: Spring The Trap

This is a lesson I learned from the Pragmatic Unit Testing book.  Five years later and it’s still a tool that I frequently reach for – it’s simple and incredibly useful.

Who tests the tests?

I’ve had a few long-winded and borderline philosophical discussions with colleagues in the past where the question is asked: “Who tests the tests?”  In essence, how can we trust the tests to tell us that the production code is working correctly? Who’s to say the test isn’t bugged? If a test is bugged and erroneously passes, we have the worst of both worlds.  We have a piece of (potentially) broken functionality and we have a test that is offering false assurances.

Is this likely?

At first glance, it sounds a bit fanciful – I mean, what are the chances of writing a bad test? The truth is: It’s not a matter of if, it’s a matter of when. Everyone slips up now and again. Test code is code. Programmers habitually write incorrect code.

In my experience, it’s especially easy to write an incorrect test when you’re pre-occupied by something else (such as understanding & fixing a complicated bug) and aren’t really in the mood to question yourself before you check in and head home for a deserved nap.

I’ll put it more pointedly: if the tests are written after the production code, how do you know that the test is correct? Think about it, if your workflow follows this pattern: “create method that does X, write test for method to make sure X is correct, the test passes”, how can you have confidence that the production code and its tests work? The test has never failed and, as a result, you are on the happy path.

A common cause is copy-pasting an existing test and forgetting to change it to do something else by varying an argument or value. It happens. You now have two identical tests – one is misleadingly named and both pass.

Reducing Errors

There’s two things you can do to increase the chance of a test being correct.

Read more »

The fundamentals of unit testing : Correct

Once you’ve decided what you’re going to test, chosen a good name and started to write the code, the next step is to make sure that the test is purposeful and correct.  This is a simple tip and largely common sense.

Does the test code’s functionality match its name?

It is simple to accidentally write a test that purports to test one thing, but actually tests another. E.g.

[Test]
public void model_is_loaded_correctly_from_stream()
{
        // Arrange
        var resourceLoader = CreateResourceLoader();
        var modelDataStream = CreateModelDataStream();

        // Act
        var loadedResource = resourceLoader.LoadFromStream(modelDataStream); // fine so far

        // Assert
        Assert.That(resourceLoader.LoadedResources, Is.EqualTo(1)); // buh?!
}

In this test, we can see that it claims to test that loading a model from a stream returns a valid resource. However, the test is then asserting that the loaded resources count has increased. It is not checking the returned resource for correctness.  Either the test name is out of date, or it is not testing the correct thing. 

If it is the former case, we should rename the test to “loading a model increments the loaded resources count”.  If it is the latter case, we should change the test such that we are asserting on the model’s state.  How do you know the model has been loaded correctly if you’re not inspecting its vertices, indices and UVs?

If we are misled by our test names, it is irritating because it looks like something is covered, but it’s not.  If we’re asserting on the wrong stuff, then our test won’t save us when the unit of code malfunctions.  Maybe it’s already bugged.

Is the Assert meaningful?

All tests should assert against some meaningful result. Put simply, if a test does not prove much, then it is not worth writing or maintaining.

Here is an example of a test that looks useful but, on closer inspection, is doesn’t really prove much at all.

[Test]
public void game_properties_are_correctly_serialized()
{
        // Arrange
        var currentLevel = 4;
        var playerName = "Steve";
        var game = CreateGame(playerName, currentLevel);               
        var serializer = CreateGamePersistence();
       
        using(Stream stream = new MemoryStream())
        {
                // Act
                serializer.Save(game, stream);
        }
        
        Assert.That(true, Is.True) // WHAT?!?!?
}

I’ve seen the “Assert true == true” thing numerous times.  If the assert is missing or tells us nothing useful, the chances are the test isn’t going to prove much, so why are some developers compelled to do it? My guess is that people think, “hmm, I’ve written a test with no asserts, there we go”.  If you ever get tempted to do this, don’t!  It’s a sign that something is wrong.

Note: it’s sometimes useful to have methods like this to act as surrogate Main() methods so we can invoke them via a test runner for easy debugging but, in general, it’s not a terribly good idea and you wouldn’t want to run these tests as a part of your normal development workflow.

The fundamentals of unit testing : Narrow & Focused

When it comes to writing unit tests, it pays to be specific.  

If you have a narrow & focused test, it will…

  • be easy to make a descriptive name for the test
  • perform the bare minimum of work required to achieve its goal. 
  • be easy to understand
  • be easy to maintain 
  • execute in a shorter period of time

An unfocused example

To give you an idea of what an unfocused test looks like, consider the following (and yes, I’ve seen plenty of these, both in professional [RTW] & open source code):

[Test]
public void StackTestA()
{
        var stack = new Stack<int>();
        Assert.That(stack.Count, Is.EqualTo(0));

        stack.Push(1);
        Assert.That(stack.Count, Is.EqualTo(1)); 

        stack.Push(2);
        Assert.That(stack.Count, Is.EqualTo(2)); 

        var p = stack.Peek();
        Assert.That(p, Is.EqualTo(2);
       
        stack.Push(3);
        Assert.That(stack.Count, Is.EqualTo(3)); 
       
        stack.Pop();
        Assert.That(stack.Count, Is.EqualTo(2)); 
       
        bool c0 = stack.Contains(1);
        Assert.That(c0, Is.True);
       
        bool c1 = stack.Contains(10000);
        Assert.That(c1, Is.False);
       
        stack.Push(3);
        Assert.That(stack.Count, Is.EqualTo(3)); 

        int popped = stack.Pop();
        Assert.That(popped, Is.EqualTo(3));
        Assert.That(stack.Count, Is.EqualTo(2));
}

Can you spot the problems? 

Read more »