testing

Data-driven testing tricks

Posted in Uncategorized, c#, patterns, testing, tips on May 8th, 2010 by Mark Simpson

It’s a fairly common occurrence: somebody wants to use NUnit’s data-driven testing, but they want to vary either the action under test or the expectation.  I.e. they’re not parametrising simple data, they’re parametrising the actions.

You cannot encode these things via normal data-driven testing (short of doing really nasty things like passing string names of methods to be invoked or using enums and a dictionary of methods) and even if you use a hackish workaround, it’s unlikely to be flexible or terse.

Test readability is paramount, so if you have some tests written in an unfamiliar style, it is very important to express the intent clearly, too.

NUnit’s data-driven testing

NUnit uses a few mechanisms to parametrise tests.  Firstly, for simple test cases, it offers the [TestCase] attribute which takes a params object[] array in its constructor.  Each argument passed to the TestCaseAttribute’s constructor is stored, ready for retrieval by the framework.  NUnit does the heavy lifting for us and casts/converts each argument to the test method’s parameter types.  Here’s an example where three ints are passed, then correctly mapped to a test method:
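A minimal sketch of that idea (the fixture and method names are made up for illustration, and it assumes a reference to NUnit):

```csharp
using NUnit.Framework;

[TestFixture]
public class AdditionTests
{
    // NUnit runs the method once per [TestCase], mapping the three
    // attribute arguments onto the three method parameters in order.
    [TestCase(1, 2, 3)]
    [TestCase(2, 2, 4)]
    [TestCase(-5, 5, 0)]
    public void Add_TwoInts_ReturnsSum(int a, int b, int expectedSum)
    {
        Assert.That(a + b, Is.EqualTo(expectedSum));
    }
}
```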

The main limitation here is that attribute arguments must be compile-time constants: strings, ints, shorts, bools, enums etc. (plus typeof expressions and arrays of these).  We can’t new up classes or structs inside an attribute because .NET doesn’t allow it.  How the devil do we do something more complicated?

Passing more complicated types

It would appear we’re screwed, but fortunately, we can use the [TestCaseSource] attribute.  There are numerous options for yielding the data, and one of them is to define an IEnumerable<TestCaseData> as a public method of your test class (it works if it’s private, but since it’s accessed via reflection it’s a good idea to keep it public so that ReSharper or other tools do not flag it as unused).  You can then fill up and yield individual TestCaseData instances in the same fashion as before.  Once again, NUnit does the mapping and the heavy lifting for us.

If you do not require any of the fancy SetDescription, ExpectedException etc. stuff associated with the TestCaseData type, you can skip one piece of ceremony by simply yielding your own arbitrary type instead (i.e. change the IEnumerable<TestCaseData> to IEnumerable<MyType> and then simply yield return new MyType()).
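Here is a rough sketch of the TestCaseData flavour.  The Money type and all of the names are hypothetical, invented purely for the example.  The source method is static because NUnit 3 and later require that; older versions were more relaxed about it.

```csharp
using System.Collections.Generic;
using NUnit.Framework;

// Hypothetical type: something we could never pass via [TestCase].
public class Money
{
    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    public decimal Amount { get; private set; }
    public string Currency { get; private set; }
}

[TestFixture]
public class MoneyTests
{
    // Each TestCaseData carries the arguments for one run of the test.
    public static IEnumerable<TestCaseData> AmountCases()
    {
        yield return new TestCaseData(new Money(0m, "GBP"), true).SetName("Zero pounds");
        yield return new TestCaseData(new Money(5m, "GBP"), false).SetName("Five pounds");
        yield return new TestCaseData(new Money(-1m, "USD"), false).SetName("Negative dollars");
    }

    [TestCaseSource("AmountCases")]
    public void Amount_ComparedWithZero_MatchesExpectation(Money money, bool expectedIsZero)
    {
        Assert.That(money.Amount == 0m, Is.EqualTo(expectedIsZero));
    }
}
```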

Passing a delegate as a parameter (simple)

The simplest case is that you want to vary which methods are called.  For example, if you have multiple types implementing the same interface or multiple static methods, encoding which method to call is very simple.

Here’s an example from Stack Overflow that I answered recently, where the author wanted to call one of three different static methods, each with the same signature and asserts.  The solution was to examine the method signature of the call and then use the appropriate Func<> type (Funcs and Actions are convenience delegates provided by the .NET framework).  It was then easy to parametrise the test by passing in delegates targeting the appropriate methods.
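The original question isn’t reproduced here, so this is a sketch of the same shape rather than the actual answer.  The three Parsers methods are hypothetical stand-ins that happen to share a signature; the delegates have to travel via [TestCaseSource] because a delegate cannot be an attribute argument.

```csharp
using System;
using System.Collections.Generic;
using NUnit.Framework;

// Hypothetical static methods, all string -> int.
public static class Parsers
{
    public static int ParseDecimal(string s) { return Convert.ToInt32(s, 10); }
    public static int ParseHex(string s) { return Convert.ToInt32(s, 16); }
    public static int ParseBinary(string s) { return Convert.ToInt32(s, 2); }
}

[TestFixture]
public class ParserTests
{
    // Each case pairs the method to call with the value it should produce.
    public static IEnumerable<TestCaseData> ParserCases()
    {
        yield return new TestCaseData(new Func<string, int>(Parsers.ParseDecimal), 10).SetName("ParseDecimal");
        yield return new TestCaseData(new Func<string, int>(Parsers.ParseHex), 16).SetName("ParseHex");
        yield return new TestCaseData(new Func<string, int>(Parsers.ParseBinary), 2).SetName("ParseBinary");
    }

    [TestCaseSource("ParserCases")]
    public void Parse_OneZero_ReturnsValueForThatBase(Func<string, int> parse, int expected)
    {
        // Same input and same assert every time; only the method varies.
        Assert.That(parse("10"), Is.EqualTo(expected));
    }
}
```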

More advanced applications

Beyond calling simple, stateless methods via delegates or passing non-intrinsic types, you can do a lot of creative and cool stuff.  For example, you could new up an instance of a type T in the test body and pass in an Action<T> to call.  The test body would create an instance of type T, then apply the action to it.  You can even go as far as expressing Act/Assert pairs via a combination of Actions and mocking frameworks.  E.g. you could say “when I call method X on the controller, I expect method Y on the model to be called”, and so forth.
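To make the Action<T> idea concrete, here is a small sketch that uses List<int> as the type T.  The fixture and case names are invented for the example; the test body news up the instance and the test case supplies what to do to it.

```csharp
using System;
using System.Collections.Generic;
using NUnit.Framework;

[TestFixture]
public class ListMutationTests
{
    // Each case is an action to apply plus the count we expect afterwards.
    public static IEnumerable<TestCaseData> MutatingActions()
    {
        yield return new TestCaseData(
            new Action<List<int>>(list => list.Add(1)), 1).SetName("Add");
        yield return new TestCaseData(
            new Action<List<int>>(list => list.AddRange(new[] { 1, 2, 3 })), 3).SetName("AddRange");
        yield return new TestCaseData(
            new Action<List<int>>(list => list.Clear()), 0).SetName("Clear");
    }

    [TestCaseSource("MutatingActions")]
    public void Action_AppliedToEmptyList_LeavesExpectedCount(Action<List<int>> act, int expectedCount)
    {
        var list = new List<int>();   // the test body creates the instance...
        act(list);                    // ...and the test case decides what happens to it
        Assert.That(list.Count, Is.EqualTo(expectedCount));
    }
}
```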

The caveat is that as you use more and more ‘creative’ types of data-driven testing, it gets less and less readable for other programmers.  Always keep checking what you’re doing and determine whether there is a better way to implement the type of testing you’re doing.  It’s easy to get carried away when applying new techniques, but it’s often the case that a more verbose but familiar pattern is a better choice.

More readable data-driven tests

Posted in c#, testing, tips on February 13th, 2010 by Mark Simpson

When the logic of a test method remains constant but the data varies, data-driven testing is a great tool.  It allows you, the test author, to write compact code and to add new test cases rapidly.  Unfortunately, data-driven tests have a disadvantage: The inputs are often less readable.

A simple example

Let’s take an example: testing Rob Conery’s PagedList implementation.  A page is basically a slice of the data returned by a LINQ query.  If more data exists beyond the ‘slice’ represented by the PagedList<T> instance, its “HasNextPage” property should return true to indicate that it is available.  Now, suppose we want to test whether a particular page has a next available page.  Three things spring to mind that can influence the result: The page size, the current page index and the number of items in the list.

Here’s a quick data-driven test for HasNextPage:
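As a rough sketch of such a test: the PagedList<T> below is a minimal stand-in written for this example (it is not Rob Conery’s actual implementation), and the parameter order is one I have picked for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;

// Minimal stand-in, only detailed enough to exercise HasNextPage.
public class PagedList<T> : List<T>
{
    public PagedList(IEnumerable<T> source, int pageIndex, int pageSize)
    {
        var items = source.ToList();
        TotalCount = items.Count;
        PageIndex = pageIndex;
        PageSize = pageSize;
        AddRange(items.Skip(pageIndex * pageSize).Take(pageSize));
    }

    public int TotalCount { get; private set; }
    public int PageIndex { get; private set; }
    public int PageSize { get; private set; }

    // True when items exist beyond the end of this page.
    public bool HasNextPage
    {
        get { return (PageIndex + 1) * PageSize < TotalCount; }
    }
}

[TestFixture]
public class PagedListTests
{
    [TestCase(0, 10, 25, true)]
    [TestCase(1, 10, 25, true)]
    [TestCase(2, 10, 25, false)]
    [TestCase(0, 5, 5, false)]
    public void HasNextPage_ForGivenPage_ReturnsExpected(
        int pageIndex, int pageSize, int itemCount, bool expected)
    {
        var page = new PagedList<int>(Enumerable.Range(0, itemCount), pageIndex, pageSize);

        Assert.That(page.HasNextPage, Is.EqualTo(expected));
    }
}
```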

As you can see, the method itself is readable, but the parametrised values fed in via the [TestCase] attribute are not.  It’s really hard to keep everything in your head and remember what each number maps to in the function, especially when the method parameter types are all identical.  If you have a list of 20 [TestCase] attributes, you start wondering what’s for your tea and forget that the second value is the (checks image) page size.  Mince.  Mince for tea.

Hmm, if only we could make those TestCases more readable; something like object initializers would be ideal.

A simple trick: Subclass TestCase

My friend Hughel helped me come up with this one and it works quite well.  Simply subclass the TestCaseAttribute class and add your own properties to represent the test parameters.  It gets a little bit hairy when you have to access the Arguments array directly (especially since Attributes can be weird), but in practice, it works fine.   In most of the tests, we’re only interested in parametrising three things, so it’s simple to add them as properties.

The end result

Finally, we apply these attributes to our data-driven test, significantly improving the readability!
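A sketch of how the subclass and its usage might look.  Treat it with some caution: it assumes that TestCaseAttribute exposes its Arguments as an array whose elements can be overwritten after construction, which may not hold for every NUnit version.  The attribute name, property names and the HasNextPage helper are all hypothetical.

```csharp
using System;
using NUnit.Framework;

// Named properties write straight into the Arguments array that NUnit
// later maps onto the test method's parameters (assumption: see above).
[AttributeUsage(AttributeTargets.Method, AllowMultiple = true)]
public class PagedListTestCaseAttribute : TestCaseAttribute
{
    // Reserve four slots; the placeholders are replaced by the setters.
    public PagedListTestCaseAttribute() : base(0, 0, 0, false) { }

    public int PageIndex
    {
        get { return (int)Arguments[0]; }
        set { Arguments[0] = value; }
    }

    public int PageSize
    {
        get { return (int)Arguments[1]; }
        set { Arguments[1] = value; }
    }

    public int ItemCount
    {
        get { return (int)Arguments[2]; }
        set { Arguments[2] = value; }
    }

    public bool ExpectedHasNextPage
    {
        get { return (bool)Arguments[3]; }
        set { Arguments[3] = value; }
    }
}

[TestFixture]
public class ReadablePagingTests
{
    // Hypothetical stand-in for the real PagedList<T>.HasNextPage logic.
    private static bool HasNextPage(int pageIndex, int pageSize, int itemCount)
    {
        return (pageIndex + 1) * pageSize < itemCount;
    }

    [PagedListTestCase(PageIndex = 0, PageSize = 10, ItemCount = 25, ExpectedHasNextPage = true)]
    [PagedListTestCase(PageIndex = 2, PageSize = 10, ItemCount = 25, ExpectedHasNextPage = false)]
    public void HasNextPage_ForGivenPage_ReturnsExpected(
        int pageIndex, int pageSize, int itemCount, bool expected)
    {
        Assert.That(HasNextPage(pageIndex, pageSize, itemCount), Is.EqualTo(expected));
    }
}
```

Each value is now labelled at the point of use, at the cost of one extra attribute class per parameter shape.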

I would hasten to add that I don’t recommend using this approach willy-nilly, only when you have a large number of tests that are parametrised by the same data types, causing the test cases to become hard to follow.  I’ve used the [TestCaseSource] attribute and the [TestCase] attribute a lot in the past and most of the time it’s not a problem.

AutoMapper and Test Data Builders

Posted in c#, patterns, testing, tips on January 11th, 2010 by Mark Simpson

I’ve recently been tinkering with WCF and, as many people already know, writing data transfer objects is a pain in the balls.  Nobody likes writing repetitive, duplicate and tedious code, so I was delighted when I read about AutoMapper.  It works really nicely;  with convention over configuration, you can bang out the entity => data transfer code in no time, the conversions are less error prone, the tests stay in sync and you’re left to concentrate on more important things.

Anyway, I immediately realised that I’ve used the same pattern in testing — with property initializers & test data builders.  I’ve posted before about Test Data Builders and I’d recommend you read that post first.

For small test data builder classes, it’s really not that big a deal.  For larger classes, using AutoMapper is quite useful.  For example, for testing purposes we’ve got an exception details class that is sent over to an exception logging service.

Every time the app dies, we create an exception data transfer object, fill it out and then send it over the wire.  When unit testing the service, I use a Test Data Builder to create the exception report so that I can vary its properties easily.  Guess what?  The test data builder’s properties map 1:1 with the exception report — hmm!

So, rather than create the same boilerplate code to map the 10+ properties on the exception builder => data transfer object, I just used AutoMapper to handle the mapping for me :)

using AutoMapper;

public class ExceptionReportDto
{
    public string ExceptionType { get; set; }
    public string StackTrace { get; set; }
    public string AssemblyName { get; set; }
    public string EntryPoint { get; set; }
    public string UserName { get; set; }
    public string MachineName { get; set; }
    // etc
}

public class ExceptionReportBuilder
{
    public string ExceptionType { get; set; }
    public string StackTrace { get; set; }
    public string AssemblyName { get; set; }
    public string EntryPoint { get; set; }
    public string UserName { get; set; }
    public string MachineName { get; set; }
    // etc

    // create the mapping once, when the static ctor is invoked
    // (this is the static Mapper API; later AutoMapper versions
    // replaced it with MapperConfiguration)
    static ExceptionReportBuilder()
    {
        Mapper.CreateMap<ExceptionReportBuilder, ExceptionReportDto>();
    }

    public ExceptionReportBuilder()
    {
        // set up defaults
        ExceptionType = "System.ArgumentException";
        StackTrace = "Oh no I am a stack trace";
        //etc.
    }

    public ExceptionReportDto Build()
    {
        // go go automagic!
        return Mapper.Map<ExceptionReportBuilder, ExceptionReportDto>(this);
    }
}

I’ve had good results with this approach.  The only bit I’m remotely concerned about is creating the mapping in the static constructor.  Any AutoMapper gurus out there who can say whether there’s any reason I shouldn’t do that?

Avoiding the file system

Posted in c#, patterns, testing on November 26th, 2009 by Mark Simpson

Going from experience and, as illustrated by Misko’s recent presentation, the more dependencies you have on your environment, the less trustworthy and maintainable your tests become.  One of the foremost offenders in this area is touching the file system.

Any time someone says “hey, I’m trying to open a file in a unit test…”, my first reaction is to say “woah”, and not in the “I know Kung Fu” way!  If you introduce a dependency on the file system, bad things are more likely to happen.  You now depend on something that may not be there/accessible/consistent etc.  Ever written a test that tried to access a common file, or read a file that something else may write to?  It’s horrible.

It is for these reasons that many folk will say “it’s not a unit test if it hits the file system”.  In particular, if you have a TestFixtureSetUp/TearDown method that deletes a file, it’s a sure sign that the fixture is going to flake out at some point.

A real example

Recently at work, my colleagues have experienced the joy of a huge refactoring job pertaining to restructuring our projects/solutions to reduce build times and increase productivity.  This work included maintaining something close to ten thousand tests.

As the job progressed, they kept finding that some test fixtures did not live in isolation.  The tests depended on various things they shouldn’t have and, most saliently, file dependencies proved to be a total pain in the balls.  Everything built OK, but when run, the tests failed due to missing files.  Paths and file attributes had to be checked (Copy if newer, etc.), lost files had to be hunted down and so forth.  It’s hassle that people don’t need!  As I’ve stated before, when it comes to testing, maintenance is king.

Anyway, if you have a hard dependency on the file system, consider the alternatives.  This is never a hard rule as it is not suitable for all uses, but it is always worth thinking about.

Alternative approaches

Firstly, abstract the file operations to some degree.  You can do numerous things here, from changing the internal loading strategy (via Dependency Injection) to — even better — separating the loading/use of the file so that the consumer of the file’s contents doesn’t even have to care about the loading strategy.

Once you’ve done this, you no longer need to use files in your unit tests, as you can use plain ol’ strings, streams or even just directly construct instances to represent the contents of the file.

Say we had a simple class called “MyDocument”, and MyDocument could be loaded from a .doc file on disk.  The simplest approach would be to do something like this:

Approach One

 // #1: this is tightly coupled to file system.  
 void SimpleLoading()
 {
    var simpleDocument = new MyDocument("filenameToLoad.doc");
 }

Depending on your needs and the demands of the user, this may be OK.  However, to test MyDocument’s methods/properties properly, we need access to the file system to instantiate it.  If our tests are to be robust & fast, we need something better.  Here’s something that’s testable:

Approach Two

 // #2: doc calls DocumentLoader's methods to get data needed
 void SlightlyImprovedAndTestable()
 {
    // this type implements IDocumentLoader
    var docLoaderStrategy = new FileDocumentLoader("filenameToLoad.doc");

    // Uses DI; ctor calls doc loader's methods to construct itself
    var doc = new MyDocument(docLoaderStrategy);
 }

From a testability standpoint, this is slightly better, as we can now feed in an IDocumentLoader instance which the MyDocument constructor uses to get the data.

On the flipside, the MyDocument type now needs to know about IDocumentLoader.  To load a document, the user now needs to know about creating an IDocumentLoader and feeding it into the constructor — it’s more complicated.  I often see people do this as the default step for abstracting their file operations — they alter the code to make it testable, but fail to spot the problems it brings if done at the wrong ‘level’.  If you gnash your teeth every time you have to use your own code, it’s a warning sign that something is wrong.

When we think about it though, why should MyDocument need to know about loading strategies?  In many cases, we can parse a file and produce some output using a factory or builder instead.  E.g.:

Approach Three

 // #3: Break loading + doc creation into two distinct parts
 void DecoupledAndTestable()
 {
    // loading is now a separate step :)
    MyDocument doc;
    using(var docLoader = new DocumentLoader("filenameToLoad.doc"))
    {
       doc = docLoader.LoadDocument();
    }

    // Similarly, we can do something like this
    var testDoc = new TestLoader()
                      .LoadDocumentFromText(SomeResourceFile.ValidDocument);
 }

To clarify how this works: The DocumentLoader would parse the .doc file and construct the object instances required to build up a real document, then pass them into the document’s constructor (or build up the document via some other means, such as iteratively calling methods on a blank document to fill it up as new items are found — whatever makes sense).  This totally decouples the loading and instantiation, meaning we can test each step in isolation.

I.e. the flow goes: Read Input => Parse Input => Create Document from Parsed Input

Life after the File System

Once you’re no longer dependent on the file system, you are free to use one of many strategies for loading/creating your type.  Depending on the abstraction, some options include:

  • Just declare your data inline in the test methods as a const string, or as a const/readonly field of the test fixture.  This works well for small amounts of text.
  • Add your test files as text file resources.  You can then access the file contents as a static string property.  This is handy, as you get a strongly typed resource name and don’t need to mess around with paths + copying files.  This works well for larger sets of data, or data you want to re-use in multiple tests.
  • Use embedded resources & GetManifestResourceStream.  This is slightly messier; it doesn’t require copying files, but it does require that the namespace + filenames used to reference the embedded resources are correct (runtime failures ahoy).  You also need to handle streams when using this method.
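For the embedded resource option, a small helper along these lines can take some of the sting out of the stream handling.  The helper class and the resource name in the usage note are hypothetical; GetManifestResourceStream returns null when the name is wrong, so the sketch turns that into an exception that names the missing resource.

```csharp
using System.IO;
using System.Reflection;

public static class EmbeddedResource
{
    // Reads an embedded resource as text, failing loudly if the
    // namespace-qualified name does not match anything in the assembly.
    public static string ReadAllText(Assembly assembly, string resourceName)
    {
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        {
            if (stream == null)
            {
                throw new FileNotFoundException(
                    "Embedded resource not found: " + resourceName);
            }

            using (var reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

A test might then call something like EmbeddedResource.ReadAllText(typeof(MyDocumentTests).Assembly, "MyTests.Data.ValidDocument.txt"), where both the fixture and the resource name are made up for the example.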

If my loading logic deals with strings, I tend to just build an ‘inner’ parser that works with the strings, then wrap it in another class that opens and reads files, then passes the (raw) string to the ‘inner’ parser class.  This allows me to thoroughly test the parsing logic independent of the file system, but also means I can re-use it for test classes or other cases.  I.e. I can exercise more of the production code without any of the file system pain :)
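A minimal sketch of that split, assuming a made-up document format in which every non-blank line is a paragraph.  The class names are illustrative only; MyDocument is reduced to the bare minimum needed to show the shape.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public class MyDocument
{
    public MyDocument(IList<string> paragraphs) { Paragraphs = paragraphs; }
    public IList<string> Paragraphs { get; private set; }
}

// The 'inner' parser: strings in, document out, no file system in sight.
public class DocumentParser
{
    public MyDocument Parse(string rawText)
    {
        var paragraphs = new List<string>();
        foreach (var line in rawText.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries))
        {
            var trimmed = line.Trim();
            if (trimmed.Length > 0)
            {
                paragraphs.Add(trimmed);
            }
        }
        return new MyDocument(paragraphs);
    }
}

// The thin wrapper: the only class that touches the disk.
public class DocumentFileReader
{
    private readonly DocumentParser parser = new DocumentParser();

    public MyDocument Load(string path)
    {
        return parser.Parse(File.ReadAllText(path));
    }
}
```

The tests hammer DocumentParser with inline strings; DocumentFileReader is small enough that one or two integration tests cover it.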

Depending on the thing being loaded, this isn’t always the best solution, but for relatively simple loading I tend to favour this method.

Is your test code readable?

Posted in testing, tips on October 26th, 2009 by Mark Simpson

One of the things that really slashes the return on investment in testing is unreadable code.  “This is pretty obvious”, you say.  “What’s the point in a blog post about something so obvious?”  What’s not obvious is that the very people writing these tests are unaware of it.  Maybe you do it as well.  Given that this blog post is about things we don’t know we do, I think it’s a fair bet that I’ve also recently written test code that was convoluted without realising it, too.

It’s mostly down to testing in a vacuum.  The tests are often functionally fine.  However, as with nearly all code, maintenance is easily overlooked.  On the day that the test was written, it made sense to the author.  They understood the logic they wanted to test and how to implement it.  Code written, job done.

In the court of the test engineer, readability is king

In my opinion, readability in tests is the number one thing.  If your fellow programmers and your future self cannot decipher what a test is proving, the test becomes worthless.  If a monstrously bearded mathematics genius solved P versus NP but failed to write an understandable proof, it would all be for nothing.

It’s the same for testing.  You’ll know the shit has hit the fan when you’re refactoring something and break a load of tests.  “Balls.  I’ll fix the tests I guess”, you murmur, bleary eyed and idiot-faced.  However, when you examine the tests, you can’t figure out what they’re proving, why they’re proving it or how it’s actually proved!

You know in films when a bad guy is holding a hand grenade, then has a moment of realisation regarding an absent pin?  He slowly looks up, face furnished with a quizzical look, staring into oblivion.  That’s what a software engineer looks like when they encounter reams of broken tests that cannot be deciphered (and the person who wrote ‘em isn’t available to pick up the pieces).

Grab a friend and test your tests

To avoid these kinds of bomb-scares, do yourself a favour and have somebody else trawl through your test code without giving them any help.  If they cannot easily follow the test code’s intent, then the test needs re-written.  Everybody’s code can be improved through peer review.  Peer review is the litmus test — if somebody else cannot understand it, it’s useless.  As such, tests should be part of code reviews and buddy processes.

Think about how much time we spend reading, maintaining, refactoring and extending old code compared to writing new code, and then think about how self-defeating it is to write unreadable, un-reviewed test code.  Spending a little more time on reviewing new test code pays off in the long run.

Understanding test doubles

Posted in c#, testing, tips on August 22nd, 2009 by Mark Simpson

There is a bewildering array of types of ‘mock’ object available to a tester.  The canonical list of test doubles was popularised by the venerable Martin Fowler in his article “Mocks Aren’t Stubs” (which credits Gerard Meszaros for the vocabulary) and, to me, this list is fairly complete and makes sense.  The reason it makes sense is that I’ve manually written classes that perform these roles.

  • I needed to fill out a parameter list with non-null objects, so I created a dumb class with absolutely no implementation.
  • I wanted to listen in on additions to a list, so I wrote a class that stored the objects, acting as a spy.
  • I wanted to provide canned results to test another part of my system, so I wrote a stub.
  • I needed to ensure a method was called, so I wrote a Mock.
  • I needed a coherent, fast implementation of a class, so I wrote a fake implementation.
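To ground the five roles in code, here are hand-rolled sketches of each.  The interfaces (IAuditLog, IRateProvider, IUserStore) are hypothetical and exist only to give the doubles something to implement.

```csharp
using System;
using System.Collections.Generic;

public interface IAuditLog { void Record(string message); }
public interface IRateProvider { decimal GetRate(string currency); }
public interface IUserStore { void Save(string name); bool Exists(string name); }

// Dummy: fills a parameter list with a non-null object and does nothing.
public class DummyAuditLog : IAuditLog
{
    public void Record(string message) { }
}

// Spy: stores what it receives so the test can inspect it afterwards.
public class SpyAuditLog : IAuditLog
{
    public readonly List<string> Messages = new List<string>();
    public void Record(string message) { Messages.Add(message); }
}

// Stub: returns a canned result, whatever it is asked.
public class StubRateProvider : IRateProvider
{
    public decimal GetRate(string currency) { return 1.5m; }
}

// Mock: knows which call it expects and can fail the test itself.
public class MockAuditLog : IAuditLog
{
    private readonly string expectedMessage;
    private bool wasCalled;

    public MockAuditLog(string expectedMessage) { this.expectedMessage = expectedMessage; }

    public void Record(string message)
    {
        if (message == expectedMessage) { wasCalled = true; }
    }

    public void Verify()
    {
        if (!wasCalled)
        {
            throw new InvalidOperationException(
                "Expected Record(\"" + expectedMessage + "\") was never called.");
        }
    }
}

// Fake: a coherent, fast, working implementation (in memory, no database).
public class FakeUserStore : IUserStore
{
    private readonly HashSet<string> names = new HashSet<string>();
    public void Save(string name) { names.Add(name); }
    public bool Exists(string name) { return names.Contains(name); }
}
```

The spy and the mock look alike, but the difference matters: with the spy the test does the asserting, whereas the mock carries the expectation and fails on its own.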

My problem with testing terminology is that it’s harmful to newcomers, especially when the semantics affect the result of the test!  Not only is there a high barrier to entry when it comes to writing accurate, robust and maintainable tests, but the terminology is another unwelcome complication.

For this reason, I personally advocate keeping new testers away from mocking frameworks until they’ve become comfortable with state-based testing and hand-rolled a variety of their own test doubles.

Information Overload

Even when no additional frameworks are involved (i.e. when using vanilla xUnit), writing good unit tests involves a steep learning curve.  It’s easy to take a wrong turn and the quality of the tests written will improve only with experience/guidance.  I was not surprised when I read Roy Osherove’s blog and discovered that the majority of organisations’ attempts to embrace unit testing resulted in failure.

Mocking frameworks like Rhino Mocks are absolutely excellent tools, but it’s yet another thing to learn.  Suddenly the type of object (mock, strict mock, stub etc.) affects the result of the test.  It took me a week to get my head around it, so it doesn’t surprise me when I see newcomers totally abusing these frameworks.  Not only does this create a maintenance nightmare, but it sours their first taste of testing.

The problem is doubly hard to tackle, as Mocking frameworks allow you to write the same tests with fewer lines of code.  Developers who are new to testing see this and immediately try to use the mocking framework.  After all, only a fool would eschew such benefits, right?  Well, some people just ‘get it’ from the start.  I’m not one of these people and, in my experience, nor are most others.  Starting with interaction based testing and mocking frameworks is akin to throwing someone out of the back of a van that’s moving at 70mph and expecting them to start running when they hit the tarmac.  Chances are they’re going to land on their face.

The unfortunate result is that some folk tie themselves in knots.  They don’t understand the responsibilities of each type of testing object and, as a result, create extremely brittle or utterly pointless tests. If you have no idea of what you’re trying to achieve with a test, there is no point in writing it.  I once saw a question on StackOverflow featuring a confused fellow asking why his test did not work.  The poster was using Rhino Mocks and created a Mock of the class under test.  I.e. it was not a collaborator, it was the class under test and he was mocking it!  I suspect he was simply overwhelmed by trying to learn multiple new things.

Other things I’ve witnessed include developers writing obscure lambda expressions using C#, then chaining together Rhino Mocks methods to perform something which somehow works.  When I pointed out that I could barely infer its purpose by reading the code and that writing a hand rolled stub would probably be a better idea, I was met with “yes, you’re probably right but I want to use Rhino Mocks”.

Warning bells should also start ringing when you return a mock and assert that its method was called by another mock which returns a mock which… errrr!  It’s much easier to make a dog’s dinner of interaction-based testing; a good grounding in state-based testing is essential.

One step at a time

  1. Learn to sit up before you crawl.  Write simple xUnit tests that involve state-based testing.  It doesn’t have to be great, isolated code.  Even writing tests that involve scores of classes is a good way to start.  Finer granularity is something that comes with experience.
  2. Crawl before you walk.  Start to experiment and find better ways of testing pieces of functionality.  Ask yourself whether the test is useful, maintainable etc.  Will other parts of the system break it if they change?  Can you make the components and tests themselves finer grained?  This stage should be about developing your sense of what constitutes a good test.
  3. Walk before you run.  Begin to experiment with different types of test doubles, but hand roll them.  Yes, it’s painful at times, but it will give you a better understanding of roles in tests and the different types of test doubles, even if you don’t have names for them yet.  Furthermore, constantly having to update your hand rolled stubs when disparate parts of your class change will also give you an appreciation for the interface segregation principle.
  4. Finally, install a mocking framework and start sprinting.

    If you do sprint head-first into a wall, you will be better equipped to understand where you went wrong, as you understand the fundamentals.  You will also have a better grasp of the terminology, as it will be grounded in real, tangible code you’ve written.

    What’s in a name?

    Posted in testing, tips on May 15th, 2009 by Mark Simpson

    One of the things I try to encourage is the careful selection of names.  Just as self-documenting code is easier to read, so is a self-documenting test.  As I have previously stated, unit testing is programming, too — you can apply the same good practices to tests.

    On numerous occasions I’ve had to review some code and tests.  I open the classes and find well-named, self-documenting, lovingly crafted, carefully designed code.  Then I look at the test for that code and find that the same principles have not been applied to the tests.  In fact, it’s almost like the tests have been written by Evil Chuck, the programmer’s alter-ego.

    It’s not uncommon to see something like the hugely descriptive and easy to read:

    [Test]
    public void TestMethodName()
    {
    ....
    }

    .. the excellent

    [Test]
    public void TestMethodName4()
    {
    ....
    }

    .. and the ubiquitous

    [Test]
    public void TestConstructor()
    {
    ....
    }

    This is not a good state of affairs!

    Problems with badly named tests

    Numerous problems exist with a name like “TestSomeMethod4B”.  Here is a selection of good reasons not to name your tests like this:

    Readability
    It’s not descriptive.  In fact, it’s totally obscure.  It might as well not exist.  What can you say about “TestSomeMethod4B”? Nothing much.  Even if it is documented with a comment, it hinders readability in the IDE and the test runner.  It’s better to name something descriptively and without a comment than the other way around.  That’s not to say that comments and descriptions are redundant, but in most cases you don’t need them if the test is descriptively named.

    Intent
    It says absolutely nothing about the intent.  It might be checking an exception is thrown, it might be checking a value is clamped to an accepted range of values.  It might be doing nothing.  To be able to decipher its intent, you have to read the code.  Imagine if you couldn’t understand the intent of any part or method of a program.  How would you manage to break it up into understandable chunks?  Answer: with great difficulty.

    Obfuscation
    It defeats any kind of attempt to understand the state and thoroughness of the testing for that class as a whole.  You cannot obtain an overview.  You can’t determine whether you’ve tested a method with 10 different invalid parameters or whether you’ve tested 10 different simple cases.  Without good naming you are reduced to skimming the code while trying to remember too much.  Just as well named subroutines aid comprehension of a larger problem, well-named tests give a good overview without forcing you to examine the contents of those tests.

    Redundant Prefix
    In 99% of cases, if a public method is part of a test fixture and it has a [Test] attribute, it’s a test.  You don’t need the Test prefix.  It just hinders the skimming of the tests in alphabetic order.  If you really insist on putting “Test” somewhere, put it at the end of the name.  I personally wouldn’t bother, though.

    What are you doing?
    Finally and arguably most importantly, if you don’t name your test well, there is a greater chance that you don’t know what the test is trying to achieve. Would you start to write a production code method without any inkling as to what it did?  Even if you did, would you then leave it in existence with the name “DoSomeStuff”?

    Badly named tests are a smell.  In my experience, well-named tests are nearly always attempting to prove something regardless of the quality of the test body.  I cannot say the same for badly named tests.

    Good intent and bad execution is often better than a tidy, aimless test.  The former can be refactored into something useful, the latter requires decryption just to understand why it exists in the first place.  In many cases, the test proves nothing.

    If you write a test method and can’t think of a name that describes the test before you write it, stop.  Think about what you’re trying to achieve, then choose a name that describes it adequately.

    How do I choose a good name?

    First and foremost, what are you trying to prove with the test?  Tests are meant to demonstrate something.  They are meant to assert that some meaningful state is set, or some sort of interaction has taken place.

    You need at least three pieces of information to name a test.

    1. The thing that is being tested, such as a function, method or property (or a sequence of them)
    2. The arguments/data/circumstance involved
    3. The expected outcome.  What is meant to happen?  Be as explicit as you can.

    That’s it.  That’s all you need.  I prefer to write mine in the format 1_2_3, but I’m sure everyone has their own personal style.  As long as it’s readable and consistent, I don’t care.

    Compare and contrast

    Here are a few examples of some good test names for a bounding box class.

    • AddPoint_ValidPointOutsideExistingBounds_IncreasesBoundsToContainPoint()
    • AddPoint_ValidPointInsideExistingBounds_DoesNothing()
    • AddPoint_ValidPointOnEdgeOfBounds_DoesNothing()
    • AddPoint_InvalidPointContainingNaN_DoesNothing()
    • AddPoint_InvalidPointNull_ThrowsException()

    Now compare those to equivalent, but badly named tests:

    • TestAddPoint()
    • TestAddPoint2()
    • TestAddPoint3()
    • TestAddPointBad()
    • TestAddPointBad2()

    One set is descriptive, easily graspable and allows you to skim the members list to get a good feel for the thoroughness of the tests.  The other set of names tells us very little.

    To those who say “I don’t like long method names”, I say, “It’s a test.  You write it once and read it hundreds of times.  You never have to call into it from other code.  Your screen is plenty wide enough to accommodate it.”  There is no reason to use short or bad test names ‘just because’.  I’ve yet to hear any meaningful criticism against giving test methods long names.

    As the famous nerd quote goes (paraphrasing):

    “When I wrote this code, only God and I knew what I was doing.  Now only God knows”.

    Just think about the poor sod who has to maintain your code in a couple of years. If you didn’t know what the hell you were doing, what are they going to make of it?

    Other Advantages

    If I have some good ideas about how to test something, or if I’m writing my tests before the production code, I will often use the 1_2_3 naming system to write out scores of empty test bodies.  You may be surprised to see how effective this is.

    You can plough through a group of methods in no time at all, thinking of all of the horrible things that could go wrong and writing them down.  In no time at all, you can have a comprehensive test suite in waiting.  The names are there, the intent is clear and you can proceed.
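As a sketch, a suite in waiting for the bounding box names above might look like this.  The Assert.Inconclusive calls are my own addition rather than part of the technique as described: a truly empty body passes, which can hide the fact that nothing has been written yet.

```csharp
using NUnit.Framework;

[TestFixture]
public class BoundingBoxTests
{
    [Test]
    public void AddPoint_ValidPointOutsideExistingBounds_IncreasesBoundsToContainPoint()
    {
        Assert.Inconclusive("Not written yet.");
    }

    [Test]
    public void AddPoint_ValidPointInsideExistingBounds_DoesNothing()
    {
        Assert.Inconclusive("Not written yet.");
    }

    [Test]
    public void AddPoint_ValidPointOnEdgeOfBounds_DoesNothing()
    {
        Assert.Inconclusive("Not written yet.");
    }

    [Test]
    public void AddPoint_InvalidPointContainingNaN_DoesNothing()
    {
        Assert.Inconclusive("Not written yet.");
    }

    [Test]
    public void AddPoint_InvalidPointNull_ThrowsException()
    {
        Assert.Inconclusive("Not written yet.");
    }
}
```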

    This is also a great tactic for division of labour when you’re testing old code.  Prototype the test bodies, check in the skeleton fixture(s) and multiple people can get cracking on different areas.  I’ve done this quite successfully in the past.
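    As a rough sketch of what such a skeleton might look like, here is the bounding box example from earlier written out as empty NUnit test bodies.  The fixture name BoundingBoxTests and the use of Assert.Ignore as the placeholder are assumptions made for illustration; any marker that stops an unwritten test from showing up as a pass would do.

    ```csharp
    using NUnit.Framework;

    // Hypothetical skeleton fixture: names first, bodies later.
    [TestFixture]
    public class BoundingBoxTests
    {
        // Assert.Ignore reports the test as ignored rather than passed,
        // so an empty body can never be mistaken for real coverage.
        [Test]
        public void AddPoint_ValidPointOutsideExistingBounds_IncreasesBoundsToContainPoint()
        {
            Assert.Ignore("Not yet implemented");
        }

        [Test]
        public void AddPoint_ValidPointInsideExistingBounds_DoesNothing()
        {
            Assert.Ignore("Not yet implemented");
        }

        [Test]
        public void AddPoint_ValidPointOnEdgeOfBounds_DoesNothing()
        {
            Assert.Ignore("Not yet implemented");
        }

        [Test]
        public void AddPoint_InvalidPointContainingNaN_DoesNothing()
        {
            Assert.Ignore("Not yet implemented");
        }

        [Test]
        public void AddPoint_InvalidPointNull_ThrowsException()
        {
            Assert.Ignore("Not yet implemented");
        }
    }
    ```

    Checked in like this, the ignored tests double as a to-do list that several people can work through in parallel.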

    Benefits of designing for testability

    Posted in patterns, testing on May 11th, 2009 by Mark Simpson

    When I started my job as a Software Test Engineer, I had very little knowledge about unit testing.  I had a good degree award and a load of acronyms to put on my (in retrospect, rather horrible) CV.  I thought I knew a bit about design, encapsulation, patterns, object-oriented programming and all the rest.  With a little trepidation, I felt I was ready to face the world as a programmer.

    I applied for a few programming jobs at Realtime Worlds and did not succeed.  After the first rejection, I did not become disheartened.  I did what any sensible young chap would do: I went back to the drawing board.  I continually improved my skills, learned C#, created some new programs and re-wrote old programs with the knowledge I’d gleaned.  Every so often I’d check the website or speak to my friends who worked there, asking if anything suitable was potentially coming up.  Eventually, I landed at Realtime Worlds as a Software Test Engineer.

    In an ideal world, I would’ve succeeded first time.  After all, it was my goal to be a software engineer.

    Picture the scene

    My tremendously awesome CV plopped onto the doormat, awaiting the arrival of the hiring manager.  At 9am on the dot, that fine fellow picked it up and was instantly startled.  “Oh my!”, he cried, “Who is this remarkable young gentleman programmer who wishes to join our fine establishment?”

    The hiring manager sprinted up the stairs, burst into a CV triage meeting (cheeks purple and lungs wheezing) before hurling my golden-tinted paper across the room.  “Look! Look!  We’ve found him”, he honked.  The lead programmers threw their hats in the air, linked arms and then danced a merry dance.

    The search was over, and the party went on long into the night (though the party did involve programming a Spectrum emulator for the 3DO).

    It would’ve been great.  Well, in many ways yes.  In other ways, no.  Hindsight being 20/20, I am actually grateful that I didn’t get my first job doing normal development.  Why?

    Things wot I knew

    Firstly, bear in mind the fact that I said I thought I knew about design, encapsulation, this that and the other.  I did know a little bit, but I knew precisely nothing in the grand scheme of things.  Here’s the lowdown:

    My knowledge of patterns was the singleton and some others.  I did know some others, but it may as well have just been “singleton”.  Lalala, I can’t hear you.  I remember the shock when my friend linked me to the “Singleton considered stupid” article prior to my getting a job.  “But they’re my best mates!”, I gasped.  “Yeah, but they’re stupid”, he replied, before jabbing me in the eye with a stick and berating me for my incompetence.

    My idea of simplifying problems by breaking them into smaller systems usually involved multiple managers all interacting via singletons.

    My idea of extensible software was using horribly complicated, deep inheritance hierarchies everywhere.  Yeah let’s make this base class and then…

    Pretty much everything I wrote was tightly coupled.  I thought that I had abstracted things away, but in general I just moved problems around.  Nearly every class relied on multiple custom, concrete types.  I never used factories.

    I relied on implementation details.  I often reached into classes several levels deep.  House->GetKitchen()->GetSink()->GetTap();  I didn’t just break the Law of Demeter, I dropped it on the ground and used its smashed remains as a (crap) bouncy castle.

    I could go on.  In short, as a dumb graduate, I was interested in my craft and enjoyed programming, but I had some bad habits and didn’t understand why a lot of the things I was doing were flat-out wrong, unmaintainable and diametrically opposed to the principle of least surprise.  However, sometimes you don’t find out about these things until you’re forced to broach a particular topic.

    The reason I’m so glad to be a software test engineer is that all, and I mean all, of these coding horrors were laid bare when I started learning how to test properly.  When your code is fundamentally untestable, there is no denying it.

    Sowing the seeds

    As a newbie, I was tasked with writing tests for a lot of our existing codebase.  This meant I was exposed to a lot of different structures and idioms of production code written by everyone. Some of it was very easy to test; other parts not so much.

    We’re a games company and unit testing is not something that has gained widespread acceptance in games.  Back then, it was no surprise that the results were variable.  I spent months writing loads of tests and it took me a long time to feel like I was doing it in a way that I was happy with.

    Anyway, rewind back to when I sucked more than I suck now.  Even though I had no idea about what testability concerns should be, I quickly learned through doing.  I read articles and kept attempting to write better tests.

    After a couple of weeks, I started rocking back and forth if I saw a static class.  “Oh God”, I’d cry, “What do I need to initialise then tear down this time?”

    After a month, I hated the sight of any class containing scores of methods, lengthy method bodies, multiple indentations of control flow logic etc.  “Jeez, what do I need to do to hit this side of the double nested if statement that’s part of a switch which is called in a chain of 8 private methods?  If only the logic were split up into nicer chunks…”

    After another month, I started to wonder why many of the tests I was writing were so slow, given that I was only interested in testing one class at a time.  “Oh man!”, I’d exclaim.  “Why do I have to create all these slow things?  Bah, I can’t even get at the logic I want to test! If only I could instantiate only what I need and totally control the test environment…”

    After another month, I reasoned that if an interface were to be provided and the dependencies could be abstracted away into ‘seams’, it made the code infinitely more testable.  “Hmm, I see why loose coupling is good…”

    After another month, I wondered why some of the tests were so fragile.  When some system or other changed, the tests for an unrelated class would fail!  “Oh… that’s why singletons are frowned upon.  The dependencies are hidden!”

    After another month, a workmate found Misko Hevery’s guide to testability and the penny dropped.  Like Google’s testing blog logo — it was like switching on a light bulb!

    Beyond that day, I’ve kept learning more and more techniques to use as part of my development and testing arsenal.  Making my code more testable was the goal, but it has given me so much more.  Testability is a great thing, but as with all software engineering techniques, it is not a silver bullet.

    The most important thing is that, with seemingly no concerted (separate) effort on my own part, the code I write to be testable is magically a lot better than the way I used to write it.
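    To make the ‘seams’ idea from above a little more concrete, here is a minimal sketch.  Every name in it (IClock, SystemClock, FixedClock, HappyHourPricing) is hypothetical and invented for illustration; the point is only that the class under test receives its dependency through an interface, so a test can swap in something it fully controls.

    ```csharp
    using System;
    using NUnit.Framework;

    // The seam: callers depend on this interface, not on a concrete clock.
    public interface IClock
    {
        DateTime Now { get; }
    }

    // Production implementation; reads the real system time.
    public class SystemClock : IClock
    {
        public DateTime Now { get { return DateTime.Now; } }
    }

    // Hand-rolled fake: always reports the time it was constructed with.
    public class FixedClock : IClock
    {
        private readonly DateTime now;
        public FixedClock(DateTime now) { this.now = now; }
        public DateTime Now { get { return now; } }
    }

    // Class under test.  Its only dependency is handed in via the
    // constructor, so nothing is hidden behind a singleton or a static.
    public class HappyHourPricing
    {
        private readonly IClock clock;

        public HappyHourPricing(IClock clock)
        {
            this.clock = clock;
        }

        // Half price from 17:00 up to (but not including) 19:00.
        public decimal PriceFor(decimal basePrice)
        {
            int hour = clock.Now.Hour;
            return (hour >= 17 && hour < 19) ? basePrice / 2 : basePrice;
        }
    }

    [TestFixture]
    public class HappyHourPricingTests
    {
        [Test]
        public void PriceFor_DuringHappyHour_HalvesPrice()
        {
            var clock = new FixedClock(new DateTime(2009, 5, 11, 17, 30, 0));
            var pricing = new HappyHourPricing(clock);

            Assert.That(pricing.PriceFor(10m), Is.EqualTo(5m));
        }

        [Test]
        public void PriceFor_OutsideHappyHour_ReturnsBasePrice()
        {
            var clock = new FixedClock(new DateTime(2009, 5, 11, 12, 0, 0));
            var pricing = new HappyHourPricing(clock);

            Assert.That(pricing.PriceFor(10m), Is.EqualTo(10m));
        }
    }
    ```

    Neither test touches the real clock, so both are fast and give the same answer whatever time of day they run.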

    Inadvertent Benefits

    As Luke Halliwell succinctly pointed out a while back, testability concerns and good design practices tend to converge.  If your code is testable, there’s a greater chance that the problem has been broken down into units of work.  Read his summary; it definitely coincides with my own experiences thus far.

    Actively seeking out solutions to make code more testable has resulted in some extremely valuable lessons: it has exposed me to new techniques (such as Dependency Injection) and ways of thinking.  These didn’t just alter the way I tested, they fundamentally altered the way I approach the writing of software.  In the ~x or so years I’ve been programming, designing for testability has been my single most valuable expedition!

    Am I suddenly the world’s greatest programmer?  Far from it.  I know scores of folk at work who can program circles around me.  On the other hand, is my code easier to understand, more maintainable, more cohesive and less tightly coupled compared to what I was writing a year or so ago?  Undoubtedly.  Are there fewer surprises?  You bet. Would anyone who had to maintain my code be inclined to hunt me down and murder me?  No.  They may perform some sort of grievous wounding, but I will live.

    I can’t believe how much I sucked.  I definitely suck less this year, though.

    The big block method

    Posted in c#, debugging, testing on May 2nd, 2009 by Mark Simpson

    Have you ever been in this situation? You have thousands of tests in scores of assemblies.  All of the tests pass.  However, when you run the test suite a second time without closing NUnit (or your test runner of choice) you find hundreds of failures occur in a specific area.  I’m not talking about in the same fixture or even the same assembly; this is NUnit wide. Something is trashing the environment, but there are no obvious warning signs.

    So, we have thousands of tests and the problem could be anywhere.  The answer is obviously not “look through all the tests” or “disable one project at a time”; there has to be an easier way…

    Unrelated, but applicable

    This just happened to me, but tracking down the culprit wasn’t as bad as you’d think.

    Something I learned as a budding level designer (circa 1999) was how to find a leak in my level.  A leak in level design occurs when the world is not sealed from the void.  A decent analogy is to imagine the inside of the level is a submarine and the walls are the ‘hull’: if there is a gap anywhere in the hull, the water will get in; it will leak.

    A leak could be something as tiny and obscure as a 1 unit gap between a wall and a floor.  Most walls are 128 to 256 units, so a 1 unit gap is very small.  Even now, it’s not really feasible to find one in the editor unless you know exactly where it is.

    Half-Life’s goldsrc engine was BSP based; the visibility computations were performed at map compile time.  A failure to build VIS data meant that your level caused the game to run at about 3 frames per second.

    Unfortunately, tools were really … rudimentary back then.  Picture Borat’s wife ploughing a field and then contrast the image with that of a bloody huge combine harvester.  That’s how far we’ve come.  These days pretty much every editor has built-in pointfile loading (meaning it will take you directly to the leak!) but back then, you had to be creative.

    The big block method

    Back when tools weren’t so great, to find leaks in a level, I used the big block method.  It’s a very simple technique.  Say we have a rubbish, leaky level like so (top down view):

    [Figure: A leaky level, yesterday.]

    If one of those connections between walls/floors/ceilings/whatever is not tight, it will leak.  We cannot see the site of the leak using our eyes.  We cannot be sure where the leak is by simply scrutinising each wall joint or entity.  What we can do instead, though, is place a big block over ~50% of the level.

    [Figure: 50% of the map is now covered.  The red area is a newly created solid block.]

    If we compile and find that the leak has disappeared, we know that the leak was definitely in the area that is now covered by the block.  On the other hand, if the leak is still present, it’s in the other 50% of the level that remains uncovered.  To home in on the problem area, all we have to do is recursively add blocks to the problem area:

    [Figure: A smaller block, half the size of the previous one, has been added.]

    We then recompile and check to see if the leak has disappeared as before.  Notice that in two steps, we’ve narrowed down the problem’s location to an area 25% of the original size!  The next step will narrow it to 12.5%.  We quickly home in on the problem.

    Same thing, different discipline

    Applying the same principle to finding problem tests or code is simple!  Divide and conquer.

    Open the NUnit test project file and remove 50% of the projects (though in my case, I kept the assembly with the tests that failed, as I needed to see them fail on the second run to know the problem had occurred).  Run the tests twice to see if the failure occurs on repeated runs.  If the tests still fail on the second run, you know your problem is in the group of assemblies that is still enabled.  If they pass, you know the problem is in the half you removed.

    It’s then a case of whittling it down in the same way — disable a further 25% of your assemblies, run the tests twice and check the result.  Rinse, repeat.

    Eventually you will (most likely!) be down to two assemblies: the assembly that exposes the problem and the assembly that causes it.  If there’s a large number of tests and fixtures in the assembly you’re scrutinising, disable half of the fixtures and repeat the process.  You will rapidly converge on a fixture and, finally, the test that causes the problem.  From then on, it’s just standard debugging.
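    The halving procedure above is really just a binary search, and can be sketched in code.  This is an illustration under stated assumptions, not something NUnit provides: CulpritFinder and the failureReproduces callback are hypothetical, the callback stands in for “enable only these assemblies (plus the one whose tests break), run the suite twice and report whether the second run fails”, and exactly one assembly is assumed to be at fault.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class CulpritFinder
    {
        // candidates: assemblies that might be trashing the environment.
        // failureReproduces: hypothetical callback; returns true if the
        // failure still shows up with only the given assemblies enabled.
        public static string FindCulprit(
            IList<string> candidates,
            Func<IList<string>, bool> failureReproduces)
        {
            if (candidates.Count == 0)
                throw new ArgumentException("No candidates to search.", "candidates");

            List<string> remaining = candidates.ToList();

            while (remaining.Count > 1)
            {
                // The "big block": split what is left into two halves
                // and only enable the first one.
                int half = remaining.Count / 2;
                List<string> enabled = remaining.Take(half).ToList();
                List<string> disabled = remaining.Skip(half).ToList();

                // Still failing?  The culprit is in the enabled half.
                // Passing?  It must be in the half we switched off.
                remaining = failureReproduces(enabled) ? enabled : disabled;
            }

            return remaining[0];
        }
    }
    ```

    With five candidate assemblies this needs at most three double runs rather than five, and the saving grows quickly as the number of assemblies goes up.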

    In my case, the culprit ended up being a single line of code calling into a method that has been a long-standing part of our code base.  It looks totally innocuous, and there is absolutely no way I’d have found it so quickly without dividing and conquering.

    From the top level, with scores of assemblies and thousands of tests, it may as well be a 1 unit gap.

    The pride of a programmer

    Posted in testing on April 18th, 2009 by Mark Simpson

    I was thinking about this the other day, and something struck me (and no, it wasn’t a disgruntled developer).  Automated testing is a valuable and widely accepted part of the software engineering process.  Many companies and organisations require that functionality be well tested prior to it being checked in and integrated into the main version control branch.  As a result, developers can spend hour upon hour every week writing unit tests.  In many cases, the quantity of test code can be more than double the code being tested!

    If you give any programmer a problem to solve, their brains will start whirring and they will come up with numerous solutions.  They will draw upon past experiences, new technologies and articles/books they’ve read.  They will use specific parts of the language to elegantly solve the problem in a terse and expressive fashion.  When they come up with a really cool solution, others will marvel at it and learn from it.

    In short, most programmers will think critically about problems before, during and after solving them.

    The indifference of a unit tester

    So, why is it that some folk just stop thinking when they put on their unit testing hat?  I fully understand that testing is not as glamorous or fun as banging out production code, but that’s no reason to accept sub-par test code.

    Given that unit testing is part of the software development cycle, I personally think that anyone who phones in their test code is doing themselves a disservice.  If you do the minimum possible amount of work, do not think critically about what you’re doing and ultimately learn nothing from it, then that’s a lot of time that is thrown away every week.

    It also doesn’t say much about you as a developer if you phone it in.  “Pff, functioning software, who needs that?”

    Test code is still code.  You have to write a lot of it.  It is fundamentally entwined with the production code in that it is vital to proving the solution works.  Moreover, if you test in a brain-dead fashion and fail to draw upon any of the faculties used when writing production code, you are often making more work for yourself later on, too.  I usually cringe at the phrase “work smarter, not harder”, but in this case it is often applicable.

    Testing informs design

    Firstly, the most obvious point is that, if it’s hard to test something due to it being at the mercy of the environment, static/global state, hard wired dependencies and such, then it’s probably a sign that your code needs refactored.  If you want to test a loop exit condition and end up having to connect to a real database to do so, you know you’ve got problems.

    If you simply shrug your shoulders and plough on, you’ll probably run into all kinds of obstacles.  For example, you may have to initialise the environment to be just right for your test, then carefully negate the results of the test in the teardown.  But then what if the test order matters?  How will you know?  I’ve been in this situation before and it’s not pleasant.  It was such a battle getting the environment configured correctly for each test that I lost the will to live.  Plus the tests ran incredibly slowly.

    One of the major benefits of unit testing is that it encourages you to split up functionality into discrete units.  It just so happens that the discrete units are more easily understood as a result.  I liken it to building blocks.  Instead of having a monolithic structure (or, as less kind folk would call it, a big ball of mud), you have a load of little blocks.  Not only are the blocks much easier to understand on their own, but they are much nicer to work with.  I won’t bang on about this much because it’s nothing new (see Test Driven Development).

    I don’t always practice TDD, but at the very least I try to test classes/methods not long after I’ve written them.  Putting it off any longer tends to result in having to write a lot of test code at once, which sucks.  If you don’t like writing test code and you save up the tests for later, it’s like putting off doing your school homework.  It hangs over you like a cloud… and then you have to do it all at once on a Sunday night.  Why punish yourself? :)

    Unit testing is programming, too

    Naively attacking a testing problem may result in generating reams of test code that proves very little or, worse, has negative value.

    When writing tests, I am constantly looking for the following warning signs:

    • You have to configure the environment / concrete collaborators in a fashion that you barely understand yourself (how is anyone else meant to deal with this?)
    • You can’t easily test certain conditions or logical branches due to tight coupling (e.g. trying to test the behaviour of a class when one of its collaborators throws an exception that could occur in production systems, such as a network connection exception)
    • The test code is disproportionately large compared to the class under test due to repetition.  The DRY principle isn’t just for production code (though tests should favour readability if push comes to shove).
    • The test code is convoluted.  It is not clear what is going on, the test names are poor and it doesn’t help you understand the class under test.
    • The tests are brittle, causing them to frequently break, and break badly (e.g. a change in a concrete collaborator breaks tests that shouldn’t really depend on it, resulting in a debugging session rather than a simple fix…)
    • The test code could be refactored to use patterns to cut down on the noise (see previous Test Data Builder post).
    • The tests do not seem to prove anything.  If a test calls some methods with trivial cases and does very little verification, then what’s the point in writing the test?

    If I encounter one or more of these signs, I try to think about whether the class under test or the tests themselves would benefit from refactoring.
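    The second warning sign (not being able to test what happens when a collaborator throws) tends to vanish once the collaborator sits behind an interface.  Here is a minimal sketch; IScoreUploader, ScoreSubmitter and FailingUploader are hypothetical names made up for the example, and IOException simply stands in for whatever network exception your production code would really see.

    ```csharp
    using System.IO;
    using NUnit.Framework;

    // Hypothetical collaborator that talks to the network in production.
    public interface IScoreUploader
    {
        void Upload(int score);
    }

    // Class under test: reports failure instead of letting the exception escape.
    public class ScoreSubmitter
    {
        private readonly IScoreUploader uploader;

        public ScoreSubmitter(IScoreUploader uploader)
        {
            this.uploader = uploader;
        }

        public bool TrySubmit(int score)
        {
            try
            {
                uploader.Upload(score);
                return true;
            }
            catch (IOException)
            {
                return false;
            }
        }
    }

    // Stub that simulates a dropped connection without touching a network.
    public class FailingUploader : IScoreUploader
    {
        public void Upload(int score)
        {
            throw new IOException("Simulated network failure");
        }
    }

    [TestFixture]
    public class ScoreSubmitterTests
    {
        [Test]
        public void TrySubmit_UploaderThrowsIOException_ReturnsFalse()
        {
            var submitter = new ScoreSubmitter(new FailingUploader());

            Assert.That(submitter.TrySubmit(100), Is.False);
        }
    }
    ```

    With a hard-wired concrete uploader, the only way to hit that catch block would be to pull out a network cable at the right moment.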

    Pointless Tests

    The last point in the list is particularly salient.  Why write a test if it doesn’t prove anything? I find this to be the most insidious problem of all.

    In general development, no self respecting programmer would repeatedly write code that:

    • Has no purpose
    • Looks like it’s doing the job correctly, but actually isn’t
    • Sets a bad example to anyone else reading the code
    • Sets good practices to one side (e.g. loads of repetition, unrepresentative method names)

    Programmers should hold themselves to the same standards when writing tests.  Bad testing is arguably worse than no testing at all.

    Here’s an example of a pointless test:

        [Test]
        public void SaveGameTest()
        {
            var game = new Game();
            var serializer = new GamePersistence();
    
            using(Stream stream = new MemoryStream())
            {
                serializer.Save(game, stream);
            }
    
            Assert.IsTrue(true);
        }

    What does this prove?  Nothing.  At the very most, it proves that you won’t encounter an exception when saving a game in the most trivial of cases.

    What does it do that is harmful?  Lots.  It took time to write.  It takes up space in the solution explorer / code file.  It gives the impression that functionality has been tested in some capacity.  It skirts over areas of concern.  It sets a bad example to other developers who may look to existing tests for an example of testing this sort of thing.

    Finally, anyone who writes test code like this will probably find that it makes them hate testing even more than before, because they got absolutely nothing out of it.  Bad tests rarely find any bugs because they’re not asking the right questions, or any questions at all.  If no bugs are found, then the testing pass feels like a waste of time, further reinforcing the dislike for testing.
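    For contrast, here is a sketch of a save game test that does ask a question: does what comes back out match what went in?  The Game and GamePersistence classes below are hypothetical stand-ins written so the example is self-contained (the Load method, the PlayerName and Score properties and the binary format are all assumptions); the real classes would obviously differ.

    ```csharp
    using System.IO;
    using NUnit.Framework;

    // Hypothetical stand-in for the real Game class.
    public class Game
    {
        public string PlayerName { get; set; }
        public int Score { get; set; }
    }

    // Hypothetical stand-in for the real persistence class.
    public class GamePersistence
    {
        public void Save(Game game, Stream stream)
        {
            // The writer is deliberately not disposed, so the caller's
            // stream stays open for reading back.
            var writer = new BinaryWriter(stream);
            writer.Write(game.PlayerName);
            writer.Write(game.Score);
            writer.Flush();
        }

        public Game Load(Stream stream)
        {
            var reader = new BinaryReader(stream);
            // Read the fields back in the same order they were written.
            string playerName = reader.ReadString();
            int score = reader.ReadInt32();
            return new Game { PlayerName = playerName, Score = score };
        }
    }

    [TestFixture]
    public class GamePersistenceTests
    {
        [Test]
        public void SaveThenLoad_GameWithNameAndScore_RestoresSameValues()
        {
            var original = new Game { PlayerName = "Player One", Score = 1234 };
            var persistence = new GamePersistence();

            using (var stream = new MemoryStream())
            {
                persistence.Save(original, stream);
                stream.Position = 0; // rewind before reading back

                Game loaded = persistence.Load(stream);

                Assert.That(loaded.PlayerName, Is.EqualTo("Player One"));
                Assert.That(loaded.Score, Is.EqualTo(1234));
            }
        }
    }
    ```

    It is barely longer than the pointless version, but it can actually fail: swap the write order in Save or drop a field and the assertions will tell you.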

    In Summary

    Testing is part of the software development process. It has its own language, methodologies, patterns, best practices etc.

    If you apply the same critical thought processes and rigour to testing that you apply to production code, you will write better software with fewer bugs.  If you phone it in, you’ll receive a return call down the line…