
AutoMapper and Test Data Builders

I’ve recently been tinkering with WCF and, as many people already know, writing data transfer objects is a pain in the balls.  Nobody likes writing repetitive, duplicate and tedious code, so I was delighted when I read about AutoMapper.  It works really nicely;  with convention over configuration, you can bang out the entity => data transfer code in no time, the conversions are less error prone, the tests stay in sync and you’re left to concentrate on more important things.

Anyway, I immediately realised that I’ve used the same pattern in testing — with property initializers & test data builders.  I’ve posted before about Test Data Builders and I’d recommend you read that post first.

For small test data builder classes, it’s really not that big a deal.  For larger classes, using AutoMapper is quite useful.  For example, for testing purposes we’ve got an exception details class that is sent over to an exception logging service.

Every time the app dies, we create an exception data transfer object, fill it out and then send it over the wire.  When unit testing the service, I use a Test Data Builder to create the exception report so that I can vary its properties easily.  Guess what?  The test data builder’s properties map 1:1 with the exception report — hmm!

So, rather than create the same boilerplate code to map the 10+ properties on the exception builder => data transfer object, I just used AutoMapper to handle the mapping for me :)
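To give a flavour of it, here’s a minimal sketch (the ExceptionReportBuilder/ExceptionReportDto names and properties are invented for illustration, and it uses AutoMapper’s classic static API):

using AutoMapper;

public class ExceptionReportDto
{
    public string MachineName { get; set; }
    public string Message { get; set; }
    public string StackTrace { get; set; }
}

public class ExceptionReportBuilder
{
    // The builder's properties mirror the DTO 1:1, so AutoMapper's
    // conventions can work out the whole mapping by themselves.
    public string MachineName { get; set; }
    public string Message { get; set; }
    public string StackTrace { get; set; }

    static ExceptionReportBuilder()
    {
        // The static constructor usage mentioned below: the map is
        // created once, the first time the builder type is used.
        Mapper.CreateMap<ExceptionReportBuilder, ExceptionReportDto>();
    }

    public ExceptionReportDto Build()
    {
        return Mapper.Map<ExceptionReportBuilder, ExceptionReportDto>(this);
    }
}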

I’ve had good results with this approach.  The only bit I’m remotely concerned about is creating the mapping in the static constructor.  Any AutoMapper gurus out there who can say whether there’s any reason I shouldn’t do that?

Avoiding the file system

Speaking from experience, and as illustrated by Misko’s recent presentation, the more dependencies you have on your environment, the less trustworthy and maintainable your tests become.  One of the foremost offenders in this area is touching the file system.

Any time someone says “hey, I’m trying to open a file in a unit test…”, my first reaction is to say “woah”, and not in the “I know Kung Fu” way!  If you introduce a dependency on the file system, bad things are more likely to happen.  You now depend on something that may not be there/accessible/consistent etc.  Ever written a test that tried to access a common file, or read a file that something else may write to?  It’s horrible.

It is for these reasons that many folk will say “it’s not a unit test if it hits the file system”.  In particular, if you have a TestFixtureSetUp/TearDown method that deletes a file, it’s a sure sign that the fixture is going to flake out at some point.

A real example

Recently at work, my colleagues have experienced the joy of a huge refactoring job pertaining to restructuring our projects/solutions to reduce build times and increase productivity.  This work included maintaining something close to ten thousand tests.

As the job progressed, they kept finding that some test fixtures did not live in isolation.  The tests depended on various things they shouldn’t have and, most saliently, file dependencies proved to be a total pain in the balls.  Everything built OK, but when run, the tests failed due to missing files.  Paths and file attributes had to be checked (Copy if newer, etc.), lost files had to be hunted down and so forth.  It’s hassle that people don’t need!  As I’ve stated before, when it comes to testing, maintenance is king.

Anyway, if you have a hard dependency on the file system, consider the alternatives.  This is not a hard rule, as it is not suitable for all cases, but it is always worth thinking about.

Alternative approaches

Firstly, abstract the file operations to some degree.  You can do numerous things here, from changing the internal loading strategy (via Dependency Injection) to — even better — separating the loading/use of the file so that the consumer of the file’s contents doesn’t even have to care about the loading strategy.

Once you’ve done this, you no longer need to use files in your unit tests, as you can use plain ol’ strings, streams or even just directly construct instances to represent the contents of the file.

Say we had a simple class called “MyDocument”, and MyDocument could be loaded from a .doc file on disk.  The simplest approach would be to do something like this:

Approach One
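Something along these lines (a sketch; the .doc parsing itself is elided):

public class MyDocument
{
    public string Body { get; private set; }

    // The document loads itself straight from disk: a hard file
    // system dependency baked into the constructor.
    public MyDocument(string path)
    {
        string contents = System.IO.File.ReadAllText(path);
        // ...real .doc parsing would happen here...
        Body = contents;
    }
}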

Depending on your needs and the demands of the user, this may be OK.  However, to test MyDocument’s methods/properties properly, we need access to the file system to instantiate it.  If our tests are to be robust & fast, we need something better.  Here’s something that’s testable:

Approach Two
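A sketch of the idea (FileDocumentLoader is an invented name):

public interface IDocumentLoader
{
    string Load();
}

public class FileDocumentLoader : IDocumentLoader
{
    private readonly string path;

    public FileDocumentLoader(string path)
    {
        this.path = path;
    }

    public string Load()
    {
        return System.IO.File.ReadAllText(path);
    }
}

public class MyDocument
{
    public string Body { get; private set; }

    // Tests can now feed in a stub IDocumentLoader that returns a
    // canned string and never touches the disk.
    public MyDocument(IDocumentLoader loader)
    {
        Body = loader.Load();
    }
}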

From a testability standpoint, this is slightly better, as we can now feed in an IDocumentLoader instance which the MyDocument constructor uses to get the data.

On the flipside, the MyDocument type now needs to know about IDocumentLoader.  To load a document, the user now needs to know about creating an IDocumentLoader and feeding it into the constructor — it’s more complicated.  I often see people do this as the default step for abstracting their file operations — they alter the code to make it testable, but fail to spot the problems it brings if done at the wrong ‘level’.  If you gnash your teeth every time you have to use your own code, it’s a warning sign that something is wrong.

When we think about it though, why should MyDocument need to know about loading strategies?  In many cases, we can parse a file and produce some output using a factory or builder instead.  E.g.:

Approach Three
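Roughly like this (a sketch; DocumentLoader does the reading and parsing, and MyDocument is just built from the results):

public class MyDocument
{
    public string Title { get; private set; }
    public string Body { get; private set; }

    // MyDocument is now a plain object; it knows nothing about
    // files or loading strategies.
    public MyDocument(string title, string body)
    {
        Title = title;
        Body = body;
    }
}

public class DocumentLoader
{
    public MyDocument Load(string path)
    {
        return Parse(System.IO.File.ReadAllText(path));
    }

    // Parsing is split from reading, so it can be exercised with
    // plain strings and no file system at all.
    public MyDocument Parse(string contents)
    {
        // ...real parsing would pick the title out of the contents...
        return new MyDocument("A title", contents);
    }
}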

To clarify how this works: The DocumentLoader would parse the .doc file and construct the object instances required to build up a real document, then pass them into the document’s constructor (or build up the document via some other means, such as iteratively calling methods on a blank document to fill it up as new items are found — whatever makes sense).  This totally decouples the loading and instantiation, meaning we can test each step in isolation.

I.e. the flow goes: Read Input => Parse Input => Create Document from Parsed Input

Life after the File System

Once you’re no longer dependent on the file system, you are free to use one of many strategies for loading/creating your type.  Depending on the abstraction, some options include:

  • Just declare your data inline in the test methods as a const string, or as a const/readonly field of the test fixture.  This works well for small amounts of text.
  • Add your test files as text file resources.  You can then access the file contents as a static string property.  This is handy, as you get a strongly typed resource name and don’t need to mess around with paths + copying files.  This works well for larger sets of data, or data you want to re-use in multiple tests.
  • Use embedded resources & GetManifestResourceStream (see the sketch below).  This is slightly messier; it doesn’t require copying files, but it does require that the namespace + filenames used to reference the embedded resources are correct (runtime failures ahoy).  You also need to handle streams when using this method.
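For the embedded resource route, the plumbing looks something like this (the resource name is hypothetical):

using System.IO;
using System.Reflection;

public static class TestResources
{
    // The resource name is namespace-qualified and only a string,
    // so a typo fails at runtime rather than at compile time.
    public static string Read(string resourceName)
    {
        Assembly assembly = Assembly.GetExecutingAssembly();
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }
}

// Usage:
// string contents = TestResources.Read("MyTests.TestData.Sample.doc");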

If my loading logic deals with strings, I tend to just build an ‘inner’ parser that works with the strings, then wrap it in another class that opens and reads files, then passes the (raw) string to the ‘inner’ parser class.  This allows me to thoroughly test the parsing logic independent of the file system, but also means I can re-use it for test classes or other cases.  I.e. I can exercise more of the production code without any of the file system pain :)
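As a sketch of that split (Report and its parser are invented names):

// The 'inner' parser knows nothing about files...
public class ReportParser
{
    public Report Parse(string contents)
    {
        // ...real parsing of the raw string would happen here...
        return new Report(contents);
    }
}

// ...while a thin wrapper deals with the file system.
public class ReportFileReader
{
    private readonly ReportParser parser = new ReportParser();

    public Report Read(string path)
    {
        return parser.Parse(System.IO.File.ReadAllText(path));
    }
}

public class Report
{
    public Report(string contents)
    {
        Contents = contents;
    }

    public string Contents { get; private set; }
}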

Depending on the thing being loaded, this isn’t always the best solution, but for relatively simple loading I tend to favour this method.

Strongly typed commandline arguments

I’ve read quite a bit about Static Reflection and found it to be very appealing, but I hadn’t used it… until now!  Please have a quick look at the article, as I’m not going to parrot its key points; instead, I’m going to write something that is horrendously over-engineered to solve a trivial problem!  P.S. I apologise for interchanging arguments/parameters throughout this post.  My attention span is akin to that of a hey did anyone play Batman yet?

A bit of background on something I’m working on:  I have a c# app that is responsible for starting other processes.  Every now and then, the arguments/parameters go out of sync — I change the parameter list in the callee process and the caller, with its piddly weakly-typed guesses, causes the callee to bomb out as the arguments supplied do not match the parameters required.

E.g. I’d write something like:
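A sketch of the weakly typed version (the callee name and switch values are illustrative):

public class CalleeLauncher
{
    public static void StartCallee()
    {
        // The arguments are just a string; if the callee renames or
        // removes a parameter, this still compiles without complaint.
        var startInfo = new System.Diagnostics.ProcessStartInfo
        {
            FileName = "Callee.exe",
            Arguments = "/IceCream:Strawberry /Doors:2"
        };
        System.Diagnostics.Process.Start(startInfo);
    }
}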

But oh, oh no!  I’d renamed “IceCream” to “JimmyNeedsSomeIceCream” or removed it.  Since there is no communication between the two processes, it’s hard to write an integration test that proves all is well and, besides, most of the time it’s just down to me renaming or removing an argument.  When it bombs out, I then have to go setting breakpoints in the callee or trawling through log files.  Not ideal.  I’d rather the compiler told me when there was an obvious problem.  So, my challenge was to make the command line arguments more robust.

You need a target to hit

Firstly, I’ll say that if you have non-trivial arguments to parse, the first piece of the puzzle is to grab a good parser.  I use one written by Peter Hallam (the link on his blog forwards you to a defunct site, but you can find the source in loads of open source projects) which works really nicely. I think I’d rather stop a bus with my face than write another crap command line parser, so it’s always nice to drop an existing, proven one in.

Here’s an example arguments class.  Notice that the types are not strings, they can be any simple type that can be parsed:
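Along these lines (a cut-down sketch; the real class decorates each field with the parser’s attributes, which I’ve omitted here):

public class CalleeArguments
{
    // Each public field becomes a switch, e.g. /IceCream:Strawberry.
    // Note the types: the parser converts /Doors:2 to an int for us.
    public string IceCream;
    public int Doors;
    public bool Verbose;
}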

Anyway, now you have the parameter names, half of the battle is won.  Move the arguments class into a common assembly that both the caller and the callee can access.  The next step is using those parameter names in a strongly typed fashion.

Latch on to the target

For this, static reflection fits the bill.  Using a slightly modified version of the static reflection code, you can already write code like this:
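For example (using the StaticReflection helper shown below):

// Gets the string "IceCream" without ever typing it: if the field
// is renamed or removed, this line no longer compiles.
string name = StaticReflection.GetMemberName<CalleeArguments>(a => a.IceCream);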

In the above snippet, I’m getting the name of a field or property of a type using static reflection.  If the field or property disappears, the compiler will tell me.  If it is renamed, the code will update as with normal refactoring.

Here’s my barely modified example based on the first link in this post:
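A sketch of the helper (the usual expression tree walk; the method names are mine):

using System;
using System.Linq.Expressions;

public static class StaticReflection
{
    public static string GetMemberName<T>(Expression<Func<T, object>> expression)
    {
        return MemberName(expression.Body);
    }

    private static string MemberName(Expression expression)
    {
        // A straight property/field access, e.g. a => a.IceCream.
        var member = expression as MemberExpression;
        if (member != null)
        {
            return member.Member.Name;
        }

        // Value typed members get wrapped in a Convert node when the
        // lambda is typed as Func<T, object>, so unwrap and recurse.
        var unary = expression as UnaryExpression;
        if (unary != null)
        {
            return MemberName(unary.Operand);
        }

        throw new ArgumentException("Expression is not a member access");
    }
}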

Let’s take it a bit further, Jimmy.  It doesn’t take much effort at all to write a command line argument builder based on the same principles.  The existing Static Reflection code is operating on an expression tree.  All we need to do is receive an expression tree from the user (and, if applicable, a value to go with the param name) and add it to our argument string.

5 minutes later

I wrote a 5 minute job that contained a StringBuilder and added some formatting and a nice fluent interface.  Behold!
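Something in this vein (a sketch built on the StaticReflection helper above; the names are mine):

using System;
using System.Linq.Expressions;
using System.Text;

public class ArgumentBuilder<TArgs>
{
    private readonly StringBuilder arguments = new StringBuilder();

    // Adds a /Name:value pair, with the name pulled out of the
    // expression tree, so renames show up at compile time.
    public ArgumentBuilder<TArgs> With(
        Expression<Func<TArgs, object>> member, object value)
    {
        arguments.AppendFormat("/{0}:{1} ",
            StaticReflection.GetMemberName(member), value);
        return this;
    }

    public string Build()
    {
        return arguments.ToString().TrimEnd();
    }
}

// Usage:
// string args = new ArgumentBuilder<CalleeArguments>()
//     .With(a => a.IceCream, "Strawberry")
//     .With(a => a.Doors, 2)
//     .Build();   // "/IceCream:Strawberry /Doors:2"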

So there we go.  I can now modify my command line arguments in one project and all callers will be brought up to speed at compile time (or the compiler will tell me that they’ve gone out of sync).  Of course, they might still be semantically incorrect, but hey, you can’t have everything.

You could probably extend this further by adding extra validation to the values of the arguments or by setting it on fire.

Understanding test doubles

There is a bewildering array of types of ‘mock’ object available to a tester.  The canonical list of test doubles was probably coined by the venerable Martin Fowler in his article “Mocks Aren’t Stubs” and, to me, this list is fairly complete and makes sense.  The reason it makes sense is that I’ve manually written classes that perform these roles.

  • I needed to fill out a parameter list with non-null objects, so I created a dumb class with absolutely no implementation.
  • I wanted to listen in on additions to a list, so I wrote a class that stored the objects, acting as a spy.
  • I wanted to provide canned results to test another part of my system, so I wrote a stub.
  • I needed to ensure a method was called, so I wrote a Mock.
  • I needed a coherent, fast implementation of a class, so I wrote a fake implementation.

My problem with testing terminology is that it’s harmful to newcomers, especially when the semantics affect the result of the test!  Not only is there a high barrier to entry when it comes to writing accurate, robust and maintainable tests, but the terminology is another unwelcome complication.

For this reason, I personally advocate keeping new testers away from mocking frameworks until they’ve become comfortable with state-based testing and hand-rolled a variety of their own test doubles.
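To give a flavour of what hand-rolling looks like, here’s a spy for a hypothetical IMailSender collaborator (all names invented for illustration):

using System.Collections.Generic;

public interface IMailSender
{
    void Send(string address, string body);
}

// A hand-rolled spy: it records what was sent so that a test can
// inspect the interactions afterwards, with no framework involved.
public class MailSenderSpy : IMailSender
{
    public readonly List<string> SentAddresses = new List<string>();

    public void Send(string address, string body)
    {
        SentAddresses.Add(address);
    }
}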

Information Overload

Even when no additional frameworks are involved (i.e. when using vanilla xUnit), writing good unit tests involves a steep learning curve.  It’s easy to take a wrong turn and the quality of the tests written will improve only with experience/guidance.  I was not surprised when I read Roy Osherove’s blog and discovered that the majority of organisations’ attempts to embrace unit testing resulted in failure.

Mocking frameworks like Rhino Mocks are absolutely excellent tools, but it’s yet another thing to learn.  Suddenly the type of object (mock, strict mock, stub etc.) affects the result of the test.  It took me a week to get my head around it, so it doesn’t surprise me when I see newcomers totally abusing these frameworks.  Not only does this create a maintenance nightmare, but it sours their first taste of testing.

The problem is doubly hard to tackle, as Mocking frameworks allow you to write the same tests with fewer lines of code.  Developers who are new to testing see this and immediately try to use the mocking framework.  After all, only a fool would eschew such benefits, right?  Well, some people just ‘get it’ from the start.  I’m not one of these people and, in my experience, nor are most others.  Starting with interaction based testing and mocking frameworks is akin to throwing someone out of the back of a van that’s moving at 70mph and expecting them to start running when they hit the tarmac.  Chances are they’re going to land on their face.

The unfortunate result is that some folk tie themselves in knots.  They don’t understand the responsibilities of each type of testing object and, as a result, create extremely brittle or utterly pointless tests. If you have no idea of what you’re trying to achieve with a test, there is no point in writing it.  I once saw a question on StackOverflow featuring a confused fellow asking why his test did not work.  The poster was using Rhino Mocks and created a Mock of the class under test.  I.e. it was not a collaborator, it was the class under test and he was mocking it!  I suspect he was simply overwhelmed by trying to learn multiple new things.

Other things I’ve witnessed include developers writing obscure lambda expressions using c#, then chaining together Rhino Mocks methods to perform something which somehow works.  When I pointed out that I could barely infer its purpose by reading the code and that writing a hand rolled stub would probably be a better idea, I was met with “yes, you’re probably right but I want to use Rhino Mocks”.

Warning bells should also start ringing when you return a mock and assert that its method was called by another mock which returns a mock which… errrr!  It’s much easier to make a dog’s dinner of interaction-based testing; a good grounding in state-based testing is essential.

One step at a time

  1. Learn to sit up before you crawl.  Write simple xUnit tests that involve state-based testing.  It doesn’t have to be great, isolated code.  Even writing tests that involve scores of classes is a good way to start.  Finer granularity is something that comes with experience.
  2. Crawl before you walk.  Start to experiment and find better ways of testing pieces of functionality.  Ask yourself whether the test is useful, maintainable etc.  Will other parts of the system break it if they change?  Can you make the components and tests themselves finer grained?  This stage should be about developing your sense of what constitutes a good test.
  3. Walk before you run.  Begin to experiment with different types of test doubles, but hand roll them.  Yes, it’s painful at times, but it will give you a better understanding of roles in tests and the different types of test doubles, even if you don’t have names for them yet.  Furthermore, constantly having to update your hand rolled stubs when disparate parts of your class change will also give you an appreciation for the interface segregation principle.
  4. Finally, install a mocking framework and start sprinting.

If you do sprint head-first into a wall, you will be better equipped to understand where you went wrong, as you understand the fundamentals.  You will also have a better grasp of the terminology, as it will be grounded in real, tangible code you’ve written.

Getting up and running with Fluent NHibernate

I’ve been meaning to try out NHibernate for a good ol’ while.  It’s a long-established and respected O/R M library and one of the authors (Ayende) writes a blog that I’ve read for a long time.

Anyway, NHibernate is great, but its object => db mappings are a bit of a pain.  They are based on XML, which is verbose and fiddly to write, and the separation makes refactoring and testing mappings somewhat hard.  There are other ways to create mappings in code, such as via attributes, but this approach pollutes your business objects with DB specific code and still doesn’t help with the testing issue.  This, and the lack of LINQ to NHibernate, are the two main gripes I’ve heard about NHibernate.  The latter problem is getting solved for the next release.

A solution to the weakly typed mappings is already most of the way there.  Step forward Fluent NHibernate, which alleviates these problems by providing both convention based auto mappings and mappings created via strongly-typed code.  Lots of blogs and articles cover Fluent NHibernate; the point of this post is to point out a few gotchas that may occur when getting up and running.

Firstly, the Fluent NHibernate example project does not have all of the required assemblies when you try to run it.  If you check the InnerException message, it’s clear which assemblies are missing.  From memory, it’s one of the byte code .dlls.  Either set up an assembly reference or create a post-build step to copy it.

Moving on:

When writing my own Noddy sample application, I followed This Tutorial and, while it is good, it misses out a few things:

Gotcha #1:

If you use SQLite to run your application and/or test your mappings, the version of SQLite provided with Fluent NHibernate is an x86 assembly.  If you have a 64 bit OS and fail to build your project in x86 mode, you’ll get various obscure error messages (instead of a BadImageFormatException or whatever .NET usually throws).  The solution to this particular problem is to set the project(s) to build in x86 mode.

Gotcha #2:

The following line will also cause an exception (or at least it did on my PC — running 64 bit Windows 7):

Id(c => c.Id).GeneratedBy().HiLo("customer");

Again, it’s a very vague exception.  The article has someone seeking help for the same problem, but no solution.  I asked on StackOverflow and the solution is to either remove the trailing .GeneratedBy() fluent method calls, or replace the argument with something like HiLo("1000").
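In context (a sketch of the mapping; the Customer class is illustrative):

using FluentNHibernate.Mapping;

public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public class CustomerMap : ClassMap<Customer>
{
    public CustomerMap()
    {
        // Either let the default id generation take over...
        Id(c => c.Id);

        // ...or keep HiLo, but give it a numeric max-lo value:
        // Id(c => c.Id).GeneratedBy().HiLo("1000");

        Map(c => c.Name);
    }
}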

There may be implications for making such a change, but when you just want to get a Noddy application up and running so you can do a bit of fiddling about, it’ll do the job. :)

Invert logical statements to reduce nesting

As a test engineer, I spend a lot of my time reading (and making sense of) other people’s code.  I find it interesting that logically equivalent, re-arranged code can be much more easily understood.  Some of this follows on from the layout / style guide in the excellent Code Complete.  Perhaps I have unknowingly assimilated these idioms from reading, understanding and ultimately copying the layout of ‘good’ code.  Either way, they’re useful from a testing point of view, as it’s simpler to reason about code that doesn’t jump around so much.

As has been stated many times before, in general, the best programmers are the ones who program around the limitations of the human brain.  E.g. splitting a method into self documented sub methods, reducing nesting depth, reducing the number of exits from a loop, grouping related logical statements and so forth.

Along similar lines, here’s a very simple but effective one: something I like to call “flattening”.

Reduce nesting by inverting logical statements

It’s a fairly simple concept.  By flipping a logical test or statement, you change the layout of the code, but the result remains the same.  Compare the following two snippets:

Approach One
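A sketch of the nested style (Input, Result and DoWork are illustrative):

public Result Process(Input input)
{
    Result result = null;
    if (input != null)
    {
        if (input.IsValid)
        {
            // The real work is buried two levels deep, and the
            // error handling is pushed to the bottom of the method.
            result = DoWork(input);
        }
        else
        {
            throw new ArgumentException("input is not valid");
        }
    }
    else
    {
        throw new ArgumentNullException("input");
    }
    return result;
}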

Approach Two
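And the flattened equivalent:

public Result Process(Input input)
{
    // Check the invariants up front and bail out immediately...
    if (input == null)
    {
        throw new ArgumentNullException("input");
    }

    if (!input.IsValid)
    {
        throw new ArgumentException("input is not valid");
    }

    // ...so the real work sits, unindented, at the end.
    return DoWork(input);
}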

Even with these trivial examples, I think the first one is much more convoluted and harder to follow.  If invariants must hold at the start of a method, it makes much more sense to check them and return / return an error / throw an exception (as appropriate).  If multiple nested if statements are used from the start, the nesting depth in the method is increased by default.  Any further indentations compound the problem and reduce clarity.

I’ve seen fairly large methods (multiple pages) that suffered from this kind of problem and it made the code much harder to follow, especially when non-trivial work was done towards the end or a value was set earlier and returned later.  By the time you scroll to the end to deal with the problems, you’ve forgotten what came before.

By contrast, the second method is much flatter.  It basically reads like: “if this is invalid, return.  Done.  If the other one is invalid, return.  Done.  OK, we have valid inputs, so forget the validation phase and get started with the real work”.  It’s like removing a couple of juggling balls from the air — it removes a mental burden because you can simply ignore the initial checking logic.

You can also reduce nesting by introducing the continue keyword in loops, but there are caveats associated with that choice: lobbing a ‘continue’ half way into a loop body adds a different kind of complexity, even if it means reduced nesting depth.

I use ‘flattening’ in a lot of places, but there are occasions when it doesn’t make sense to flatten a method.  Sometimes it’s nicer to deal with the ‘standard’ case first, regardless of the extra nesting depth.

I.e. I like to use this a lot but, like most things, it shouldn’t be used indiscriminately.

Visual Studio 2008 ordering fail

Something I’ve just noticed is that Visual Studio falls far, far short when it comes to solution ordering.

Not only does it fail in an utterly abject fashion by ordering things arbitrarily, but it doesn’t even allow you to drag things around and order them manually.  The end result is a frustrating experience, especially when you’re trying to make your solution neat, tidy and easily browsable.

I don’t understand why this is the case, because when you move projects around they are ordered alphabetically just fine.  Save, close and then re-open the solution, however, and your projects will lose most of their ordering.

It’s akin to raking leaves into a neat pile then sitting back to admire your handy-work, only for Steve Ballmer to crash through your garden fence, scrape the rake across your shins and then kick leaves in your face while bellowing “WOOOOOOO YEAH!”.

Well not really, but still.  It’s annoying.

Success

Here’s one I just created.  It looks grand.  Everything is ordered just as it should be.

Success!

Failure

Here’s the same solution after it has been closed and re-opened.

Fail!

The ordering is FUBAR.  There’s an issue on Microsoft’s VS product feedback page about this and it was duly ignored by the looks of things :(

How can something so simple not just, well, work?  Anyone know of any workarounds to enforce ordering?

The big block method (binary search)

Have you ever been in this situation?  You have thousands of tests in scores of assemblies.  All of the tests pass.  However, when you run the test suite a second time without closing NUnit (or your test runner of choice), you find hundreds of failures occur in a specific area.  I’m not talking about in the same fixture or even the same assembly; this is NUnit wide.  Something is trashing the environment, but there are no obvious warning signs.

So, we have thousands of tests — the problem could be anywhere.  The answer is obviously not “look through all the tests” or “disable one project at a time”; there has to be an easier way…

Unrelated, but applicable

This just happened to me, but tracking down the culprit wasn’t as bad as you’d think.

Something I learned as a budding level designer (circa 1999) was how to find a leak in my level.  A leak in level design occurs when the world is not sealed from the void.  A decent analogy is to imagine the inside of the level is a submarine and the walls are the ‘hull’ — if there is a gap anywhere in the hull, the water will get in; it will leak.

A leak could be something as tiny and obscure as a 1 unit gap between a wall and a floor.  Most walls are 128 to 256 units, so a 1 unit gap is very small.  Even now, it’s not really feasible to find one in the editor unless you know exactly where it is.

Half-Life’s goldsrc engine was BSP based; the visibility computations were performed at map compile time.  A failure to build VIS data meant that your level caused the game to run at about 3 frames per second.

Unfortunately, tools were really… rudimentary back then.  These days pretty much every editor has built in pointfile loading (meaning it will take you directly to the leak!) but back then, you had to be creative.

The big block method

Back when tools weren’t so great, to find leaks in a level, I used the big block method.  It’s a very simple technique.  Say we have a rubbish, leaky level like so (top down view):

A leaky level, yesterday.

If one of those connections between walls/floors/ceilings/whatever is not tight, it will leak.  We cannot see the site of the leak using our eyes.  We cannot be sure where the leak is by simply scrutinising each wall joint or entity.  What we can do instead, though, is place a big block over ~50% of the level.

The red area is a newly created solid block, covering ~50% of the map.

If we compile and find that the leak has disappeared, we know that the leak was definitely in the area that is now covered by the block.  On the other hand, if the leak is still present, it’s in the other 50% of the level that remains uncovered.  To home in on the problem area, all we have to do is recursively add blocks to the problem area:

A smaller block, half the size of the previous one, has been added.

We then recompile and check to see if the leak has disappeared as before.  Notice that in two steps, we’ve narrowed down the problem’s location to an area 25% of the original size!  The next step will yield a further 12.5% reduction.  We quickly home in on the problem.

After I started programming, I realised that the leak-finding method I used as a level designer is a simple binary search.

Same thing, different discipline

Applying the same principle to finding problem tests or code is simple!  Divide and conquer.

Open the NUnit test project file and remove 50% of the projects (though in my case, I kept the assembly with the tests that failed, as I needed to see them fail on the second run to know the problem had occurred).  Run the tests twice to see if the failure occurs on repeated runs.  If they fail, you know your problem is in that group of assemblies.  If they pass, you know the problem is in the other half.

It’s then a case of whittling it down in the same way — disable a further 25% of your assemblies, run the tests twice and check the result.  Rinse, repeat.

Eventually you will (most likely!) be down to two assemblies — the assembly that exposes the problem and the problem itself.  If there’s a large number of tests and fixtures in the assembly you’re scrutinising, disable half of the fixtures and repeat the process.  You will rapidly converge on a fixture and, finally, the test that causes the problem.  From then on, it’s just standard debugging.

In my case, the culprit ended up being a single line of code calling into a method that has been a long-standing part of our code base.  It looks totally innocuous, and there is absolutely no way I’d have found it so quickly without dividing and conquering.

From the top level, with scores of assemblies and thousands of tests, it may as well be a 1 unit gap.

The Test Data Builder pattern with C# 3.0

Update

Since writing this, I’ve come to prefer the original method chaining style.  While the property initialiser style works very well for simple builders, the original method chaining style works a bit better if you have nested builders.  See the link in this post for the original; it’s more versatile.

The problem

If you write automated tests, then you are bound to have come across a situation where, during the course of testing your classes, you have to configure them in different ways.  This difference may be slight (say, varying one argument passed into a constructor every time), but it is enough to make instantiating the objects a pain.  You can’t use a common setup method to make life easier, so that leaves you with two obvious options:

  1. Instantiate the objects via a direct constructor call
  2. Abstract the creation a little and use multiple factory methods

Drawbacks

The first option is verbose, cumbersome and brittle.  Every time the constructor changes or its rules change (e.g. “parameter string houseName must not contain spaces”), you’ll need to update the tests to use the new form.  If you have scores of calls, writing and maintaining the tests can consume a lot of time, plus it’s tedious.

The second option may seem appealing, but it also has shortcomings.  This is the so called “Object Mother” pattern.  You basically have some factory methods (or a test class) that return objects in various configurations.  When you have method names like “CreateHouseWithKitchenSinkAndStairs”, “CreateHouseWithKitchenSinkAndRoof” and “CreateHouseWithRoof” (or equivalent overloaded methods) then you’ll recognise this.

Depending on how much method re-use you can share between tests, this may also be brittle and a lot of effort to maintain.  Both ways of doing things also lack clarity when lots of arguments are being passed around — the intent of the test (read: the interesting argument[s]) can be hidden by the noise of setting up safe defaults.  E.g. if you have 10 ‘normal’ objects and one badly formatted string that you’re using to make sure that an exception occurs, you want to focus on the argument that will cause the exception, not the other parts.

Test Data Builders

So… what is this mythical Test Data Builder pattern?  It’s actually quite simple, though the syntax can look strange if you haven’t come across fluent interfaces before.

A Test Data Builder is an extension of a standard factory method.  The difference between a standard factory and a builder is the way that the builder wires things up.  A factory typically has a “Create” call which creates an instance of an object (or objects) in pre-defined configurations, or with some customisability provided via arguments in the Create call, or injected into the factory’s constructor.

A TDB varies this approach by setting up default, safe values to be used to instantiate the class under test.  When the builder’s constructor runs, it creates these defaults and stores them as fields.  There is usually a one to one correspondence between the class under test’s parameter list and the values stored in the builder.

The cool part is that anyone using the TDB can selectively modify these values by calling methods or setting properties on the TDB.  Finally, when the TDB has been configured satisfactorily for the test, its Build method is called, injecting its fields into the class under test’s constructor as arguments.

Nat Pryce posted an excellent article on Test Data Builders, so I would implore you to read it before reading the rest of this post.

Terser is more gooder – C# 3.0

Once I’d got my head around the strange fluent interface syntax, I found that example to be invaluable.  It got me thinking: what if we could write these builders in an even more concise fashion?  Well, we can.  C# 3.0 can go one step further and, through the use of object initialisers, eliminate the need for method chaining.  This results in terser syntax still.

It’s simple: instead of using a method per value change, we use a property instead.  Since object initialisers are run immediately after a constructor and are nicely grouped, it means we can create and configure our builder in one step.

The following code is provided merely to illustrate the syntax (there’s not much point in using a TDB on a class with so few parameters).  Here’s an overly simple example of a Person class to test:
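A sketch along those lines (the property names are illustrative):

using System;

public class Person
{
    public Person(string name, int age, string postcode)
    {
        if (string.IsNullOrEmpty(name))
        {
            throw new ArgumentException("name must be supplied");
        }

        Name = name;
        Age = age;
        Postcode = postcode;
    }

    public string Name { get; private set; }
    public int Age { get; private set; }
    public string Postcode { get; private set; }
}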

Here’s the builder.  Notice that it has 3 auto properties that match the Person class’s constructor parameter names and types.  Default values are instantiated in the constructor, and the properties allow test writers to swap out the defaults for their own specific values.
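Again, a sketch:

public class PersonBuilder
{
    // Auto properties mirror Person's constructor parameters.
    public string Name { get; set; }
    public int Age { get; set; }
    public string Postcode { get; set; }

    public PersonBuilder()
    {
        // Safe defaults; tests override only the values they care about.
        Name = "Default Name";
        Age = 30;
        Postcode = "AB1 2CD";
    }

    public Person Build()
    {
        return new Person(Name, Age, Postcode);
    }
}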

Finally, here’s the test code itself.  Notice that the only arguments that are varied are the ones that are named in the test.  If the builder’s property is not set, the defaults held in the builder are used to construct the instance of the class under test.
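For example (NUnit-style; the CanVote property is deliberately not defined above, as noted below):

using System;
using NUnit.Framework;

[TestFixture]
public class PersonTests
{
    [Test]
    public void Person_of_voting_age_can_vote()
    {
        // Only Age is interesting here; everything else uses the
        // builder's safe defaults.
        Person person = new PersonBuilder { Age = 18 }.Build();

        Assert.That(person.CanVote, Is.True);
    }

    [Test]
    public void Empty_name_is_rejected()
    {
        Assert.Throws<ArgumentException>(
            () => new PersonBuilder { Name = string.Empty }.Build());
    }
}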

If any of this is unclear, just paste the code into your IDE and step through it in your debugger to see what’s happening (you’ll have to remove the non-existent methods & exception test, obviously!)

Summary

I’ve found the following advantages when using this pattern in tests with large amounts of setup and little commonality in object construction:

  1. Less fragile.  Since object creation is centralised and uses safe values by default, tests are more resistant to change compared to direct constructor calls.
  2. Terser test code (extremely important when setup is complicated).
  3. The intent of the test is clearer.  There is less noise in test code, so you can more easily pick out the arguments/interactions of interest.

Testing gotchas – c# Weak References

If you ever have to test a class that uses a WeakReference, or even just have to use Weak References, be very careful.  Numerous strange-looking things can occur when Weak References are involved.

If you have even a cursory understanding of the .NET Garbage Collector (GC), you will know that it keeps track of objects.  When an object is no longer strongly referenced, the GC will potentially collect it, freeing up its resources.  This causes the object to ‘disappear’.  So, if you have a strong reference to an object in your program, you are generally safe in the assumption that the object will stick around.  The GC won’t pull the carpet from under you while you’re using that object.

Weak References, on the other hand, do not stop the GC from collecting the object they refer to.  In certain circumstances, it can be advantageous to use Weak References because you do want to use/observe/whatever an object, but you don’t want to stop it from being collected.  So far so good.  All obvious stuff.

The GC moves in mysterious ways

OK you say, what’s the point of this article?  The point is that the GC is extremely clever.  Almost a little too clever.  So clever it may aggressively collect objects to the point where it can mess with your head, and your tests.  This ‘problem’ can manifest itself in subtle ways — certain types of tests involving Weak References will usually pass in debug mode, but may sporadically fail in release mode.  Heads will be scratched.  Bemused gurning will commence.

Obligatory Contrived Example
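Something along these lines (a reconstruction; GuyWhoUsesWeakRef wraps a WeakReference around whatever it is given):

using System;
using System.Text;
using NUnit.Framework;

public class GuyWhoUsesWeakRef
{
    private readonly WeakReference weakRef;

    public GuyWhoUsesWeakRef(object target)
    {
        // Deliberately holds only a weak reference; no strong
        // reference to the target is kept anywhere in this class.
        weakRef = new WeakReference(target);
    }

    public bool IsAlive
    {
        get { return weakRef.IsAlive; }
    }
}

[TestFixture]
public class WeakRefTests
{
    [Test]
    public void Referenced_object_should_be_alive()
    {
        var builder = new StringBuilder("hello");
        var guy = new GuyWhoUsesWeakRef(builder);

        // 'builder' is never used after this point, so the GC is
        // free to collect the StringBuilder at any moment...
        Assert.That(guy.IsAlive, Is.True);
    }
}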

Why does this occasionally fail?

If you look carefully, you may spot the problem.  Recall that I said that objects can be collected while still in scope.  After the builder reference has been passed to the GuyWhoUsesWeakRef constructor, it is no longer used anywhere.  The GuyWhoUsesWeakRef class doesn’t take a strong reference, so the moment the parameter is no longer used, that reference also gets discarded.

As a result, immediately after the new GuyWhoUsesWeakRef(builder) call, the GC figures out that the StringBuilder object we’ve created will never be used again.  After all, if the object is never used again, why not collect it as soon as possible?

In debug mode, this won’t throw a spanner in the works.  The test will pass because the GC is not aggressively collecting.  However, in release mode, the GC may well collect the StringBuilder when we fully expect it to still be alive for our Assert.That() call.

The main problem is that it won’t happen every time.  The GC is non-deterministic, so this test will pass and fail intermittently; it depends on the timing of the collection.  Coming from C++, where objects are destroyed as they exit scope, I found this somewhat bemusing.  In the context of the GC, it makes sense, though.  You just have to be careful.

The solution

The good folks at Microsoft provided a very simple static method call to solve this particular problem; enter GC.KeepAlive.  Placing a call to GC.KeepAlive(builder) at the end of this test method will ensure that the object we’re referring to will not be collected until after the GC.KeepAlive call has been made.  Problem solved.
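Applied to the test above:

[Test]
public void Referenced_object_should_be_alive()
{
    var builder = new StringBuilder("hello");
    var guy = new GuyWhoUsesWeakRef(builder);

    Assert.That(guy.IsAlive, Is.True);

    // Keeps 'builder' reachable until this point, so the GC cannot
    // collect the StringBuilder before the assertion has run.
    GC.KeepAlive(builder);
}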