c#

Castle DynamicProxy2 quirks

Posted in c#, gotchas, patterns, software on May 15th, 2010 by Mark Simpson

I’ve been faffing around with Castle.DynamicProxy2 a bit lately and it’s a pretty interesting bit of kit.  Castle Dynamic Proxy (CDP) allows you to dynamically generate proxies at runtime to weave aspects of behaviour into existing types.  Aspect oriented programming is typically employed for implementing crosscutting concerns such as logging, performance measuring, raising INotifyPropertyChanged and various other types of repetitive and/or orthogonal concerns.  I’m a newbie to this stuff so I won’t say much more on AOP.

While I really like CDP, I’ve found that the documentation and tutorials (the best of which is Krzysztof Koźmic’s excellent tutorial series) aren’t particularly explicit on how CDP achieves its effects, and sometimes these details are important.

There are two main ways of creating proxies that most developers will encounter.

CreateClassProxy

This is nearly always the first demonstrated method in tutorials.  ProxyGenerator.CreateClassProxy dynamically subclasses the target class, so if you have a class named Pogo and you call ProxyGenerator.CreateClassProxy<Pogo>(), what you’ll get back is an instance of a subclass of Pogo (i.e. the new type is-a Pogo) that weaves in the interception behaviour by overriding methods.  This is why methods and properties must be virtual if they are to be intercepted.

With class-based proxies, you cannot intercept non-virtual methods because a subclass cannot override them and, unlike Java, C# does not make methods virtual by default.  If you try to intercept a non-virtual method, nothing will happen, though mechanisms do exist to allow you to identify these situations and warn the developer (the most common example of this is that NHibernate will die on its arse if you try to use lazy loading with a non-virtual member).

CreateInterfaceProxyWithTarget

The second method is ProxyGenerator.CreateInterfaceProxyWithTarget, and it is the primary reason for writing this blog post!  CreateInterfaceProxyWithTarget does not dynamically subclass target types; it simply creates a dynamically generated class that implements the same target interface and then passes calls through to the target.  I.e. it’s an implementation of the decorator pattern.  This has two effects, one of which is very important!

  1. You don’t need to mark your methods/properties as virtual
  2. Since it is a proxy working via decoration rather than subclassing, for the effects of the interceptor to be applied, all calls must be made on the proxy instance.  Think about it.

The most salient point is #2.  I’ll elaborate.

A worked example: Rocky Balboa

Say you have an interface called IBoxer like this:

… and you implement it like this:

If you then turn to aspect oriented programming and decide to gather statistics on punches thrown for the duration of a boxing round, it’s a reasonable assumption that you can simply proxy the IBoxer interface and intercept only the StraightLeft/StraightRight punch calls, tally them up and report the metrics (whether this is a good idea is beside the point; it’s a contrived example).  On the face of it this isn’t a horrible idea.  However, it won’t work as expected.

The key here is that OneTwo() calls through to StraightLeft() and StraightRight().  Once the proxy has delegated to the decorated type it loses the ability to intercept the calls.  We can follow the call graph easily enough.  We have a reference to the proxy via the IBoxer interface.   We call “OneTwo()” on it and when the invocation proceeds, it delegates to the decorated Rocky instance.  The Rocky instance then calls StraightLeft(), StraightRight().  Both of these calls will immediately go straight to the ‘real’ implementation, bypassing the proxy.

Just as with the normal decorator pattern, the decorator (the IBoxer dynamic proxy in this case) loses its influence when the decorated object calls methods declared in its own public interface.  In this particular situation we could write an interceptor that knows to add two punches when OneTwo() is called on the proxy, but compare this approach to one using a class-based proxy.  If we were using a class proxy we could rest safe in the knowledge that all calls to StraightLeft() and StraightRight() will always be intercepted (provided they are virtual), as the extra stuff we’ve bolted on resides in a method override.
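To make the difference concrete, here’s a minimal sketch with no Castle involved at all.  CountingDecorator and CountingSubclass are hand-written stand-ins (my own names, not anything CDP generates) for roughly what the interface proxy and the class proxy do, and IBoxer/Rocky are written to match the description above, so treat the details as assumptions.

```csharp
public interface IBoxer
{
    void StraightLeft();
    void StraightRight();
    void OneTwo();
}

public class Rocky : IBoxer
{
    // Virtual so that a subclass-style proxy is able to override them.
    public virtual void StraightLeft() { }
    public virtual void StraightRight() { }

    // Calls its own methods directly; a wrapper around Rocky never sees these calls.
    public virtual void OneTwo()
    {
        StraightLeft();
        StraightRight();
    }
}

// Stand-in for an interface proxy with target: a plain decorator.
public class CountingDecorator : IBoxer
{
    private readonly IBoxer target;
    public int Punches { get; private set; }

    public CountingDecorator(IBoxer target) { this.target = target; }

    public void StraightLeft() { Punches++; target.StraightLeft(); }
    public void StraightRight() { Punches++; target.StraightRight(); }

    // Delegates to the target; the punches thrown inside bypass the decorator.
    public void OneTwo() { target.OneTwo(); }
}

// Stand-in for a class proxy: a subclass that overrides the virtual methods.
public class CountingSubclass : Rocky
{
    public int Punches { get; private set; }

    public override void StraightLeft() { Punches++; base.StraightLeft(); }
    public override void StraightRight() { Punches++; base.StraightRight(); }
}

public static class PunchCountDemo
{
    public static int CountViaDecorator()
    {
        var proxy = new CountingDecorator(new Rocky());
        proxy.OneTwo();
        return proxy.Punches;
    }

    public static int CountViaSubclass()
    {
        var proxy = new CountingSubclass();
        proxy.OneTwo();
        return proxy.Punches;
    }
}
```

CountViaDecorator() comes back with 0 because both punches are thrown inside Rocky, out of the decorator’s sight, whereas CountViaSubclass() comes back with 2 because OneTwo() dispatches virtually to the overrides.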

The results vary depending on the type of proxy we generate and the way the types are written. In hindsight it’s pretty obvious, but it still caught me out.

Data-driven testing tricks

Posted in Uncategorized, c#, patterns, testing, tips on May 8th, 2010 by Mark Simpson

It’s a fairly common occurrence — somebody wants to use NUnit’s data driven testing, but they want to vary either the action under test, or the expectation.  I.e. they’re not parametrising simple data, they’re parametrising the actions.

You cannot encode these things via normal data-driven testing (short of doing really nasty things like passing string names of methods to be invoked or using enums and a dictionary of methods) and even if you use a hackish workaround, it’s unlikely to be flexible or terse.

Test readability is paramount, so if you have some tests written in an unfamiliar style, it is very important to express the intent clearly, too.

NUnit’s data-driven testing

NUnit uses a few mechanisms to parametrise tests.  Firstly, for simple test cases, it offers the [TestCase] attribute which takes a params object[] array in its constructor.  Each argument passed to the TestCaseAttribute’s constructor is stored, ready for retrieval by the framework.  NUnit does the heavy lifting for us and casts/converts each argument to the test method’s parameter types.  Here’s an example where three ints are passed, then correctly mapped to a test method:
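A minimal sketch of that kind of test, assuming NUnit is referenced; the fixture and method names are made up purely for illustration:

```csharp
using NUnit.Framework;

[TestFixture]
public class AdditionTests
{
    // Each attribute is one test case; the three values map onto
    // the method parameters in order: a, b, expected.
    [TestCase(1, 2, 3)]
    [TestCase(2, 2, 4)]
    [TestCase(-1, 1, 0)]
    public void Add_TwoInts_ReturnsSum(int a, int b, int expected)
    {
        Assert.That(a + b, Is.EqualTo(expected));
    }
}
```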

The main limitation here is that we can only pass compile-time constants: strings, ints, shorts, bools etc.  We can’t new up classes or structs because .NET requires attribute arguments to be constant expressions (or typeof expressions, or arrays of those).  How the devil do we do something more complicated?

Passing more complicated types

It would appear we’re screwed, but fortunately, we can use the [TestCaseSource] attribute.  There are numerous options for yielding the data, and one of them is to define a public method on your test class that returns an IEnumerable<TestCaseData> (it works if it’s private, but since it’s accessed via reflection it’s a good idea to keep it public so that ReSharper or other tools do not flag it as unused).  You can then fill up and yield individual TestCaseData instances in the same fashion as before.  Once again, NUnit does the mapping and the heavy lifting for us.

If you do not require any of the fancy SetDescription, ExpectedException etc. stuff associated with the TestCaseData type, you can skip one piece of ceremony by simply yielding your own arbitrary type instead (i.e. change the IEnumerable<TestCaseData> to IEnumerable<MyType> and then simply yield return new MyType()).

Passing a delegate as a parameter (simple)

The simplest case is that you want to vary which methods are called.  For example, if you have multiple types implementing the same interface or multiple static methods, encoding which method to call is very simple.

Here’s an example from Stack Overflow that I answered recently where the author wanted to call one of three different static methods, each with the same signature and asserts.  The solution was to examine the method signature of the call and then use the appropriate Func<> type (Funcs and Actions are convenience delegates provided by the .NET framework).  It was then easy to parametrise the test by passing in delegates targeting the appropriate methods.
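Something along these lines, assuming NUnit is referenced.  The Doubler class and its three methods are hypothetical stand-ins for the methods in the question, not the originals:

```csharp
using System;
using System.Collections.Generic;
using NUnit.Framework;

// Hypothetical static methods that all share the signature int -> int.
public static class Doubler
{
    public static int ByAdding(int x) { return x + x; }
    public static int ByMultiplying(int x) { return x * 2; }
    public static int ByShifting(int x) { return x << 1; }
}

[TestFixture]
public class DoublerTests
{
    // Static, because some NUnit versions insist on static sources.
    // Each case carries a delegate rather than plain data.
    public static IEnumerable<TestCaseData> DoublingMethods()
    {
        yield return new TestCaseData(new Func<int, int>(Doubler.ByAdding)).SetName("ByAdding");
        yield return new TestCaseData(new Func<int, int>(Doubler.ByMultiplying)).SetName("ByMultiplying");
        yield return new TestCaseData(new Func<int, int>(Doubler.ByShifting)).SetName("ByShifting");
    }

    // The test body and the assert stay constant; only the method under test varies.
    [TestCaseSource("DoublingMethods")]
    public void Double_WithTwentyOne_ReturnsFortyTwo(Func<int, int> doubler)
    {
        Assert.That(doubler(21), Is.EqualTo(42));
    }
}
```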

More advanced applications

Beyond calling simple, stateless methods via delegates or passing non-intrinsic types, you can do a lot of creative and cool stuff.  For example, you could new up an instance of a type T in the test body and pass in an Action<T> to call.  The test body would create an instance of type T, then apply the action to it.  You can even go as far as expressing Act/Assert pairs via a combination of Actions and mocking frameworks.  E.g. you could say “when I call method X on the controller, I expect method Y on the model to be called”, and so forth.
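As a rough sketch of the Action<T> idea (again assuming NUnit; Stack<int> is just a convenient guinea pig and the names are mine), the test body news up the instance and each case supplies the action plus the expected outcome:

```csharp
using System;
using System.Collections.Generic;
using NUnit.Framework;

[TestFixture]
public class StackActionTests
{
    // Each case pairs an action to apply with the count we expect afterwards.
    public static IEnumerable<TestCaseData> StackActions()
    {
        yield return new TestCaseData(
            new Action<Stack<int>>(s => s.Push(1)), 1).SetName("Push");
        yield return new TestCaseData(
            new Action<Stack<int>>(s => { s.Push(1); s.Push(2); s.Pop(); }), 1).SetName("PushPushPop");
        yield return new TestCaseData(
            new Action<Stack<int>>(s => s.Clear()), 0).SetName("Clear");
    }

    [TestCaseSource("StackActions")]
    public void Action_OnEmptyStack_LeavesExpectedCount(Action<Stack<int>> act, int expectedCount)
    {
        // Arrange: the test owns the instance, not the test case.
        var stack = new Stack<int>();

        // Act: apply whatever the test case handed us.
        act(stack);

        Assert.That(stack.Count, Is.EqualTo(expectedCount));
    }
}
```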

The caveat is that as you use more and more ‘creative’ types of data-driven testing, it gets less and less readable for other programmers.  Always keep checking what you’re doing and determine whether there is a better way to implement the type of testing you’re doing.  It’s easy to get carried away when applying new techniques, but it’s often the case that a more verbose but familiar pattern is a better choice.

More readable data-driven tests

Posted in c#, testing, tips on February 13th, 2010 by Mark Simpson

When the logic of a test method remains constant but the data varies, data-driven testing is a great tool.  It allows you, the test author, to write compact code and to add new test cases rapidly.  Unfortunately, data-driven tests have a disadvantage: The inputs are often less readable.

A simple example

Let’s take an example; testing Rob Conery’s PagedList implementation.  A page is basically a slice of the data returned by a linq query.  If more data exists beyond the ‘slice’ represented by the PagedList<T> instance, its “HasNextPage” property should return true to indicate that it is available.  Now, suppose we want to test whether a particular page has a next available page.  Three things spring to mind that can influence the result: The page size, the current page index and the number of items in the list.

Here’s a quick data-driven test for HasNextPage:

As you can see, the method itself is readable, but the parametrised values fed in via the [TestCase] attribute are not.  It’s really hard to keep everything in your head and remember what each number maps to in the function, especially when the method parameter types are all identical.  If you have a list of 20 [TestCase] attributes, you start wondering what’s for your tea and forget that the second value is the (checks image) page size.  Mince.  Mince for tea.

Hmm, if only we could make those TestCases more readable; something like object initializers would be ideal.

A simple trick: Subclass TestCase

My friend Hughel helped me come up with this one and it works quite well.  Simply subclass the TestCaseAttribute class and add your own properties to represent the test parameters.  It gets a little bit hairy when you have to access the Arguments array directly (especially since Attributes can be weird), but in practice, it works fine.   In most of the tests, we’re only interested in parametrising three things, so it’s simple to add them as properties.

The end result

Finally, we apply these attributes to our data-driven test, significantly improving the readability!
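Here’s a sketch of how the whole thing might hang together, assuming an NUnit version where TestCaseAttribute can be subclassed and exposes its Arguments array.  PageTestCaseAttribute, its property names and the Paging helper are all my own inventions for illustration; in particular, Paging is a stand-in with a zero-based page index, not Rob Conery’s PagedList.

```csharp
using System;
using NUnit.Framework;

// Hypothetical stand-in for the paging logic under test.
public static class Paging
{
    public static bool HasNextPage(int pageIndex, int pageSize, int totalCount)
    {
        return (pageIndex + 1) * pageSize < totalCount;
    }
}

// Named properties write straight into the positional Arguments array,
// so each value is labelled at the point where the test case is declared.
[AttributeUsage(AttributeTargets.Method, AllowMultiple = true)]
public class PageTestCaseAttribute : TestCaseAttribute
{
    // Reserve four slots: page index, page size, total count, expected result.
    public PageTestCaseAttribute() : base(0, 0, 0, false) { }

    public int PageIndex
    {
        get { return (int)Arguments[0]; }
        set { Arguments[0] = value; }
    }

    public int PageSize
    {
        get { return (int)Arguments[1]; }
        set { Arguments[1] = value; }
    }

    public int TotalCount
    {
        get { return (int)Arguments[2]; }
        set { Arguments[2] = value; }
    }

    public bool HasNext
    {
        get { return (bool)Arguments[3]; }
        set { Arguments[3] = value; }
    }
}

[TestFixture]
public class HasNextPageTests
{
    // Compare with the positional form, e.g. [TestCase(0, 10, 25, true)].
    [PageTestCase(PageIndex = 0, PageSize = 10, TotalCount = 25, HasNext = true)]
    [PageTestCase(PageIndex = 2, PageSize = 10, TotalCount = 25, HasNext = false)]
    public void HasNextPage_ForGivenPage_MatchesExpectation(
        int pageIndex, int pageSize, int totalCount, bool expected)
    {
        Assert.That(Paging.HasNextPage(pageIndex, pageSize, totalCount), Is.EqualTo(expected));
    }
}
```

The order of the slots in the constructor has to match the order of the test method’s parameters, which is the hairy bit; everything else is just properties.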

I would hasten to add that I don’t recommend using this approach willy-nilly; it only pays off when you have a large number of tests that are parametrised by the same data types, causing the test cases to become hard to follow.  I’ve used the [TestCaseSource] attribute and the [TestCase] attribute a lot in the past and most of the time it’s not a problem.

Making sense of your codebase : NDepend

Posted in c#, software on January 18th, 2010 by Mark Simpson

I was given an NDepend license to play around with (thanks to Patrick) and said I’d blog about it.  Apologies for my tardiness!

What is NDepend?

NDepend is a piece of software that allows developers to analyse and visualise their code in many interesting ways.  Here’s an ancient relic from my hard drive, revived in NDepend:

As you can see, there’s quite a lot of functionality included.  NDepend includes features that allow you to trawl your code for quality problems, identify architectural constraints that have been breached, find untested, complicated code or pinpoint changes between builds and much more.  This post will merely scratch the surface, but will hopefully provide some decent information on what NDepend can do for you and your team.

NDepend works with Visual Studio solution files and integrates with Visual Studio, meaning it’s simple to generate an NDepend project based on a solution.  You can then press the analyse button and watch NDepend do its thing — it generates a nice HTML report with various visualisations and statistics.

CQL all the way

NDepend provides a lot of pre-defined metrics out of the box, but its best feature by far (and the one on which most of its functionality is built) is CQL, the Code Query Language.  This SQL / Linq-esque language allows you to search, filter and order data in a simple fashion.  Once you get accustomed to it, it is simple to start bashing out meaningful queries.

You enter queries into a command window and the results are displayed instantly.  For example, you could write something like this:

At the time of writing, 82 separate metrics are available; the metrics are grouped by categories including: assembly, namespace, method etc.  If you have an idea, then you can probably express that idea via CQL.   Furthermore, your queries can be saved and reused/applied to projects as required.

To write a query, you just tap it into the query editor:

Don’t be Paralysed by Choice

While it is a hugely powerful program, I found it overwhelming to begin with.  The default analysis of your code will result in a verbose report including every metric under the sun; some of these are unarguably very useful (Cyclomatic complexity, IL nesting depth) whereas others have a more narrow purpose (suggesting which attributes should be sealed, boxing and unboxing warnings and so on).

Due to the mammoth amount of information in the report, it’s akin to pulling the FxCop lever for the first time.  You get metrics-shotgunned in the face and potentially paralysed by choice.  As a result, I chose to view the default report as a rough guide for how to use the CQL query language and to give me ideas of how I could form queries that were tailored to my needs.  There’s an excellent placemat PDF available for download, too.

I would encourage first time users to skim the main report for interesting nuggets then immediately begin to play around with the CQL query language :).  It’s really cool to think, “I wonder if…”, type a few words and immediately see the question answered.

Manual trawling versus NDepend

Back when I was a Test Engineer, I was assigned to look at an unfamiliar part of our codebase.  I decided to do a little experiment using NDepend to determine its effectiveness (we have a few licenses at work too and are looking to integrate it into our build process at some stage soon).  The test was simply this:  I would work through the code manually checking each file/type for problems, then I would do a sweep using NDepend to see if it could pick out the same problems (and potentially some others I’d missed).

The sort of things I was looking for included:

  • Big balls of mud / God classes
  • Types with the most complicated functionality (high cyclomatic complexity / code nesting level)
  • Types that are heavily used (‘arteries’ of the codebase — failure would be more costly)
  • Types that have poor coverage and high complexity

The results were good.  The majority of the problems I had identified during a manual sweep showed up near the top of the query results. It took me a few hours to manually trawl a relatively small portion of the codebase.  It took me about an hour to get to grips with the NDepend basics, write some CQL and make sense of the results.  I’d say that’s very good going considering I hadn’t used it before.

One thing to watch out for is that some manual tweaking is often required.  If you have some ugly utility classes or use open source software in source form (such as the command line argument parser class that is ubiquitous at work), these sorts of things monopolise the results.  To get around this problem you can ignore these types, either by applying attributes to your types in code, or by adding an explicit exclude via your query (some examples of which are included below).

Sample Queries

Here are a few examples of how to express the aforementioned concerns as queries.  Note: I’m an NDepend newbie, so there are probably better ways to do this.

Types that have so many lines of code that it makes your head spin

Complicated nesting

Complicated methods

High complexity with poor test coverage

‘Popular’ types with poor test coverage

None of these are ‘hard’ rules — you have to play around with them while browsing your codebase (which is easy as NDepend’s CQL editor is constantly updating as you enter the query).  You may find that one project has really simple code that doesn’t even register on a solution-wide analysis, whereas another project may hog the results pane.  If you have a specific goal in mind, you can iteratively tailor the query to get what you want :).

The role of NDepend?

NDepend is not a silver bullet and doesn’t claim to be — it’s a complementary tool that can be used in addition to buddy systems, code reviews and so forth.  Having said that, the amount of information you can mine from your codebase is pretty impressive.

In terms of practical usage, we plan to integrate it into our build system to aid us in identifying potential problems, including code changes not backed by tests, architectural rules (project x is not allowed to reference project y) and general rules of thumb that should be respected (cyclomatic complexity, nesting depth and so forth).

In short, it’s well worth checking out NDepend.

AutoMapper and Test Data Builders

Posted in c#, patterns, testing, tips on January 11th, 2010 by Mark Simpson

I’ve recently been tinkering with WCF and, as many people already know, writing data transfer objects is a pain in the balls.  Nobody likes writing repetitive, duplicate and tedious code, so I was delighted when I read about AutoMapper.  It works really nicely;  with convention over configuration, you can bang out the entity => data transfer code in no time, the conversions are less error prone, the tests stay in sync and you’re left to concentrate on more important things.

Anyway, I immediately realised that I’ve used the same pattern in testing — with property initializers & test data builders.  I’ve posted before about Test Data Builders and I’d recommend you read that post first.

For small test data builder classes, it’s really not that big a deal.  For larger classes, using AutoMapper is quite useful.  For example, for testing purposes we’ve got an exception details class that is sent over to an exception logging service.

Every time the app dies, we create an exception data transfer object, fill it out and then send it over the wire.  When unit testing the service, I use a Test Data Builder to create the exception report so that I can vary its properties easily.  Guess what?  The test data builder’s properties map 1:1 with the exception report — hmm!

So, rather than create the same boilerplate code to map the 10+ properties on the exception builder => data transfer object, I just used AutoMapper to handle the mapping for me :)

using AutoMapper;

public class ExceptionReportDto
{
    public string ExceptionType { get; set; }
    public string StackTrace { get; set; }
    public string AssemblyName { get; set; }
    public string EntryPoint { get; set; }
    public string UserName { get; set; }
    public string MachineName { get; set; }
    // etc
}

public class ExceptionReportBuilder
{
    public string ExceptionType { get; set; }
    public string StackTrace { get; set; }
    public string AssemblyName { get; set; }
    public string EntryPoint { get; set; }
    public string UserName { get; set; }
    public string MachineName { get; set; }
    // etc

    // create the mapping when the static ctor is invoked
    static ExceptionReportBuilder()
    {
        Mapper.CreateMap<ExceptionReportBuilder, ExceptionReportDto>();
    }

    public ExceptionReportBuilder()
    {
        // set up defaults
        ExceptionType = "System.ArgumentException";
        StackTrace = "Oh no I am a stack trace";
        //etc.
    }

    public ExceptionReportDto Build()
    {
        // go go automagic! the property names match 1:1, so no extra config is needed
        return Mapper.Map<ExceptionReportBuilder, ExceptionReportDto>(this);
    }
}

I’ve had good results with this approach.  The only bit I’m remotely concerned about is creating the mapping in the static constructor.  Any AutoMapper gurus out there who can say whether there’s any reason I shouldn’t do that?

Avoiding the file system

Posted in c#, patterns, testing on November 26th, 2009 by Mark Simpson

Going from experience and, as illustrated by Misko’s recent presentation, the more dependencies you have on your environment, the less trustworthy and maintainable your tests become.  One of the foremost offenders in this area is touching the file system.

Any time someone says “hey, I’m trying to open a file in a unit test…”, my first reaction is to say “woah”, and not in the “I know Kung Fu” way!  If you introduce a dependency on the file system, bad things are more likely to happen.  You now depend on something that may not be there/accessible/consistent etc.  Ever written a test that tried to access a common file, or read a file that something else may write to?  It’s horrible.

It is for these reasons that many folk will say “it’s not a unit test if it hits the file system”.  In particular, if you have a TestFixtureSetUp/TearDown method that deletes a file, it’s a sure sign that the fixture is going to flake out at some point.

A real example

Recently at work, my colleagues have experienced the joy of a huge refactoring job pertaining to restructuring our projects/solutions to reduce build times and increase productivity.  This work included maintaining something close to ten thousand tests.

As the job progressed, they kept finding that some test fixtures did not live in isolation.  The tests depended on various things they shouldn’t have and, most saliently, file dependencies proved to be a total pain in the balls.  Everything built OK, but when run, the tests failed due to missing files.  Paths and file attributes had to be checked (Copy if newer, etc.), lost files had to be hunted down and so forth.  It’s hassle that people don’t need!  As I’ve stated before, when it comes to testing, maintenance is king.

Anyway, if you have a hard dependency on the file system, consider the alternatives.  This is never a hard rule as it is not suitable for all uses, but it is always worth thinking about.

Alternative approaches

Firstly, abstract the file operations to some degree.  You can do numerous things here, from changing the internal loading strategy (via Dependency Injection) to — even better — separating the loading/use of the file so that the consumer of the file’s contents doesn’t even have to care about the loading strategy.

Once you’ve done this, you no longer need to use files in your unit tests, as you can use plain ol’ strings, streams or even just directly construct instances to represent the contents of the file.

Say we had a simple class called “MyDocument”, and MyDocument could be loaded from a .doc file on disk.  The simplest approach would be to do something like this:

Approach One

 // #1: this is tightly coupled to file system.  
 void SimpleLoading()
 {
    var simpleDocument = new MyDocument("filenameToLoad.doc");
 }

Depending on your needs and the demands of the user, this may be OK.  However, to test MyDocument’s methods/properties properly, we need access to the file system to instantiate it.  If our tests are to be robust & fast, we need something better.  Here’s something that’s testable:

Approach Two

 // #2: doc calls DocumentLoader's methods to get data needed
 void SlightlyImprovedAndTestable()
 {
    // this type implements IDocumentLoader
    var docLoaderStrategy = new FileDocumentLoader("filenameToLoad.doc");

    // Uses DI; ctor calls doc loader's methods to construct itself
    var doc = new MyDocument(docLoaderStrategy);
 }

From a testability standpoint, this is slightly better, as we can now feed in an IDocumentLoader instance which the MyDocument constructor uses to get the data.

On the flipside, the MyDocument type now needs to know about IDocumentLoader.  To load a document, the user now needs to know about creating an IDocumentLoader and feeding it into the constructor — it’s more complicated.  I often see people do this as the default step for abstracting their file operations — they alter the code to make it testable, but fail to spot the problems it brings if done at the wrong ‘level’.  If you gnash your teeth every time you have to use your own code, it’s a warning sign that something is wrong.

When we think about it though, why should MyDocument need to know about loading strategies?  In many cases, we can parse a file and produce some output using a factory or builder instead.  E.g.:

Approach Three

 // #3: Break loading + doc creation into two distinct parts
 void DecoupledAndTestable()
 {
    // loading is now a separate step :)
    MyDocument doc;
    using(var docLoader = new DocumentLoader("filenameToLoad.doc"))
    {
       doc = docLoader.LoadDocument();
    }

    // Similarly, we can do something like this
    var testDoc = new TestLoader()
                      .LoadDocumentFromText(SomeResourceFile.ValidDocument);
 }

To clarify how this works: The DocumentLoader would parse the .doc file and construct the object instances required to build up a real document, then pass them into the document’s constructor (or build up the document via some other means, such as iteratively calling methods on a blank document to fill it up as new items are found — whatever makes sense).  This totally decouples the loading and instantiation, meaning we can test each step in isolation.

I.e. the flow goes: Read Input => Parse Input => Create Document from Parsed Input

Life after the File System

Once you’re no longer dependent on the file system, you are free to use one of many strategies for loading/creating your type.  Depending on the abstraction, some options include:

  • Just declare your data inline in the test methods as a const string, or as a const/readonly field of the test fixture.  This works well for small amounts of text.
  • Add your test files as text file resources.  You can then access the file contents as a static string property.  This is handy, as you get a strongly typed resource name and don’t need to mess around with paths + copying files.  This works well for larger sets of data, or data you want to re-use in multiple tests.
  • Use embedded resources & GetManifestResourceStream.  This is slightly messier; it doesn’t require copying files, but it does require that the namespace + filenames used to reference the embedded resources are correct (runtime failures ahoy).  You also need to handle streams when using this method.
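For the embedded resource option, a small helper takes some of the sting out of the stream handling and the ‘runtime failures ahoy’ problem.  This is only a sketch; TestResources and ReadEmbeddedText are names I’ve made up, and the resource name you pass in is whatever your project’s default namespace and file name combine to.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class TestResources
{
    // Reads an embedded resource as text, failing loudly (and helpfully)
    // when the namespace + filename combination is wrong.
    public static string ReadEmbeddedText(Assembly assembly, string resourceName)
    {
        // GetManifestResourceStream returns null rather than throwing
        // when the resource cannot be found.
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        {
            if (stream == null)
            {
                throw new FileNotFoundException(
                    "No embedded resource called '" + resourceName + "'. Available: " +
                    string.Join(", ", assembly.GetManifestResourceNames()));
            }

            using (var reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

Listing the available names in the exception message means a typo in the resource name shows up as an obvious test failure rather than a mystery NullReferenceException.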

If my loading logic deals with strings, I tend to just build an ‘inner’ parser that works with the strings, then wrap it in another class that opens and reads files, then passes the (raw) string to the ‘inner’ parser class.  This allows me to thoroughly test the parsing logic independent of the file system, but also means I can re-use it for test classes or other cases.  I.e. I can exercise more of the production code without any of the file system pain :)

Depending on the thing being loaded, this isn’t always the best solution, but for relatively simple loading I tend to favour this method.
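The inner/outer split might look something like the sketch below.  The one-paragraph-per-line format is a toy assumption to keep the example short (a real .doc is nothing like this), and DocumentParser and FileBackedDocumentLoader are illustrative names rather than code from a real project.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public class MyDocument
{
    public IList<string> Paragraphs { get; private set; }

    public MyDocument(IList<string> paragraphs)
    {
        Paragraphs = paragraphs;
    }
}

// The 'inner' parser: works purely on strings, so tests never touch the disk.
public class DocumentParser
{
    public MyDocument Parse(string text)
    {
        var paragraphs = new List<string>();
        foreach (var line in text.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Trim copes with Windows line endings and stray whitespace.
            var trimmed = line.Trim();
            if (trimmed.Length > 0)
            {
                paragraphs.Add(trimmed);
            }
        }
        return new MyDocument(paragraphs);
    }
}

// The thin outer wrapper: the only class that knows files exist.
public class FileBackedDocumentLoader
{
    private readonly DocumentParser parser = new DocumentParser();

    public MyDocument Load(string path)
    {
        return parser.Parse(File.ReadAllText(path));
    }
}
```

A unit test can then call new DocumentParser().Parse("Hello\r\n\r\nWorld") directly and assert on the two paragraphs that come back, leaving FileBackedDocumentLoader to a handful of integration tests.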

Strongly typed commandline arguments

Posted in c# on September 28th, 2009 by Mark Simpson

I’ve read quite a bit about Static Reflection and found it to be very appealing, but I hadn’t used it… until now!  Please have a quick look at the article, as I’m not going to parrot its key points.  Instead, I’m going to write something that is horrendously over-engineered to solve a trivial problem!  P.s. I apologise for interchanging arguments/parameters throughout this post.  My attention span is akin to that of a hey did anyone play Batman yet?

A bit of background on something I’m working on:  I have a c# app that is responsible for starting other processes.  Every now and then, the arguments/parameters go out of sync — I change the parameter list in the callee process and the caller, with its piddly weakly-typed guesses, causes the callee to bomb out as the arguments supplied do not match the parameters required.

E.g. I’d write something like:

public class ProcessFactory
{
    public Process CreateProcess()
    {
        // lots of stuff happens, then I try to make the arguments string
        // to set as part of the process start info
        string arguments = string.Format(
            "-IceCream:{0}", numberOfScoops);
    }
}

But oh, oh no!  I’d renamed “IceCream” to “JimmyNeedsSomeIceCream” or removed it.  Since there is no communication between the two processes, it’s hard to write an integration test that proves all is well and, besides, most of the time it’s just down to me renaming or removing an argument.  When it bombs out, I then have to go setting breakpoints in the callee or trawling through log files.  Not ideal.  I’d rather the compiler told me when there was an obvious problem.  So, my challenge was to make the command line arguments more robust.

You need a target to hit

Firstly, I’ll say that if you have non-trivial arguments to parse, the first piece of the puzzle is to grab a good parser.  I use one written by Peter Hallam (the link on his blog forwards you to a defunct site, but you can find the source in loads of open source projects) which works really nicely. I think I’d rather stop a bus with my face than write another crap command line parser, so it’s always nice to drop an existing, proven one in.

Here’s an example arguments class.  Notice that the types are not strings, they can be any simple type that can be parsed:

public class TargetProcessArguments
{
    /// <summary>Allow multiple instances?</summary>
    [Argument(ArgumentType.AtMostOnce, HelpText = "I'm a knife.  Knifin' around.")]
    public int NumberOfScoops;

    // etc. and so forth
}

Anyway, now you have the parameter names, half of the battle is won.  Move the arguments class into a common assembly that both the caller and the callee can access.  The next step is using those parameter names in a strongly typed fashion.

Latch on to the target

For this, static reflection fits the bill.  Using a slightly modified version of the static reflection code, you can already write code like this:

[Test]
public void GetName_WithValidField_ReturnsFieldName()
{
    var fieldName = StaticReflection.GetName<TestClass>(x => x.TestField);
    Assert.That(fieldName, Is.EqualTo("TestField"));
}

In the above snippet, I’m getting the name of a field or property of a type using static reflection.  If the field or property disappears, the compiler will tell me.  If it is renamed, the code will update as with normal refactoring.

Here’s my barely modified example based on the first link in this post:

using System;
using System.Linq.Expressions;

/// <summary>
/// http://www.lostechies.com/blogs/gabrielschenker/archive/
/// 2009/02/03/dynamic-reflection-versus-static-reflection.aspx
/// </summary>
public static class StaticReflection
{
    /// <summary>
    /// Works with either a property or
    /// a field (simplifies use)
    /// </summary>
    public static string GetName<TEntity>(
            Expression<Func<TEntity, object>> expression)
    {
        var memberExpression = GetMemberExpression(expression);
        return memberExpression.Member.Name;
    }

    private static MemberExpression GetMemberExpression<T>(
         Expression<Func<T, object>> expression)
    {
        MemberExpression memberExpression = null;
        if (expression.Body.NodeType == ExpressionType.Convert)
        {
            var body = (UnaryExpression)expression.Body;
            memberExpression = body.Operand as MemberExpression;
        }
        else if (expression.Body.NodeType == ExpressionType.MemberAccess)
        {
            memberExpression = expression.Body as MemberExpression;
        }

        if (memberExpression == null)
        {
            throw new ArgumentException(
                "Not a member access", "expression");
        }

        return memberExpression;
    }
}
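
The two branches in GetMemberExpression exist because the lambda returns object: a value-type member has to be boxed, so the compiler wraps the member access in a Convert node, whereas a reference-type member appears as a plain member access.  Here is a small sketch that makes the difference visible.  The Sample and ExpressionShapes names are invented for illustration and are not part of the original code.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical type, invented for this sketch.
public class Sample
{
    public int Count;      // value type: must be boxed to be returned as object
    public string Name;    // reference type: no boxing needed
}

public static class ExpressionShapes
{
    // Reports the kind of node sitting at the root of the lambda body.
    public static ExpressionType BodyKind(
        Expression<Func<Sample, object>> expression)
    {
        return expression.Body.NodeType;
    }
}
```

Calling ExpressionShapes.BodyKind(x => x.Count) should report Convert, while ExpressionShapes.BodyKind(x => x.Name) should report MemberAccess, which is why the helper has to unwrap the UnaryExpression in the first case.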

Let’s take it a bit further, Jimmy.  It doesn’t take much effort at all to write a command line argument builder based on the same principles.  The existing Static Reflection code already operates on an expression tree.  All we need to do is receive an expression tree from the user (and, if applicable, a value to go with the param name) and add it to our argument string.

5 minutes later

I wrote a 5 minute job that contained a StringBuilder and added some formatting and a nice fluent interface.  Behold!

[Test]
public void TestParamPair_WithFieldExpression_ParamPairIsWrittenCorrectly()
{
    string result = new CommandLineBuilder<TestArguments>()
        .ParamPair(x => x.NumberOfScoops, "3")
        .Build();

    Assert.That(result, Is.EqualTo("-NumberOfScoops:3"));
}

So there we go.  I can now modify my command line arguments in one project and all callers will be brought up to speed at compile time (or the compiler will tell me that they’ve gone out of sync).  Of course, they might still be semantically incorrect, but hey, you can’t have everything.
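
For the curious, here is a minimal sketch of what such a builder might look like.  It is not the actual 5 minute job: the class layout, the space separator between pairs and the TestArguments class are assumptions, and only the "-Name:value" format is taken from the test above.

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

// Hypothetical arguments class, standing in for the one used in the test.
public class TestArguments
{
    public int NumberOfScoops;
}

// A sketch only; the real builder may differ.
public class CommandLineBuilder<TArgs>
{
    private readonly StringBuilder builder = new StringBuilder();

    // Appends "-MemberName:value", using the expression tree to get the name.
    public CommandLineBuilder<TArgs> ParamPair(
        Expression<Func<TArgs, object>> member, string value)
    {
        if (builder.Length > 0)
        {
            builder.Append(' ');   // assumed separator between pairs
        }

        builder.Append('-').Append(GetName(member)).Append(':').Append(value);
        return this;               // returning this is what makes it fluent
    }

    public string Build()
    {
        return builder.ToString();
    }

    // Same idea as StaticReflection.GetName: unwrap the boxing Convert
    // node if there is one, then read the member name.
    private static string GetName(Expression<Func<TArgs, object>> expression)
    {
        Expression body = expression.Body;
        if (body.NodeType == ExpressionType.Convert)
        {
            body = ((UnaryExpression)body).Operand;
        }

        var memberExpression = body as MemberExpression;
        if (memberExpression == null)
        {
            throw new ArgumentException("Not a member access", "expression");
        }

        return memberExpression.Member.Name;
    }
}
```

Because the member name comes from the lambda rather than a string literal, renaming NumberOfScoops in the shared arguments class updates (or breaks) every caller at compile time.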

You could probably extend this further by adding extra validation to the values of the arguments or by setting it on fire.

Understanding test doubles

Posted in c#, testing, tips on August 22nd, 2009 by Mark Simpson

There is a bewildering array of types of ‘mock’ object available to a tester.  The canonical list of test doubles was popularised by the venerable Martin Fowler in his article “Mocks Aren’t Stubs” (which credits Gerard Meszaros for the vocabulary) and, to me, this list is fairly complete and makes sense.  The reason it makes sense is that I’ve manually written classes that perform these roles.

  • I needed to fill out a parameter list with non-null objects, so I created a dumb class with absolutely no implementation.
  • I wanted to listen in on additions to a list, so I wrote a class that stored the objects, acting as a spy.
  • I wanted to provide canned results to test another part of my system, so I wrote a stub.
  • I needed to ensure a method was called, so I wrote a Mock.
  • I needed a coherent, fast implementation of a class, so I wrote a fake implementation.
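
To make those five roles concrete, here is a hand-rolled sketch of each one against a single interface.  IKeyValueStore and every class name below are hypothetical, invented purely for illustration; they are not the classes I originally wrote.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical collaborator interface, invented for this sketch.
public interface IKeyValueStore
{
    void Put(string key, string value);
    string Get(string key);
}

// Dummy: fills a parameter list and does nothing useful.
public class DummyStore : IKeyValueStore
{
    public void Put(string key, string value) { }
    public string Get(string key) { return null; }
}

// Stub: hands back a canned answer so the code under test can proceed.
public class StubStore : IKeyValueStore
{
    private readonly string canned;
    public StubStore(string canned) { this.canned = canned; }
    public void Put(string key, string value) { }
    public string Get(string key) { return canned; }
}

// Spy: records what was written so the test can inspect it afterwards.
public class SpyStore : IKeyValueStore
{
    public readonly List<string> KeysWritten = new List<string>();
    public void Put(string key, string value) { KeysWritten.Add(key); }
    public string Get(string key) { return null; }
}

// Mock: carries its own expectation and verifies it on request.
public class MockStore : IKeyValueStore
{
    private readonly string expectedKey;
    private int matchingPuts;
    public MockStore(string expectedKey) { this.expectedKey = expectedKey; }
    public void Put(string key, string value)
    {
        if (key == expectedKey) matchingPuts++;
    }
    public string Get(string key) { return null; }
    public void Verify()
    {
        if (matchingPuts != 1)
            throw new InvalidOperationException(
                "Expected exactly one Put for '" + expectedKey + "'.");
    }
}

// Fake: a working, in-memory implementation that is fast enough for tests.
public class FakeStore : IKeyValueStore
{
    private readonly Dictionary<string, string> items =
        new Dictionary<string, string>();
    public void Put(string key, string value) { items[key] = value; }
    public string Get(string key)
    {
        string value;
        return items.TryGetValue(key, out value) ? value : null;
    }
}
```

Note where the assertion lives in each case: with the stub, spy and fake, the test asserts on state afterwards, whereas the mock owns the expectation itself, which is exactly the distinction that trips people up.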

My problem with testing terminology is that it’s harmful to newcomers, especially when the semantics affect the result of the test!  Not only is there a high barrier to entry when it comes to writing accurate, robust and maintainable tests, but the terminology is another unwelcome complication.

For this reason, I personally advocate keeping new testers away from mocking frameworks until they’ve become comfortable with state-based testing and hand-rolled a variety of their own test doubles.

Information Overload

Even when no additional frameworks are involved (i.e. when using vanilla xUnit), writing good unit tests involves a steep learning curve.  It’s easy to take a wrong turn and the quality of the tests written will improve only with experience/guidance.  I was not surprised when I read Roy Osherove’s blog and discovered that the majority of organisations’ attempts to embrace unit testing resulted in failure.

Mocking frameworks like Rhino Mocks are absolutely excellent tools, but they’re yet another thing to learn.  Suddenly the type of object (mock, strict mock, stub etc.) affects the result of the test.  It took me a week to get my head around it, so it doesn’t surprise me when I see newcomers totally abusing these frameworks.  Not only does this create a maintenance nightmare, but it sours their first taste of testing.

The problem is doubly hard to tackle, as mocking frameworks allow you to write the same tests with fewer lines of code.  Developers who are new to testing see this and immediately try to use the mocking framework.  After all, only a fool would eschew such benefits, right?  Well, some people just ‘get it’ from the start.  I’m not one of these people and, in my experience, nor are most others.  Starting with interaction-based testing and mocking frameworks is akin to throwing someone out of the back of a van that’s moving at 70mph and expecting them to start running when they hit the tarmac.  Chances are they’re going to land on their face.

The unfortunate result is that some folk tie themselves in knots.  They don’t understand the responsibilities of each type of testing object and, as a result, create extremely brittle or utterly pointless tests. If you have no idea of what you’re trying to achieve with a test, there is no point in writing it.  I once saw a question on StackOverflow featuring a confused fellow asking why his test did not work.  The poster was using Rhino Mocks and created a Mock of the class under test.  I.e. it was not a collaborator, it was the class under test and he was mocking it!  I suspect he was simply overwhelmed by trying to learn multiple new things.

Other things I’ve witnessed include developers writing obscure lambda expressions using C#, then chaining together Rhino Mocks methods to perform something which somehow works.  When I pointed out that I could barely infer its purpose by reading the code and that writing a hand-rolled stub would probably be a better idea, I was met with “yes, you’re probably right but I want to use Rhino Mocks”.

Warning bells should also start ringing when you return a mock and assert that its method was called by another mock which returns a mock which… errrr!  It’s much easier to make a dog’s dinner of interaction-based testing; a good grounding in state-based testing is essential.

One step at a time

  1. Learn to sit up before you crawl.  Write simple xUnit tests that involve state-based testing.  It doesn’t have to be great, isolated code.  Even writing tests that involve scores of classes is a good way to start.  Finer granularity is something that comes with experience.
  2. Crawl before you walk.  Start to experiment and find better ways of testing pieces of functionality.  Ask yourself whether the test is useful, maintainable etc.  Will other parts of the system break it if they change?  Can you make the components and tests themselves finer grained?  This stage should be about developing your sense of what constitutes a good test.
  3. Walk before you run.  Begin to experiment with different types of test doubles, but hand roll them.  Yes, it’s painful at times, but it will give you a better understanding of roles in tests and the different types of test doubles, even if you don’t have names for them yet.  Furthermore, constantly having to update your hand-rolled stubs when disparate parts of your class change will also give you an appreciation for the interface segregation principle.
  4. Finally, install a mocking framework and start sprinting.

    If you do sprint head-first into a wall, you will be better equipped to understand where you went wrong, as you understand the fundamentals.  You will also have a better grasp of the terminology, as it will be grounded in real, tangible code you’ve written.

    Getting up and running with Fluent NHibernate

    Posted in c# on August 19th, 2009 by Mark Simpson

    I’ve been meaning to try out NHibernate for a good ol’ while.  It’s a long-established and respected O/RM library and one of the authors (Ayende) writes a blog that I’ve read for a long time.

    Anyway, NHibernate is great, but its object => db mappings are a bit of a pain.  They are based on XML, which is verbose and fiddly to write, and the separation from the code makes refactoring and testing mappings somewhat hard.  There are other ways to create mappings in code, such as via attributes, but this approach pollutes your business objects with DB specific code and still doesn’t help with the testing issue.  This and the lack of a LINQ to NHibernate provider are the two main gripes I’ve heard about NHibernate.  The latter problem is getting solved for the next release.

    The solution to the weakly typed mappings problem is already most of the way there.  Step forward Fluent NHibernate.  Fluent NHibernate alleviates these problems by providing both convention-based auto mappings and mappings created via strongly typed code.  Lots of blogs and articles cover Fluent NHibernate; the point of this post is to point out a few gotchas that may occur when getting up and running.

    Firstly, the Fluent NHibernate example project does not have all of the required assemblies when you try to run it.  If you check the InnerException message, it’s clear which assemblies are missing.  From memory, it’s one of the byte code .dlls.  Either set up an assembly reference or create a postbuild step to copy it.

    Moving on:

    When writing my own Noddy sample application, I followed this tutorial and, while it is good, it misses out a few things:

    Gotcha #1:

    If you use SQLite to run your application and/or test your mappings, the version of SQLite provided with Fluent NHibernate is an x86 assembly.  If you have a 64 bit OS and fail to build your project in x86 mode, you’ll get various obscure error messages (instead of a BadImageFormatException or whatever .NET usually throws).  The solution to this particular problem is to set the project(s) to build in x86 mode.

    Gotcha #2:

    The following line will also cause an exception (or at least it did on my PC — running 64 bit Windows 7):

    Id(c => c.Id).GeneratedBy().HiLo("customer");

    Again, it’s a very vague exception.  The article has someone seeking help for the same problem, but no solution.  I asked on StackOverflow and the solution is to either remove the trailing .GeneratedBy…. fluent method calls, or to replace it with something like HiLo("1000").

    There may be implications for making such a change, but when you just want to get a Noddy application up and running so you can do a bit of fiddling about, it’ll do the job. :)

    Invert logical statements to reduce nesting

    Posted in c#, software on June 28th, 2009 by Mark Simpson

    As a test engineer, I spend a lot of my time reading –and making sense of– other people’s code.  I find it interesting that logically equivalent, re-arranged code can be much more easily understood.  Some of this follows on from the layout / style guide in the excellent Code Complete.  Perhaps I have unknowingly assimilated these idioms from reading, understanding and ultimately copying the layout of ‘good’ code.  Either way, they’re useful from a testing point of view, as it’s simpler to reason about code that doesn’t jump around so much.

    As has been stated many times before, in general, the best programmers are the ones who program around the limitations of the human brain.  E.g. splitting a method into self-documenting sub-methods, reducing nesting depth, reducing the number of exits from a loop, grouping related logical statements and so forth.

    Along similar lines, here’s a very simple but effective one: something I like to call “flattening”.

    Reduce nesting by inverting logical statements

    It’s a fairly simple concept.  By flipping a logical test or statement, you change the layout of the code, but the result remains the same.  Compare the following two snippets:

    Approach One

    if(someVal != null)
    {
       if(someOtherVal != null)
       {
           DoSomeStuff();
    
           //... 10 years later
           if(someOtherCondition)
           {
               DoExtraStuff();
           }
       }
    }
    

    Approach Two

    if(someVal == null)
        return;
    
    if(someOtherVal == null)
        return;
    
    DoSomeStuff();
    //... 10 years later
    if(someOtherCondition)
    {
        DoExtraStuff();
    }

    Even with these trivial examples, I think the first one is much more convoluted and harder to follow.  If invariants must hold at the start of a method, it makes much more sense to check them and return / return an error / throw an exception (as appropriate).  If multiple nested if statements are used from the start, the nesting depth in the method is increased by default.  Any further indentations compound the problem and reduce clarity.

    I’ve seen fairly large methods (multiple pages) that suffered from this kind of problem and it made the code much harder to follow, especially when non-trivial work was done towards the end or a value was set earlier and returned later.  By the time you scroll to the end to deal with the problems, you’ve forgotten what came before.

    By contrast, the second method is much flatter.  It basically reads like: “if this is invalid, return.  Done.  If the other one is invalid, return.  Done.  OK, we have valid inputs, so forget the validation phase and get started with the real work”.  It’s like removing a couple of juggling balls from the air — it removes a mental burden because you can simply ignore the initial checking logic.

    You can also do things like introduce the continue keyword in loops and a few other things, but there are often caveats associated with such choices.

    I use ‘flattening’ in a lot of places, but there are occasions when it doesn’t make sense to flatten a method.  Sometimes it’s nicer to deal with the ‘standard’ case first regardless of the extra nesting depth.  In other scenarios, you can reduce nesting through the use of the continue keyword in loops, but lobbing a ‘continue’ keyword half way into a loop body adds a different kind of complexity, even if it means reduced nesting depth.
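
As a rough illustration of the loop case, here is the same method written both ways.  The Flattening class and its method names are made up for this sketch; the two versions are intended to behave identically.

```csharp
using System;
using System.Collections.Generic;

public static class Flattening
{
    // Nested: the real work sits two levels deep inside the loop.
    public static int SumNested(IEnumerable<string> inputs)
    {
        int total = 0;
        foreach (var input in inputs)
        {
            if (input != null)
            {
                int value;
                if (int.TryParse(input, out value))
                {
                    total += value;
                }
            }
        }
        return total;
    }

    // Flattened: invalid items are skipped up front with continue,
    // so the real work stays at the top level of the loop body.
    public static int SumFlattened(IEnumerable<string> inputs)
    {
        int total = 0;
        foreach (var input in inputs)
        {
            if (input == null)
                continue;

            int value;
            if (!int.TryParse(input, out value))
                continue;

            total += value;
        }
        return total;
    }
}
```

With only two checks the flattened version is a clear win.  The caveat kicks in when a continue is buried beneath a dozen lines of work, as the reader then has to notice that the rest of the body is silently skipped.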

    I.e. I like to use this a lot but, like most things, it shouldn’t be used indiscriminately.