
Storing coordinates in C# - Performance versus readability

source link: https://markheath.net/post/coord-performance-versus-readability

This year I enjoyed solving the Advent of Code puzzles once again, and one of the recurring themes was needing to deal with coordinates in 2D, 3D and even 4D (as well as hexagonal coordinates again).

Today I thought I'd share a slightly long and rambling tale of a rabbit-hole I went down solving one of the puzzles and a few of the things I discovered along the way.

Storing coordinates

In .NET there are many options for storing coordinates, such as Point, although that brings in an unwanted dependency on System.Drawing. There are also some Vector classes kicking around (including a 3D one) although I didn't need or want floating point coordinates in this case.

I could also have chosen an int[], which is flexible enough to store any number of dimensions, but arrays only get reference equality, so they can't meaningfully be used as the key for a HashSet, which I needed for several puzzles. And so ValueTuple<int,int,int> was the obvious choice and is what I used initially in all the puzzles this year.

ValueTuple limitations

For the most part, value tuples in C# are fine, but they do have a few rough edges. For example, tuple deconstruction doesn't work in LINQ statements, meaning you have to either use the ugly Item1 and Item2 names, or explicitly declare the names everywhere (e.g. (int X, int Y)) which can get a bit repetitive.
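To illustrate what I mean (my own example, not code from the puzzles), here are the two workarounds side by side inside a LINQ query:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    public static void Main()
    {
        // Workaround 1: an unnamed tuple forces Item1/Item2 in the query.
        var points = new List<(int, int)> { (1, 2), (3, 4) };
        var sums1 = points.Select(p => p.Item1 + p.Item2);

        // Workaround 2: repeat the element names at the declaration site.
        var named = new List<(int X, int Y)> { (1, 2), (3, 4) };
        var sums2 = named.Select(p => p.X + p.Y);

        Console.WriteLine(string.Join(",", sums1)); // 3,7
        Console.WriteLine(string.Join(",", sums2)); // 3,7
    }
}
```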

I also wanted to add my own custom methods, such as adding together two coordinates, or enumerating all the "neighbours" of a point. Of course this could be achieved with simple extension methods on an (int,int,int) tuple:

public static (int X, int Y, int Z) Add(this (int X, int Y, int Z) a, 
                                             (int X, int Y, int Z) b)
    => (a.X + b.X, a.Y + b.Y, a.Z + b.Z);
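Here's how such extensions get used in practice; the Neighbours helper below is my own illustration of the kind of method I mean, not code from the original solutions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CoordExtensions
{
    public static (int X, int Y, int Z) Add(this (int X, int Y, int Z) a,
                                                 (int X, int Y, int Z) b)
        => (a.X + b.X, a.Y + b.Y, a.Z + b.Z);

    // Illustration: the 26 neighbouring cells of a 3D point.
    public static IEnumerable<(int X, int Y, int Z)> Neighbours(
        this (int X, int Y, int Z) p) =>
        from dx in Enumerable.Range(-1, 3)
        from dy in Enumerable.Range(-1, 3)
        from dz in Enumerable.Range(-1, 3)
        where (dx, dy, dz) != (0, 0, 0)
        select p.Add((dx, dy, dz));
}

public class Program
{
    public static void Main()
    {
        Console.WriteLine((1, 2, 3).Add((10, 10, 10))); // (11, 12, 13)
        Console.WriteLine((0, 0, 0).Neighbours().Count()); // 26
    }
}
```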

But for the code I was writing it would be really convenient to have a few additional characteristics for the type I used to store coordinates. I wanted it to implement IEnumerable<int> (which ValueTuple<int,int,int> doesn't) and for the coordinate types for 2D, 3D and 4D to share a common base class or interface so I could write generic algorithms that worked against coordinates in any number of dimensions.
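A sketch of the kind of abstraction I had in mind (ICoord, Coord2 and ManhattanMagnitude are my names for illustration, not the final code):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shared interface: each concrete coordinate struct
// (Coord2, Coord3, Coord4) exposes its components as a sequence,
// so generic algorithms can work in any number of dimensions.
public interface ICoord : IEnumerable<int> { }

public readonly struct Coord2 : ICoord
{
    private readonly int x, y;
    public Coord2(int x, int y) { this.x = x; this.y = y; }
    public IEnumerator<int> GetEnumerator() { yield return x; yield return y; }
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CoordAlgorithms
{
    // One algorithm for 2D, 3D and 4D alike: Manhattan distance from origin.
    public static int ManhattanMagnitude(this ICoord c) =>
        c.Sum(v => Math.Abs(v));
}

public class Program
{
    public static void Main() =>
        Console.WriteLine(new Coord2(3, -4).ManhattanMagnitude()); // 7
}
```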

So to clean up my code a little I tried a quick experiment to create my own Coord class.

Making a custom Coordinate class

My first idea was super simple. Just store the coordinate values in an int[]. That way I could very easily implement IEnumerable<int>, and support any arbitrary number of points.

I don't have the original version of my Coord class anymore, but it was something along these lines, with a bit of LINQ thrown in to implement Equals and GetHashCode for an arbitrary number of dimensions. I knew I needed Equals and GetHashCode because I was storing instances in a HashSet.

// n.b. this code has some issues - don't copy this!
public class Coord : IEnumerable<int>
{
    private readonly int[] coords;
    public int this[int index] { get => coords[index]; }
    public Coord(int x, int y) { coords = new[] { x, y}; }
    public Coord(int x, int y, int z) { coords = new[] { x, y, z}; }
    public Coord(IEnumerable<int> c) { coords = c.ToArray(); }
    public override bool Equals(object other)
    {
        if (other is Coord ca)
            return coords.Zip(ca.coords).All(x => x.First == x.Second);
        return false;
    }
    public override int GetHashCode() => coords.Aggregate((a, b) => a ^ b);
    public IEnumerator<int> GetEnumerator() => 
                ((IEnumerable<int>)coords).GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => coords.GetEnumerator();
}

Nice and simple, and although I hadn't particularly thought about performance, I wasn't expecting it to be awful. However, it was terrible. Switching from (int,int,int) to Coord slowed down my solution by almost 100 times!

Performance optimization round one

After a bit of experimentation, I realized that the main source of my performance woes was the implementation of Equals and GetHashCode. I suspected that switching to a struct would help, and I also abandoned the idea of using an int[], storing each dimension as a separate int instead.

This would mean I would need to create separate types for 2D, 3D and 4D coordinates, but they could at least share a common base interface (structs are not allowed to inherit from each other in .NET), and could still implement IEnumerable<int>.

This let me rewrite Equals and GetHashCode with code so simple that surely it had to perform extremely fast, right?

public override bool Equals(object other)
{
    if (other is Coord ca)
        return x == ca.x && y == ca.y && z == ca.z;
    return false;
}
public override int GetHashCode() => x.GetHashCode() ^ 
    y.GetHashCode() ^ z.GetHashCode();

Well, to my surprise, despite being much faster, it was still horribly slow compared to plain old ValueTuple<int,int,int>. What could I be missing?

Proper hash codes

Turns out my hash code algorithm was stupid. The hash code of an integer in .NET is just the value of that integer, and XORing integers together produces the same result regardless of their order. So the hash codes of the coordinates (1,2,3), (3,2,1), (1,3,2) etc. were all identical. This really hurts the performance of a HashSet if you are storing lots of values whose hash codes collide.
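A quick demonstration of the collision (my own snippet):

```csharp
using System;

public class Program
{
    // The naive hash: XOR the components together.
    public static int XorHash(int x, int y, int z) => x ^ y ^ z;

    public static void Main()
    {
        // XOR is commutative, so every permutation of the same three
        // values produces the same hash and lands in the same bucket.
        Console.WriteLine(XorHash(1, 2, 3)); // 0
        Console.WriteLine(XorHash(3, 2, 1)); // 0
        Console.WriteLine(XorHash(1, 3, 2)); // 0
    }
}
```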

This led me to explore the hash code generation used by ValueTuple<int,int,int>.

The first source code I found revealed this implementation at its base:

internal static class HashHelpers
{
    public static readonly int RandomSeed = 
        new Random().Next(int.MinValue, int.MaxValue);

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Combine(int h1, int h2)
    {
        // RyuJIT optimizes this to use the ROL instruction
        // Related GitHub pull request: dotnet/coreclr#1830
        uint rol5 = ((uint)h1 << 5) | ((uint)h1 >> 27);
        return ((int)rol5 + h1) ^ h2;
    }
}
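To see why this fixes the permutation problem, here's a sketch of chaining the pairwise Combine across three components (Hash3 and the seed value are my illustration, not the framework's actual code):

```csharp
using System;

public class Program
{
    // The pairwise combiner from HashHelpers above.
    public static int Combine(int h1, int h2)
    {
        uint rol5 = ((uint)h1 << 5) | ((uint)h1 >> 27);
        return ((int)rol5 + h1) ^ h2;
    }

    // Chain the combiner across all three components, starting from a
    // seed. The rotate-and-add step means component order now matters.
    public static int Hash3(int seed, int x, int y, int z) =>
        Combine(Combine(Combine(seed, x), y), z);

    public static void Main()
    {
        // Unlike plain XOR, permutations no longer collide.
        Console.WriteLine(Hash3(17, 1, 2, 3) == Hash3(17, 3, 2, 1)); // False
    }
}
```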

This greatly improved overall performance, but I was still not quite as fast as just using (int,int,int). I think the actual .NET Core hashing algorithms used by ValueTuple can be found here, but in the end I decided that this very simple implementation from Jon Skeet on StackOverflow (who else) would be fast and good enough for my needs:

public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        hash = hash * 23 + x;
        hash = hash * 23 + y;
        hash = hash * 23 + z;
        return hash;
    }
}

Performance optimization round two

By this stage, I'd accomplished my goal of making a Coord type that made my code more generic and readable, and performed reasonably well. But annoyingly it still wasn't quite as fast as ValueTuple.

I got a little bit more of a speedup by directly implementing IEquatable<Coord> as suggested here.
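A minimal sketch of what that looks like (my reconstruction, assuming a struct with x, y and z fields, not the post's exact code). Implementing IEquatable<Coord> gives HashSet<Coord> a strongly-typed Equals to call, avoiding boxing through Equals(object):

```csharp
using System;
using System.Collections.Generic;

public readonly struct Coord : IEquatable<Coord>
{
    public readonly int x, y, z;
    public Coord(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }

    // Strongly-typed equality: no boxing when called by HashSet<Coord>.
    public bool Equals(Coord other) =>
        x == other.x && y == other.y && z == other.z;

    public override bool Equals(object other) =>
        other is Coord c && Equals(c);

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + x;
            hash = hash * 23 + y;
            hash = hash * 23 + z;
            return hash;
        }
    }
}

public class Program
{
    public static void Main()
    {
        var set = new HashSet<Coord> { new Coord(1, 2, 3), new Coord(1, 2, 3) };
        Console.WriteLine(set.Count); // 1
    }
}
```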

But at that point I was running out of ideas. Even pre-calculating the hash in the constructor didn't speed things up at all, and a few other off-the-wall ideas couldn't quite make my Coord type as fast as just using (int,int,int).

However, I suspect that part of the difference was that I wasn't doing proper benchmarking. My Coord class was compiled under debug, whereas the ValueTuple would have been a release build. So it's quite possible that my Coord type can actually match ValueTuple in a fair fight.

Obviously BenchmarkDotNet would be the ideal tool to use if I really wanted to properly compare the two approaches.

Operator overloading

One of the goals of creating my own Coord type was to make useful helper methods directly available. One of those was an Add method. This is obviously a good candidate for operator overloading, which can be achieved in C# with the following syntax:

public static Coord operator +(Coord a, Coord b)
{
    return new Coord(a.x + b.x, a.y + b.y, a.z + b.z);
}
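A usage sketch, using a C# 10 record struct as a minimal stand-in for the real Coord type (record structs get value equality and == for free):

```csharp
using System;

// Minimal stand-in for the Coord struct, just to show the operator in use.
public readonly record struct Coord(int X, int Y, int Z)
{
    public static Coord operator +(Coord a, Coord b) =>
        new Coord(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
}

public class Program
{
    public static void Main()
    {
        var pos = new Coord(1, 6, 2);
        var up = new Coord(0, 1, 0);
        // Movement deltas now read naturally.
        Console.WriteLine((pos + up) == new Coord(1, 7, 2)); // True
    }
}
```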

Tuple deconstructing

One new technique I was able to apply was "tuple deconstruction". This basically enables you to "unpack" the elements of the struct into their own named variables, just like you can with a regular ValueTuple. All you need to do is implement a Deconstruct method like this:

public void Deconstruct(out int x, out int y, out int z)
{
    x = this.x;
    y = this.y;
    z = this.z;
}

With this in place you can write code like this:

var (a,b,c) = myCoordinate;

I also added some implicit conversion operators, making it easy to switch between my Coord type and ValueTuple<int,int,int>:

public static implicit operator (int, int, int)(Coord c) => 
                                (c.x, c.y, c.z);
public static implicit operator Coord((int X, int Y, int Z) c) => 
                                new Coord(c.X, c.Y, c.Z);

This allows me to write code like this, and benefit from the more concise C# syntax of ValueTuples:

Coord pos = (1,6,2);

Performance versus readability

So I eventually achieved my goal of a Coord type to replace ValueTuple, which made my code read a little better and opened the door to writing more generic code for different numbers of dimensions.

But it did come at a slight performance penalty. Which raises the interesting question of what matters most, performance or readability?

The good news is that in many cases, it's not a tradeoff you need to worry about.

First of all, performance and readability are not necessarily at odds - much of the time, the simpler your code is, the better both its performance and its readability will be. In addition, the more readable your code is, the easier it is to spot inefficiencies and opportunities to improve its performance.

Secondly, not all the code you write needs to be performance tuned to a high degree. It turned out that certain methods on the type I chose to create were being called millions of times a second in a tight loop, and so even small inefficiencies resulted in big slow-downs.

This is why profiling your code is so important before attempting to improve performance. Find out which bits of code are actually taking the most time, and focus your efforts there.

Lessons learned

Obviously this whole exercise was just for a throwaway fun puzzle, but I learned a lot in the process, which is one of the benefits of doing something like Advent of Code.

I certainly learned a few things about how to get fast performance in a HashSet, and this exercise also highlighted the value of having good unit tests. I could very quickly try out different implementations of my Coord class and be sure I hadn't broken anything, as well as being able to use the unit tests as a rudimentary form of benchmarking.

By the way, here's the source code for the Coord class. Sadly I never did get round to extending it to have 2D and 4D versions which was a key reason for making this in the first place, and I also wanted to create a Grid class that provided convenience methods to access elements in a grid by their coordinates.

And of course, I'm sure some of you will be able to let me know in the comments some ways to improve performance further, so I look forward to reading those.

