In the previous post, I asked which function, in the following code, would fill the array with 1’s faster and the reason for that.
using BenchmarkDotNet.Attributes; using BenchmarkDotNet.Running; namespace ToArrays { public class Program { public const int Size = 10000; private static void Main() { BenchmarkRunner.Run<Program>(); } [Benchmark] public void IxJ() { var array = new int[Size, Size]; for (var i = 0; i < Size; i++) { for (var j = 0; j < Size; j++) { array[i, j] = 1; } } } [Benchmark] public void JxI() { var array = new int[Size, Size]; for (var i = 0; i < Size; i++) { for (var j = 0; j < Size; j++) { array[j, i] = 1; } } } } }
In this post, I will answer these questions.
Which one is faster?
The algorithm complexities for both functions are the same. But the execution times are different.
Here is the output produced by BenchmarkDotNet on my machine:
/ * Summary * BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17134.407 (1803/April2018Update/Redstone4) Intel Core i7-8550U CPU 1.80GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores Frequency=1945312 Hz, Resolution=514.0564 ns, Timer=TSC .NET Core SDK=2.1.403 [Host] : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT DefaultJob : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT Method | Mean | Error | StdDev | Median | ------- |-----------:|----------:|----------:|-----------:| IxJ | 472.8 ms | 9.543 ms | 23.41 ms | 476.8 ms | JxI | 1,619.6 ms | 88.361 ms | 253.53 ms | 1,723.1 ms |
As you can see, “IxJ” is ~4x faster than “JxI”
Why?
Cleber Dantas nailed it in the comments of the last post.
The real problem is CPU cache; the slower method will perform a lot of cache misses, forcing the CPU to access RAM (which is slower than cache).
Learned lesson
Whenever it is possible, access memory sequentially.