Skip to main content
Updated answer based on latest comments; added 49 characters in body
Source Link
Saeed
  • 846
  • 5
  • 6

EDIT 2:
I played around with the code snippet in your pastie link and I was originally getting the same results as you, but I think I've finally gotten to the bottom of it.

What's happening is that without that Func wrapper, the jitter is able to do some serious optimizing. I know it's the jitter and not the C# compiler, because all of the code is intact in the ildasm output. In fact, for 2 out of your 3 methods (mod and mask), it's able to deduce that the methods aren't accomplishing anything at all (because they're doing a simple calculation, and the return value is just discarded), so it doesn't even bother calling them.

To validate that statement, simply comment out the line in WrapIndex() and return a plain 0, which is sure to get optimized out. Run your program again and you'll see that all three methods now report the same times. There's no way that doing a mod or a mask has the same cost as returning a constant 0, so that tells you that the code just isn't being executed.

This is another issue you have to be aware of when doing perf tests. If your test is too simple, the optimizer will just discard all of your code.

In my test, the Func wrapper was preventing the jitter from optimizing as much, because it doesn't know which code is going to be executed, so it can't inline and then discard the code, which is why you get more reasonable numbers.

So I think the results from my code, using the Func delegate, give a more accurate reflection of the relative costs of your three index methods.

EDIT 2:
I played around with the code snippet in your pastie link and I was originally getting the same results as you, but I think I've finally gotten to the bottom of it.

What's happening is that without that Func wrapper, the jitter is able to do some serious optimizing. I know it's the jitter and not the C# compiler, because all of the code is intact in the ildasm output. In fact, for 2 out of your 3 methods (mod and mask), it's able to deduce that the methods aren't accomplishing anything at all (because they're doing a simple calculation, and the return value is just discarded), so it doesn't even bother calling them.

To validate that statement, simply comment out the line in WrapIndex() and return a plain 0, which is sure to get optimized out. Run your program again and you'll see that all three methods now report the same times. There's no way that doing a mod or a mask has the same cost as returning a constant 0, so that tells you that the code just isn't being executed.

This is another issue you have to be aware of when doing perf tests. If your test is too simple, the optimizer will just discard all of your code.

In my test, the Func wrapper was preventing the jitter from optimizing as much, because it doesn't know which code is going to be executed, so it can't inline and then discard the code, which is why you get more reasonable numbers.

So I think the results from my code, using the Func delegate, give a more accurate reflection of the relative costs of your three index methods.

Updated function to take % out of loop
Source Link
Saeed
  • 846
  • 5
  • 6

Your results do seem odd, so I tried to rewrite your test and see if I can get different results. I suspected that you might be starting and stopping the stopwatch too frequently, which can skew your results due to the loss of precision every time you start and stop. In my code, I only start and stop the stopwatch once per type of calculation.

Using the code below, I get results more in line with what you would expect, namely that masking is the fastest, followed by plain subtraction, followed by modulo division.

Here is a typical output from one of my test runs:

Plain: 059.016374547033
Mod: 064.016983986872
Mask: 058.015272841923

In various runs, Plain and Mask tend to vary with respect to each other, and sometimes show very similar numbers. Mod always tends to be slower.

And here is the code:

static void Main(string[] args)
{
    TestPerf(WrapIndex, "Plain");
    TestPerf(WrapIndexMod, "Mod");
    TestPerf(WrapIndexMask, "Mask");
}

public static void TestPerf(Func<int, int, int, int> getIndex, string label, int numRuns = 10000000010000)
{
    int index = 256;
    int endIndex = 0;
    int maxSize = 4096;

    Stopwatch sw = new Stopwatch();
    sw.Start();

    for (int i = 0; i < numRuns; i++)
    {
        getIndex(index, endIndex, maxSize);

  for (int index = 0; index //< changemaxSize; indexesindex++)
        endIndex++;{
        endIndex = endIndex % maxSize;

      getIndex(index, 1234, index++;maxSize);
        index = index % maxSize;}
    }

    sw.Stop();
    Console.WriteLine(string.Format("{0}: {1}", label, ((double)sw.ElapsedTicks) / numRuns));
}

EDIT: Also notice that I cast ElapsedTicks to double before dividing to avoid rounding. You should be careful to avoid rounding in your calculations because that just adds noise to your final result.

Your results do seem odd, so I tried to rewrite your test and see if I can get different results. I suspected that you might be starting and stopping the stopwatch too frequently, which can skew your results due to the loss of precision every time you start and stop. In my code, I only start and stop the stopwatch once per type of calculation.

Using the code below, I get results more in line with what you would expect, namely that masking is the fastest, followed by plain subtraction, followed by modulo division.

Here is a typical output from one of my test runs:

Plain: 0.01637454
Mod: 0.01698398
Mask: 0.01527284

And here is the code:

static void Main(string[] args)
{
    TestPerf(WrapIndex, "Plain");
    TestPerf(WrapIndexMod, "Mod");
    TestPerf(WrapIndexMask, "Mask");
}

public static void TestPerf(Func<int, int, int, int> getIndex, string label, int numRuns = 100000000)
{
    int index = 256;
    int endIndex = 0;
    int maxSize = 4096;

    Stopwatch sw = new Stopwatch();
    sw.Start();

    for (int i = 0; i < numRuns; i++)
    {
        getIndex(index, endIndex, maxSize);

        // change indexes
        endIndex++;
        endIndex = endIndex % maxSize;

        index++;
        index = index % maxSize;
    }

    sw.Stop();
    Console.WriteLine(string.Format("{0}: {1}", label, ((double)sw.ElapsedTicks) / numRuns));
}

EDIT: Also notice that I cast ElapsedTicks to double before dividing to avoid rounding. You should be careful to avoid rounding in your calculations because that just adds noise to your final result.

Your results do seem odd, so I tried to rewrite your test and see if I can get different results. I suspected that you might be starting and stopping the stopwatch too frequently, which can skew your results due to the loss of precision every time you start and stop. In my code, I only start and stop the stopwatch once per type of calculation.

Using the code below, I get results more in line with what you would expect, namely that masking is the fastest, followed by plain subtraction, followed by modulo division.

Here is a typical output from one of my test runs:

Plain: 59.7033
Mod: 64.6872
Mask: 58.1923

In various runs, Plain and Mask tend to vary with respect to each other, and sometimes show very similar numbers. Mod always tends to be slower.

And here is the code:

static void Main(string[] args)
{
    TestPerf(WrapIndex, "Plain");
    TestPerf(WrapIndexMod, "Mod");
    TestPerf(WrapIndexMask, "Mask");
}

public static void TestPerf(Func<int, int, int, int> getIndex, string label, int numRuns = 10000)
{
    int maxSize = 4096;

    Stopwatch sw = new Stopwatch();
    sw.Start();

    for (int i = 0; i < numRuns; i++)
    {
        for (int index = 0; index < maxSize; index++)
        {
            getIndex(index, 1234, maxSize);
        }
    }

    sw.Stop();
    Console.WriteLine(string.Format("{0}: {1}", label, ((double)sw.ElapsedTicks) / numRuns));
}

EDIT: Also notice that I cast ElapsedTicks to double before dividing to avoid rounding. You should be careful to avoid rounding in your calculations because that just adds noise to your final result.

Removed the index methods because they duplicate the code in the question and aren't relevant to the answer.
Source Link
Saeed
  • 846
  • 5
  • 6

Your results do seem odd, so I tried to rewrite your test and see if I can get different results. I suspected that you might be starting and stopping the stopwatch too frequently, which can skew your results due to the loss of precision every time you start and stop. In my code, I only start and stop the stopwatch once per type of calculation.

Using the code below, I get results more in line with what you would expect, namely that masking is the fastest, followed by plain subtraction, followed by modulo division.

Here is a typical output from one of my test runs:

Plain: 0.01637454
Mod: 0.01698398
Mask: 0.01527284

And here is the code:

static void Main(string[] args)
{
    TestPerf(WrapIndex, "Plain");
    TestPerf(WrapIndexMod, "Mod");
    TestPerf(WrapIndexMask, "Mask");
}

public static void TestPerf(Func<int, int, int, int> getIndex, string label, int numRuns = 100000000)
{
    int index = 256;
    int endIndex = 0;
    int maxSize = 4096;

    Stopwatch sw = new Stopwatch();
    sw.Start();

    for (int i = 0; i < numRuns; i++)
    {
        getIndex(index, endIndex, maxSize);

        // change indexes
        endIndex++;
        endIndex = endIndex % maxSize;

        index++;
        index = index % maxSize;
    }

    sw.Stop();
    Console.WriteLine(string.Format("{0}: {1}", label, ((double)sw.ElapsedTicks) / numRuns));
}
 
// plain wrap
public static int WrapIndex(int index, int endIndex, int maxSize)
{
    return (endIndex + index) > maxSize ? (endIndex + index) - maxSize : endIndex + index;
}

// wrap using mod
public static int WrapIndexMod(int index, int endIndex, int maxSize)
{
    return (endIndex + index) % maxSize;
}

// wrap my masking out the top bits
public static int WrapIndexMask(int index, int endIndex, int maxSize)
{
    return (endIndex + index) & (maxSize - 1);
}

EDIT: Also notice that I cast ElapsedTicks to double before dividing to avoid rounding. You should be careful to avoid rounding in your calculations because that just adds noise to your final result.

Your results do seem odd, so I tried to rewrite your test and see if I can get different results. I suspected that you might be starting and stopping the stopwatch too frequently, which can skew your results due to the loss of precision every time you start and stop. In my code, I only start and stop the stopwatch once per type of calculation.

Using the code below, I get results more in line with what you would expect, namely that masking is the fastest, followed by plain subtraction, followed by modulo division.

Here is a typical output from one of my test runs:

Plain: 0.01637454
Mod: 0.01698398
Mask: 0.01527284

And here is the code:

static void Main(string[] args)
{
    TestPerf(WrapIndex, "Plain");
    TestPerf(WrapIndexMod, "Mod");
    TestPerf(WrapIndexMask, "Mask");
}

public static void TestPerf(Func<int, int, int, int> getIndex, string label, int numRuns = 100000000)
{
    int index = 256;
    int endIndex = 0;
    int maxSize = 4096;

    Stopwatch sw = new Stopwatch();
    sw.Start();

    for (int i = 0; i < numRuns; i++)
    {
        getIndex(index, endIndex, maxSize);

        // change indexes
        endIndex++;
        endIndex = endIndex % maxSize;

        index++;
        index = index % maxSize;
    }

    sw.Stop();
    Console.WriteLine(string.Format("{0}: {1}", label, ((double)sw.ElapsedTicks) / numRuns));
}
 
// plain wrap
public static int WrapIndex(int index, int endIndex, int maxSize)
{
    return (endIndex + index) > maxSize ? (endIndex + index) - maxSize : endIndex + index;
}

// wrap using mod
public static int WrapIndexMod(int index, int endIndex, int maxSize)
{
    return (endIndex + index) % maxSize;
}

// wrap my masking out the top bits
public static int WrapIndexMask(int index, int endIndex, int maxSize)
{
    return (endIndex + index) & (maxSize - 1);
}

EDIT: Also notice that I cast ElapsedTicks to double before dividing to avoid rounding. You should be careful to avoid rounding in your calculations because that just adds noise to your final result.

Your results do seem odd, so I tried to rewrite your test and see if I can get different results. I suspected that you might be starting and stopping the stopwatch too frequently, which can skew your results due to the loss of precision every time you start and stop. In my code, I only start and stop the stopwatch once per type of calculation.

Using the code below, I get results more in line with what you would expect, namely that masking is the fastest, followed by plain subtraction, followed by modulo division.

Here is a typical output from one of my test runs:

Plain: 0.01637454
Mod: 0.01698398
Mask: 0.01527284

And here is the code:

static void Main(string[] args)
{
    TestPerf(WrapIndex, "Plain");
    TestPerf(WrapIndexMod, "Mod");
    TestPerf(WrapIndexMask, "Mask");
}

public static void TestPerf(Func<int, int, int, int> getIndex, string label, int numRuns = 100000000)
{
    int index = 256;
    int endIndex = 0;
    int maxSize = 4096;

    Stopwatch sw = new Stopwatch();
    sw.Start();

    for (int i = 0; i < numRuns; i++)
    {
        getIndex(index, endIndex, maxSize);

        // change indexes
        endIndex++;
        endIndex = endIndex % maxSize;

        index++;
        index = index % maxSize;
    }

    sw.Stop();
    Console.WriteLine(string.Format("{0}: {1}", label, ((double)sw.ElapsedTicks) / numRuns));
}

EDIT: Also notice that I cast ElapsedTicks to double before dividing to avoid rounding. You should be careful to avoid rounding in your calculations because that just adds noise to your final result.

Added an additional point about rounding.
Source Link
Saeed
  • 846
  • 5
  • 6
Loading
Source Link
Saeed
  • 846
  • 5
  • 6
Loading