EDIT 2:
I played around with the code snippet in your pastie link and I was originally getting the same results as you, but I think I've finally gotten to the bottom of it.
What's happening is that without that Func wrapper, the jitter is able to do some serious optimizing. I know it's the jitter and not the C# compiler, because all of the code is intact in the ildasm output. In fact, for 2 out of your 3 methods (mod and mask), it's able to deduce that the methods aren't accomplishing anything at all (because they're doing a simple calculation, and the return value is just discarded), so it doesn't even bother calling them.
To validate that statement, simply comment out the line in WrapIndex() and return a plain 0, which is sure to get optimized out. Run your program again and you'll see that all three methods now report the same times. There's no way that doing a mod or a mask has the same cost as returning a constant 0, so that tells you that the code just isn't being executed.
This is another issue you have to be aware of when doing perf tests. If your test is too simple, the optimizer will just discard all of your code.
In my test, the Func wrapper was preventing the jitter from optimizing as much, because it doesn't know which code is going to be executed, so it can't inline and then discard the code, which is why you get more reasonable numbers.
So I think the results from my code, using the Func delegate, give a more accurate reflection of the relative costs of your three index methods.