Recently, on a project, I encountered the need for uniformly distributed integers within an arbitrary range [a, b] from random bytes, a problem that is quite common and is usually solved using rejection sampling. I was curious, however, if it would be possible to achieve this selection in constant O(1) time without the need for any branching. My thought process led me to the idea of using modular arithmetic.
Imagine a room with a clock that ticks through the range 0 to n - 1, where n is any positive integer that is >= 2. The clock ticks at a constant rate and never stops. Now, if you leave the room and come back after a random amount of time, chosen uniformly from the range 0 to m - 1 (where m >= n), and record the time shown on the clock each visit, you’ll be left with a list of random integers within the desired range.
I am not entirely sure why this works, or at least appears to, but empirical testing suggests that it produces level histograms with no modulo bias.
Both images were created with 10,000,000 random bytes fitted into the range 0 to 251.
Here is my implementation in Python. Please let me know if this has been considered before, and if there is any real mathematical backing to it.
Thanks.
import os
STATE = 0
def random_int(a: int, b: int):
global STATE
int_range = b - a + 1
bytes_needed = (int_range.bit_length() + 7) // 8
value = int.from_bytes(os.urandom(bytes_needed))
STATE = (STATE + value) % int_range
return STATE + a
random_intwith a small range will biasSTATEto a lower value. I plotted the distribution of your function when called with moduli 5 and 252 randomly interspersed, and there was a clear bias towards the low end on the mod-252 plot. The effect is even more marked if you move the higher modulus further away from a power of 2.return a + value % int_rangewould be enough, and I think that without STATE in the picture, successive calls with different ranges would be OK. Also, although I think the analogy with the ticking clock is intriguing, there doesn't seem to be a clear relation to the proposed solution. Hope this helps in some way.