Sunday, 16 February 2014

Why rand() / RAND_MAX * (user_high - user_low) + user_low is not good enough

The easiest (though certainly not the best) way to generate a random number is to call the rand() function, which is defined in stdlib.h. This function returns an integer in the range [0, RAND_MAX], where RAND_MAX is also defined in the same header (usually 32,767).

By itself, it is not very useful. A common way to produce a random number within a user specified range is

(double)rand() / RAND_MAX * (user_high - user_low) + user_low

where the double cast is used to avoid integer round down.

This approach works okay as long as the user range is smaller than RAND_MAX. What happens if the user wants to generate random numbers of a larger range, say 0-1,000,000?

If the approach above is used, some numbers may never be produced. To understand why, see that rand() returns an integer within [0, RAND_MAX]. So, there are only 32,768 possible discrete values that can be returned, out of the possible 1,000,001 values.

While this may not have drastic consequences for day-to-day uses, it can prove disastrous for algorithms that involve gambling, say, shuffling a deck. Some further reading should be useful, or if you're in a hurry, just use one of those good RNG libs.

No comments:

Post a Comment