A database table typically has an index column. Early on in computing, such an index might have been limited to a single byte, but bytes weren’t always 8 bits long, but that might have meant early databases were limited to 128, 256, 512, perhaps 2048 or 4096 records. Once we standardized on a byte length of 8 bits, and once we started storing more and more data in databases, we had 16-bit and then 32-bit numbers for indices. For some reason, most databases still use signed integers for these, limiting the number of unique numbers to 32,768 or 2,147,483,648 records. Now, 2.1 billion is probably more records than most people use, but Google or Facebook deals with databases on that scale all the time. They don’t use these types of database structures, but the scale is important. If you switch to use unsigned integers, you get nearly 4.3 billion numbers. MySQL allows you to specify a signed 64-bit number for an index: 9.2 quintillion. Even for Google or Facebook that’s probably enough. However, there are more possibilities. An unsigned 64-bit number is 18.4 quintillion. An unsigned 128-bit number is 340 undecillion. Now, one method of generating unique IDs is simply counting up: 0, 1, 2, … 340 undecillion, etc. There are, however, procedural ways of generating random(ish) 128-bit unique IDs. Though the output is 128-bits long, there are limitations, meaning it’s only got 5.3 undecillion possible combinations. The birthday problem gives a 50% probability of a collision after 2.7 quintillion randomly generated unique IDs.

Exact(ish) numbers:

2.7 quintillion: 2,714,922,669,395,445,311.9212030374141782570788350112707255…

5.3 undecillion: 5,316,911,983,139,663,491,615,228,241,121,378,304

340 undecillion: 340,282,366,920,938,463,463,374,607,431,768,211,456

9.2 quintillion: 9,223,372,036,854,775,808

18.4 quintillion: 18,446,744,073,709,551,616

4.3 billion: 4,294,967,296

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.