In lots of situations, we may need a unique identifier for an object, for example, when running a database transformation, we may want to create a unique key for each record, or when creating a database, we might want a unique key for each object in the database. In these situations, I’ve seen a lot of people use some variant of the following code:
import random
import string
def generate_random_unique_identifier(length=10):
return ''.join(random.choice(string.ascii_letters) for _ in range(length))
I’ve certainly been guilty of this myself. However, there are a few problems with this approach:
- It’s not guaranteed to be unique. It’s possible that two objects will be assigned the same identifier.
- It’s not very efficient. If we’re generating a lot of identifiers, we’re going to be wasting a lot of CPU cycles generating random strings.
- It depends on the random seed. If we’re using the same random seed, we’re going to get the same identifiers.
This is fine for a lot of situations, but sometimes we want an identifier that we know is unique. In these situations, we can use a UUID. UUID stands for Universally Unique Identifier. It’s a 128-bit number that is guaranteed to be unique across space and time. It’s a standard that was developed by the Open Software Foundation (OSF) as part of the Distributed Computing Environment (DCE).
There are several variants of UUIDs, which all alter different parts of the UUID (usually the node identifier):
- UUID(1): Generates a unique number based on the current date/time and the MAC address of the computer.
- UUID(2): A variant of UUID1 for DCE Security.
- UUID(3/5): Generates a unique number based on hashing a “namespace” identifer, and a “name”. Version 3 uses MD5, and version 5 uses SHA-1.
- UUID(4): Generates a unique number based on random numbers.
Future versions of UUIDs include:
- UUID(6): a field-compatible version of UUIDv1, reordered for improved DB locality
- UUID(7): a time-sortable version of UUID4
- UUID(8): an RFC compatible format for experimenal or vendor-specific use cases.
In python, generating a UUID is as simple as:
import uuid
unique_identifier = uuid.uuid4()
Using UUIDs generates identifiers which are guaranteed to be globally unique, and are also very efficient to generate, not only this, but it’s a lot easier to generate, and doesn’t depend on a source of randomness!