

GitHub - ulid/spec: The canonical spec for ulid
source link: https://github.com/ulid/spec

Universally Unique Lexicographically Sortable Identifier
UUID can be suboptimal for many use cases because:
- It isn't the most character-efficient way of encoding 128 bits of randomness
- UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address
- UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures
- UUID v4 provides no information other than randomness, which can cause fragmentation in many data structures
Instead, herein is proposed ULID:
ulid() // 01ARZ3NDEKTSV4RRFFQ69G5FAV
- 128-bit compatibility with UUID
- 1.21e+24 unique ULIDs per millisecond
- Lexicographically sortable!
- Canonically encoded as a 26 character string, as opposed to the 36 character UUID
- Uses Crockford's base32 for better efficiency and readability (5 bits per character)
- Case insensitive
- No special characters (URL safe)
- Monotonic sort order (correctly detects and handles the same millisecond)
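The figures in the list above follow from the bit widths. A quick check, as a minimal sketch (the variable names are illustrative, not part of any ULID library):

```javascript
// 80 random bits give 2^80 possible values within one millisecond.
const perMillisecond = 2n ** 80n;

// 128 bits at 5 bits per Base32 character need ceil(128 / 5) characters.
const ulidChars = Math.ceil(128 / 5);

// A hyphenated UUID string: 32 hex digits (4 bits each) plus 4 hyphens.
const uuidChars = 128 / 4 + 4;

console.log(Number(perMillisecond).toExponential(2)); // 1.21e+24
console.log(ulidChars, uuidChars); // 26 36
```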
Implementations in other languages
From ourselves and the community!
| Language | Author | Binary Implementation |
| --- | --- | --- |
| C++ | suyash | ✓ |
| Clojure | theikkila | |
| Elixir | Homepolish | ✓ |
| Elixir | merongivian | |
| Elixir | omisego | ✓ |
| Elixir (Ecto) | dcuddeback | ✓ |
| F# | lucasschejtman | |
| Go | oklog | ✓ |
| Haskell | steven777400 | ✓ |
| Java | huxi | ✓ |
| Java | azam | |
| JavaScript | aarondcohen | ✓ |
| Julia | ararslan | |
| Perl 5 | bk | ✓ |
| PHP | Lewiscowles1986 | |
| Python | ahawker | ✓ |
| Python | mdomke | ✓ |
| Ruby | rafaelsales | |
| Rust | mmacedoeu | ✓ |
| Rust | dylanhart | ✓ |
| SQL-Microsoft | rmalayter | ✓ |
| Swift | simonwhitehouse | |
Specification
Below is the current specification of ULID as implemented in this repository.
Note: the binary format has not been implemented in JavaScript as of yet.
 01AN4Z07BY      79KA1307SR9X4MV3
|----------|    |----------------|
 Timestamp          Randomness
   48bits             80bits
Components
Timestamp
- 48 bit integer
- UNIX-time in milliseconds
- Won't run out of space till the year 10889 AD.
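The year quoted above can be checked by feeding the largest 48-bit value to a JavaScript `Date`. This is only a sanity check of the arithmetic; `MAX_TIMESTAMP` is an illustrative name.

```javascript
// Largest value that fits in the 48-bit timestamp component.
const MAX_TIMESTAMP = 2 ** 48 - 1;

// Interpret it as milliseconds since the UNIX epoch.
const lastInstant = new Date(MAX_TIMESTAMP);

console.log(MAX_TIMESTAMP); // 281474976710655
console.log(lastInstant.getUTCFullYear()); // 10889
```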
Randomness
- 80 bits
- Cryptographically secure source of randomness, if possible
Sorting
The left-most character must be sorted first, and the right-most character sorted last (lexical order). The default ASCII character set must be used. Within the same millisecond, sort order is not guaranteed.
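In JavaScript, the default string sort already compares code units left to right, which matches ASCII order for upper-case canonical ULIDs. A small sketch using identifiers that appear in this document:

```javascript
// Three ULIDs from this document, deliberately out of order.
const ids = [
  '01BX5ZZKBKACTAV9WEVGEMMVRZ',
  '01AN4Z07BY79KA1307SR9X4MV3',
  '01ARZ3NDEKTSV4RRFFQ69G5FAV',
];

// With no comparator, Array.prototype.sort orders strings by code unit,
// so the timestamp prefix decides the order.
const sorted = [...ids].sort();

console.log(sorted);
```

Because ULIDs are case insensitive but ASCII order is not, mixed-case input would presumably need to be normalized (for example with `toUpperCase()`) before sorting this way.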
Canonical String Representation
ttttttttttrrrrrrrrrrrrrrrr
where
t is Timestamp (10 characters)
r is Randomness (16 characters)
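Since the two components have fixed widths, splitting a canonical string is a matter of slicing. A minimal sketch, where `splitUlid` is a hypothetical helper and not a function of the `ulid` package:

```javascript
// Hypothetical helper: split a canonical 26-character ULID into its parts.
function splitUlid(ulid) {
  if (typeof ulid !== 'string' || ulid.length !== 26) {
    throw new TypeError('expected a 26-character string');
  }
  return {
    timestamp: ulid.slice(0, 10), // first 10 characters (t)
    randomness: ulid.slice(10),   // remaining 16 characters (r)
  };
}

const parts = splitUlid('01AN4Z07BY79KA1307SR9X4MV3');
console.log(parts.timestamp);  // 01AN4Z07BY
console.log(parts.randomness); // 79KA1307SR9X4MV3
```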
Encoding
Crockford's Base32 is used as shown. This alphabet excludes the letters I, L, O, and U to avoid confusion and abuse.
0123456789ABCDEFGHJKMNPQRSTVWXYZ
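To make the encoding concrete, here is a rough sketch of how a ULID string could be assembled with this alphabet. The names `encodeTime`, `encodeRandom`, and `sketchUlid` are illustrative assumptions, and the code is not the reference implementation; it uses Node's `crypto.randomInt` as the secure random source.

```javascript
const { randomInt } = require('node:crypto');

const ALPHABET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// Encode a millisecond timestamp as 10 Base32 characters, most significant first.
function encodeTime(ms) {
  if (!Number.isInteger(ms) || ms < 0 || ms > 2 ** 48 - 1) {
    throw new RangeError('timestamp must be an integer in [0, 2^48 - 1]');
  }
  let out = '';
  for (let i = 0; i < 10; i++) {
    out = ALPHABET[ms % 32] + out; // lowest 5 bits become the next character
    ms = Math.floor(ms / 32);
  }
  return out;
}

// Draw 16 characters (16 * 5 = 80 bits) from a cryptographically secure source.
function encodeRandom() {
  let out = '';
  for (let i = 0; i < 16; i++) {
    out += ALPHABET[randomInt(32)];
  }
  return out;
}

// Assumption: the current time comes from Date.now() unless one is supplied.
function sketchUlid(now = Date.now()) {
  return encodeTime(now) + encodeRandom();
}

console.log(encodeTime(1469918176385)); // 01ARYZ6S41
console.log(sketchUlid());
```

Arithmetic on plain numbers is safe here because 48-bit values are well below `Number.MAX_SAFE_INTEGER`.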
Monotonicity
When generating a ULID within the same millisecond, we can provide some
guarantees regarding sort order. Namely, if the same millisecond is detected, the random
component is incremented by 1 bit in the least significant bit position (with carrying). For example:
import { monotonicFactory } from 'ulid'

const ulid = monotonicFactory()

// Assume that these calls occur within the same millisecond
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVRZ
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVS0
In the extremely unlikely event that you manage to generate more than 2^80 ULIDs within the same millisecond, or cause the random component to overflow with fewer, the generation will fail.
import { monotonicFactory } from 'ulid'

const ulid = monotonicFactory()

// Assume that these calls occur within the same millisecond
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVRY
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVRZ
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVS0
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVS1
...
ulid() // 01BX5ZZKBKZZZZZZZZZZZZZZZX
ulid() // 01BX5ZZKBKZZZZZZZZZZZZZZZY
ulid() // 01BX5ZZKBKZZZZZZZZZZZZZZZZ
ulid() // throw new Error()!
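The increment-with-carry step can be sketched directly on the 16-character random component. This is one possible way to do it, not necessarily how `monotonicFactory` works internally; `incrementRandom` is a hypothetical name.

```javascript
const ALPHABET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// Add 1 to a Base32 string, carrying leftwards; throw if every character is Z.
function incrementRandom(random) {
  const chars = random.split('');
  for (let i = chars.length - 1; i >= 0; i--) {
    const digit = ALPHABET.indexOf(chars[i]);
    if (digit === -1) {
      throw new Error(`invalid Base32 character: ${chars[i]}`);
    }
    if (digit < 31) {
      chars[i] = ALPHABET[digit + 1]; // no carry needed, done
      return chars.join('');
    }
    chars[i] = ALPHABET[0]; // Z wraps to 0, carry into the next position
  }
  throw new Error('random component overflow');
}

console.log(incrementRandom('ACTAV9WEVGEMMVRZ')); // ACTAV9WEVGEMMVS0
```

A monotonic generator would presumably keep the last timestamp and random component, and call a helper like this whenever a new ULID is requested within the same millisecond.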
Overflow Errors when Parsing Base32 Strings
Technically, a 26-character Base32 encoded string can contain 130 bits of information, whereas a ULID must only contain 128 bits. Therefore, the largest valid ULID encoded in Base32 is 7ZZZZZZZZZZZZZZZZZZZZZZZZZ, which corresponds to an epoch time of 281474976710655 or 2 ^ 48 - 1.
Any attempt to decode or encode a ULID larger than this should be rejected by all implementations, to prevent overflow bugs.
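One way to enforce this limit is to check the first character, which may only carry 3 bits (values 0 through 7), before decoding anything. The sketch below is an assumption about how a parser might do that; `isValidUlid` and `decodeTime` are illustrative names. It accepts lower-case input because ULIDs are case insensitive.

```javascript
const ALPHABET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// First character limited to 0-7 (3 bits), then 25 characters from the alphabet.
const ULID_PATTERN = /^[0-7][0-9A-HJKMNP-TV-Z]{25}$/i;

function isValidUlid(input) {
  return typeof input === 'string' && ULID_PATTERN.test(input);
}

// Decode the 10-character timestamp prefix into milliseconds since the epoch.
function decodeTime(input) {
  if (!isValidUlid(input)) {
    throw new RangeError('not a valid ULID');
  }
  let ms = 0;
  for (const ch of input.slice(0, 10).toUpperCase()) {
    ms = ms * 32 + ALPHABET.indexOf(ch);
  }
  return ms;
}

console.log(isValidUlid('7ZZZZZZZZZZZZZZZZZZZZZZZZZ')); // true
console.log(isValidUlid('8ZZZZZZZZZZZZZZZZZZZZZZZZZ')); // false
console.log(decodeTime('7ZZZZZZZZZZZZZZZZZZZZZZZZZ'));  // 281474976710655
```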
Binary Layout and Byte Order
The components are encoded as 16 octets. Each component is encoded with the Most Significant Byte first (network byte order).
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     32_bit_uint_time_high                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     16_bit_uint_time_low      |      16_bit_uint_random       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
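As noted earlier, the JavaScript implementation does not provide the binary format, so the following is only a sketch of how the layout above could be produced from a canonical string. `ulidToBytes` is a hypothetical function; it uses `BigInt` because 128 bits exceed the safe integer range.

```javascript
const ALPHABET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// Convert a 26-character ULID into 16 octets, most significant byte first.
function ulidToBytes(ulid) {
  if (typeof ulid !== 'string' || ulid.length !== 26) {
    throw new TypeError('expected a 26-character string');
  }
  let value = 0n;
  for (const ch of ulid.toUpperCase()) {
    const digit = ALPHABET.indexOf(ch);
    if (digit === -1) {
      throw new RangeError(`invalid Base32 character: ${ch}`);
    }
    value = value * 32n + BigInt(digit);
  }
  if (value >= 1n << 128n) {
    throw new RangeError('ULID exceeds 128 bits');
  }
  const bytes = new Uint8Array(16);
  for (let i = 15; i >= 0; i--) {
    bytes[i] = Number(value & 0xffn); // fill from the least significant end
    value >>= 8n;
  }
  return bytes;
}

// Bytes 0-5 hold the 48-bit timestamp, bytes 6-15 the 80-bit randomness.
console.log(ulidToBytes('01ARZ3NDEKTSV4RRFFQ69G5FAV'));
```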
Prior Art
Partly inspired by: