Waiting for PostgreSQL 15 – Add UNIQUE null treatment option

On 3rd of February 2022, Peter Eisentraut committed patch:

Add UNIQUE null treatment option 
 
The SQL standard has been ambiguous about whether null values in
unique constraints should be considered equal or not.  Different
implementations have different behaviors.  In the SQL:202x draft, this
has been formalized by making this implementation-defined and adding
an option on unique constraint definitions UNIQUE [ NULLS [NOT]
DISTINCT ] to choose a behavior explicitly.
 
This patch adds this option to PostgreSQL.  The default behavior
remains UNIQUE NULLS DISTINCT.  Making this happen in the btree code
is pretty easy; most of the patch is just to carry the flag around to
all the places that need it.
 
The CREATE UNIQUE INDEX syntax extension is not from the standard,
it's my own invention.
 
I named all the internal flags, catalog columns, etc. in the negative
("nulls not distinct") so that the default PostgreSQL behavior is the
default if the flag is false.
 
Reviewed-by: Maxim Orlov <[email protected]>
Reviewed-by: Pavel Borisov <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/[email protected]

This is interesting. Not a problem I encountered myself, but I've seen people on IRC or Slack complain about a version of it.

So, let's see the issue.

Consider this table:

=$ create table test (
    id serial primary key,
    codename text,
    unique (codename)
);

If I'll try to insert the same codename twice it will fail:

=$ insert into test (codename) values ('depesz');
INSERT 0 1
 
=$ insert into test (codename) values ('depesz');
ERROR:  duplicate key value violates unique constraint "test_codename_key"
DETAIL:  Key (codename)=(depesz) already exists.

This is perfectly as expected. What some people find problematic is that we can insert many rows with codename being NULL:

=$ insert into test (codename) values (NULL) returning *;
 id | codename 
----+----------
  3 | 
(1 row)
INSERT 0 1
 
=$ insert into test (codename) values (NULL) returning *;
 id | codename 
----+----------
  4 | 
(1 row)
INSERT 0 1
 
=$ insert into test (codename) values (NULL) returning *;
 id | codename 
----+----------
  5 | 
(1 row)
INSERT 0 1

The rationale is simple – we can't realistically compare NULLs. One NULL value is not equal to another NULL value. It's also not “inequal". The comparison with NULL value returns NULL result – basically meaning “no idea".

Now, with this new change, we can, though, make it so that we can have at most one NULL value:

=$ create table test2 (
    id serial primary key,
    codename text,
    unique NULLS NOT DISTINCT (codename)
);

With this, if I'd try to insert more than one NULL codename:

=$ insert into test2 (codename) values (NULL) returning *;
 id | codename 
----+----------
  1 | 
(1 row)
INSERT 0 1
 
=$ insert into test2 (codename) values (NULL) returning *;
ERROR:  duplicate key value violates unique constraint "test2_codename_key"
DETAIL:  Key (codename)=(null) already exists.

Currently, if you'd just say “UNIQUE" then original behavior is default, but you can always force it using:

UNIQUE NULLS DISTINCT (codename)

When tables are created you can see the difference in \doutput:

=$ \d test
                             Table "public.test"
  Column  |  Type   | Collation | Nullable |             Default              
----------+---------+-----------+----------+----------------------------------
 id       | integer |           | not null | nextval('test_id_seq'::regclass)
 codename | text    |           |          | 
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)
    "test_codename_key" UNIQUE CONSTRAINT, btree (codename)
 
=$ \d test2
                             Table "public.test2"
  Column  |  Type   | Collation | Nullable |              Default              
----------+---------+-----------+----------+-----------------------------------
 id       | integer |           | not null | nextval('test2_id_seq'::regclass)
 codename | text    |           |          | 
Indexes:
    "test2_pkey" PRIMARY KEY, btree (id)
    "test2_codename_key" UNIQUE CONSTRAINT, btree (codename) NULLS NOT DISTINCT

Same way you can add NULLS DISTINCT or NULLS NOT DISTINCT to create unique index:

=$ create unique index t1 on test2 (codename) nulls distinct;
 
=$ create unique index t2 on test2 (codename) nulls not distinct;

That is interesting addition. Thanks to all involved.

Posted on 2022-02-06|Tags constraint, distinct, index, not, null, nulls, pg15, postgresql, unique, waiting|

Your email address will not be published. Required fields are marked *

Comment *

Name

Website

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Waiting for PostgreSQL 15 – Add UNIQUE null treatment option

Waiting for PostgreSQL 15 – Add UNIQUE null treatment option

Leave a Reply

Recommend

When null is not enough: an option type for C#

Unique PostgreSQL index for a set of specific values

Waiting for PostgreSQL 15 – Add support for MERGE SQL command

【PostgreSQL 15】PostgreSQL 15对UNIQUE和NULL的改进 - abce

How Is IT Staff Augmentation A Unique Talent Option For The Startups?

PostgreSQL 15 將可以對透過 UNIQUE 限制 NULL 的唯一性了

Miniot Wheel 2 review: a unique vertical turntable worth waiting for

Leaked Nvidia RTX 4070 may show the mid-range GPU option we've been waiting for

Mastering the Iron Butterfly Option Strategy: A Unique Approach to Options Tradi...

Hybrid testing using PostgreSQL on SAP BTP, hyperscaler option

About Joyk