3

Serial type versus identity columns in PostgreSQL and TypeORM

 2 years ago
source link: https://wanago.io/2022/02/21/serial-type-identity-columns-postgresql-typeorm/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Serial type versus identity columns in PostgreSQL and TypeORM

Node.js SQL

February 21, 2022

So far on this blog, we’ve used the serial type to define autoincrementing ids for our tables. However, TypeORM started fully supporting identity columns very recently. Since PostgreSQL official recommends using them, it is a good moment to go through both approaches and compare them.

Serial type

The serial data type allows us to generate unique integer numbers automatically. Therefore, it comes in handy when declaring a primary key for our table.

CREATE TABLE posts (
  id serial PRIMARY KEY,
  title text,
  content text

The id serial PRIMAR KEY above does a few things for us. First, it creates a sequence.

Sequences

A sequence is a kind of database object that generates a series of integers based on a specification.

When we open the pgAdmin tool, we can find our sequence under Sequences on the left sidebar.

Screenshot-from-2022-02-19-16-09-18.png

We could define such a sequence ourselves by running the following query:

CREATE SEQUENCE IF NOT EXISTS posts_id_seq
    INCREMENT 1
    START 1
    MINVALUE 1
    MAXVALUE 2147483647
    CACHE 1
    NO CYCLE
    OWNED BY posts.id;

There are a few things to note above. Each time we get a new value from the sequence, our sequence increments by one. We could specify a negative value to define a descending sequence instead.

By default, using the serial types gives us a sequence with a maximum value of 2147483647, which equals 231 − 1. However, we can use the bigserial type instead if we want a more extensive sequence. The above gives us a sequence with a maximum value of 9223372036854775807, which equals 2⁶³−1.

The cache option specifies how many numbers should be preallocated in memory for faster access. The default value is 1.

Because the sequence is not cycled, getting a new value from the sequence after reaching its maximum results in error. We could prevent that by using the CYCLE option. When a cycled sequence reaches its limits, the next generated value is the minimum.

Of our cycled sequence is descending, the next generated value after the limit is the maximum.

An important part of the sequence created by the serial keyword is the OWNED BY posts.id clause. Because of that, the sequence is also dropped if we drop the associated column or the whole table.

Manipulating sequences

There are a few important functions that we need to know when manipulating our sequences.

The most important function is the nextval. It advances our sequence and returns the new value. Because of that, using id serial PRIMARY KEY results in the following:

id integer NOT NULL DEFAULT nextval('posts_id_seq')

The above can result in some issues, though. For example, nothing stops us from defining the id manually when inserting the entity.

INSERT INTO posts (
  id, title, content
VALUES (
  'Hello world',
  'Lorem ipsum'

If we get back to relying on the default sequence value, at some point, the current value of our sequence might equal 10.

We can always check the current value of our sequence using currval('posts_id_seq').

INSERT INTO posts (
  title, content
VALUES (
  'Hello world',
  'Lorem ipsum'

If the value of our sequence at this point equals 10, we get an error:

ERROR: duplicate key value violates unique constraint “posts_pkey”
DETAIL: Key (id)=(10) already exists.

We can also manually change the current value of our sequence using the setval function. For example, we can do the following:

SELECT setval('posts_id_seq', 100);

When we do the above and insert a post, it will have an id of 101.

Using the serial data type with TypeORM

To use the serial data type with TypeORM, we can use the @PrimaryGeneratedColumn() without additional arguments.

import { Column, Entity, PrimaryGeneratedColumn } from 'typeorm';
@Entity()
class Post {
  @PrimaryGeneratedColumn()
  id: number;
  @Column()
  title: string;
  @Column()
  content: string;

Doing the above sets the strategy to 'increment' by default and causes TypeORM to use the serial data type.

The identity columns

We can deal with many issues of the serial type using the identity columns. As opposed to the serial type, the identity columns are compliant with the SQL standard.

CREATE TABLE posts (
  id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  title text,
  content text

Doing the above causes the id to be the identity column. Moreover, it has an implicit sequence attached to it:

id integer NOT NULL GENERATED BY DEFAULT AS IDENTITY (
  INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 2147483647 CACHE 1

We can provide different properties of the sequence when defining our identity column. For example, we can make it increment by 2 instead of 1.

With the way we’ve defined the identity column, we can still omit the default value and pass an id.

INSERT INTO posts (
  id, title, content
VALUES (
  'Hello world',
  'Lorem ipsum'

We need to adjust our table definition a bit to fix the above issue.

CREATE TABLE posts (
  id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  title text,
  content text

Thanks to using GENERATED ALWAYS instead of GENERATED BY DEFAULT, trying to define an id explicitly results in an error:

ERROR: cannot insert a non-DEFAULT value into column “id”
DETAIL: Column “id” is an identity column defined as GENERATED ALWAYS.

We still could force an id using the OVERRIDING SYSTEM VALUE statement, though:

INSERT INTO posts (
  id, title, content
OVERRIDING SYSTEM VALUE
VALUES (
  'Hello world',
  'Lorem ipsum'

Using identity columns created with the GENERATED ALWAYS clause makes us less likely to do the above by mistake and mess with our ids.

Identity columns with TypeORM

To use identity columns that have the GENERATED ALWAYS clause with TypeORM, we need to use the @PrimaryGeneratedColumn() decorator with additional arguments.

Make sure to have a TypeORM version of at least 0.2.42 from 16th February of 2022.

import { Column, Entity, PrimaryGeneratedColumn } from 'typeorm';
@Entity()
class Post {
  @PrimaryGeneratedColumn('identity', {
    generatedIdentity: 'ALWAYS'
  public id: number;
  @Column()
  public title: string;
  @Column()
  public content: string;

The above code defines the id as an identity column with the GENERATED ALWAYS clause. If we want to use GENERATED BY DEFAULT instead, we need to use generatedIdentity: 'BY DEFAULT'.

Summary

We’ve gone through both the serial data type and the identity columns in this article. We’ve also had to learn more about sequences to understand how they work. We can safely assume that we should use the identity columns over the serial data type with all that knowledge. We need to remember that we need PostgreSQL 10+ and the latest version of the TypeORM library. Using identity columns instead of the serial data type helps us care more about the integrity of our database.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK