Inserting JSON Data with Default Values in PostgreSQL

2020-04-02 (posted in blog)

So you have a table, and you have some JSON data you want to insert into that table, but for some of the columns that have default values defined, the JSON contains either null or no value at all. This is the first thing you try:

create table my_table (
  value1 text not null,
  value2 integer not null default 100,
  value3 text[] not null default '{}'::text[],
  created_date timestamptz not null default now()
);

insert into my_table
select (json_populate_record(null::my_table,
  '{"value1":"my text"}'::json)).*;

In theory, value1 is the only column you should need to specify a value for, since it’s the only non-nullable column without a default specified. However, executing the above query results in the following error:

ERROR:  null value in column "value2" violates not-null constraint
DETAIL: Failing row contains (my text, null, null, null).
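
You can see the root cause by running json_populate_record on its own; every column the JSON doesn't mention comes back as null (shown here as blanks, the psql default):

select * from json_populate_record(null::my_table,
  '{"value1":"my text"}'::json);
--  value1  | value2 | value3 | created_date
-- ---------+--------+--------+--------------
--  my text |        |        |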

Even though the JSON only specifies value1, json_populate_record fills in all the other columns with null, and the insert fails because value2 can't be null. One way to fix that is to select only the specific columns from the JSON that you know will be populated, like this:

with my_json as (
  select (json_populate_record(null::my_table,
    '{"value1":"my text"}'::json)).*
)
insert into my_table (value1)
select value1 from my_json;
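
That insert succeeds, and selecting the row back confirms the defaults were applied (created_date is omitted here since it depends on when you run it):

select value1, value2, value3 from my_table;
--  value1  | value2 | value3
-- ---------+--------+--------
--  my text |    100 | {}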

This approach works, but it may not be flexible enough. What if you don't know ahead of time which properties of the JSON will be populated? Maybe it has value2 specified, in which case that's what you want to insert, but maybe it doesn't, in which case you want the default.

You can’t just select value2 as part of the insert, because that will try to insert null and fail as in the first example. One way to do this would be to use coalesce in the select:

with my_json as (
  select (json_populate_record(null::my_table,
    '{"value1":"my text"}'::json)).*
)
insert into my_table (value1, value2, value3)
select
  value1,
  coalesce(value2, 100),
  coalesce(value3, '{}'::text[])
from my_json;

But that's lame, because it doesn't actually use the table's default values; you have to manually specify them again in whatever function you're writing. Now you have to maintain the default values for that table in multiple places. Gross.

Now you decide to get clever. You know you can look up a column's default by querying information_schema.columns. From that you can build a JSON object containing all of the default values, then merge the data you're trying to insert on top of it before inserting:

select json_object_agg(column_name, column_default)
from information_schema.columns
where
  table_schema = 'public' and
  table_name = 'my_table';

Results in:

{
  "value1": null,
  "value2": "100",
  "value3": "'{}'::text[]",
  "created_date": "now()"
}

Let’s get rid of those null values so we just have the columns that actually have defaults specified:

select json_strip_nulls(
  json_object_agg(column_name, column_default))
from information_schema.columns
where
  table_schema = 'public' and
  table_name = 'my_table';

This gives us:

{
  "value2": "100",
  "value3": "'{}'::text[]",
  "created_date": "now()"
}

You may have already guessed that there could be a problem with this, but let's give it a try. We have to switch over to using jsonb instead of json so that we can use the || operator, which concatenates two jsonb objects with the right-hand side's keys winning.
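
If you haven't used ||, here it is on its own (the values are arbitrary); a key from the right-hand object replaces the same key on the left:

select '{"a": 1, "b": 2}'::jsonb || '{"b": 3}'::jsonb;
-- {"a": 1, "b": 3}

With that we can concatenate our values over top of the defaults: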

with dflt as (
  select jsonb_strip_nulls(
    jsonb_object_agg(column_name, column_default)) as vals
  from information_schema.columns
  where
    table_schema = 'public' and
    table_name = 'my_table'
)
insert into my_table
select (jsonb_populate_record(null::my_table,
  dflt.vals || jsonb_strip_nulls('{"value1":"my text"}'::jsonb)
)).*
from dflt;

You’ll note that I’m passing my object through jsonb_strip_nulls before concatenating to the defaults; that’s so any properties that are specified with a null value in the object I’m inserting won’t overwrite the defaults. Running the above results in the following error:

ERROR:  malformed array literal: "'{}'::text[]"
DETAIL: Array value must start with "{" or dimension information.

So close! The problem is that the default for value3 is stored as the expression text '{}'::text[], and jsonb_populate_record tries to parse that whole string as an array literal instead of evaluating it as an expression. In fact, if you were to drop the value3 column and retry the above query, it would work! I wasn't expecting it to, but apparently casting the literal string 'now()' to a timestamp actually works as you might expect, which is why the created_date column doesn't cause the same problem. It turns out you can cast other non-date strings to timestamps in PostgreSQL too:

# select 'yesterday'::timestamptz;
      timestamptz
------------------------
 2020-04-01 00:00:00+00

Weird! I didn’t know that before.
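
It turns out 'yesterday' is one of PostgreSQL's documented special date/time input strings, along with values like these:

select 'epoch'::timestamptz;    -- 1970-01-01 00:00:00+00
select 'tomorrow'::timestamptz; -- midnight tomorrow
select 'infinity'::timestamptz; -- later than every other timestamp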

If you don’t have any default values specified that can’t be directly cast from strings (like the text[] array in my example), then the above trick is probably good enough. Build your jsonb object of default values, and then concatenate the object you want to insert on top of it.

If you do have default values that can't be cast from strings, here is a little function that runs through all the columns in a table and builds the defaults object by actually evaluating each default expression:

create function column_defaults_jsonb(
  p_schema text,
  p_table text
) returns jsonb
as $$
declare
  t_defaults jsonb;
  t_value jsonb;
  t_column text;
  t_default text;
begin
  t_defaults := '{}'::jsonb;

  -- walk every column in the table, grabbing its default expression
  for t_column, t_default in
    select column_name, column_default
    from information_schema.columns
    where
      table_schema = p_schema and
      table_name = p_table
  loop
    if t_default is not null then
      -- evaluate the default expression and capture the result as jsonb
      execute 'select to_jsonb(' || t_default || ')' into t_value;
      t_defaults := jsonb_set(t_defaults, array[t_column], t_value);
    end if;
  end loop;

  return t_defaults;
end;
$$ language plpgsql;

By running the above function against our table, you can see that the defaults object comes back with the values already evaluated:

select column_defaults_jsonb('public', 'my_table');

The integer becomes an actual number, the empty array becomes an actual empty JSON array, and today’s date is already populated:

{
  "value2": 100,
  "value3": [],
  "created_date": "2020-04-02T22:40:13.803923+00:00"
}

Now you can use that function in your insert like so:

insert into my_table
select (jsonb_populate_record(null::my_table,
  column_defaults_jsonb('public', 'my_table')
  || '{"value1":"my text"}'::jsonb)).*;

And that's how you do it. It even seems to work for tables with auto-incrementing primary keys, because evaluating the id column's default actually calls nextval() on the sequence and generates the new value. Behold:

create table my_table2 (
  id serial primary key,
  value1 text not null,
  value2 integer not null default 100,
  value3 text[] not null default '{}'::text[],
  created_date timestamptz not null default now()
);

insert into my_table2
select (jsonb_populate_record(null::my_table2,
  column_defaults_jsonb('public', 'my_table2')
  || '{"value1":"my text"}'::jsonb)).*;

However, if you run that insert on the new table a few times, you’ll see something weird with the data:

# select * from my_table2;
 id | value1  | value2 | value3 |         created_date
----+---------+--------+--------+-------------------------------
  1 | my text |    100 | {}     | 2020-04-02 22:51:59.407879+00
  6 | my text |    100 | {}     | 2020-04-02 22:52:05.849629+00
 11 | my text |    100 | {}     | 2020-04-02 22:52:20.624157+00

The id column is incrementing by 5 each time, which is conspicuously the exact same number of columns in the table. It would appear as though the column_defaults_jsonb function is being evaluated multiple times with each insert.
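
The likely culprit is a documented PostgreSQL behavior: writing (f(x)).* in a select list gets macro-expanded into one copy of the expression per output column, so the jsonb_populate_record call, and the column_defaults_jsonb call nested inside it, runs five times, once per column, and each run fires the id sequence's nextval(). One way to avoid that is to move the function call into the from clause, where it is evaluated only once:

insert into my_table2
select * from jsonb_populate_record(null::my_table2,
  column_defaults_jsonb('public', 'my_table2')
  || '{"value1":"my text"}'::jsonb);

With that version, the ids should increment by one per insert.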

Short Permalink for Attribution: rdsm.ca/4ewjy
