7

HandlingNULL Values in PostgreSQL

 4 years ago
source link: https://www.percona.com/blog/2020/03/05/handling-null-values-in-postgresql/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Mz632qa.png!web What is NULL?

There is often some confusion about NULL value, as it is treated differently in different languages. So there is an obvious need to clarify what NULL is, how it works in different languages, and what the actual value is behind the NULL. Before going into details, there is also a need to understand the concept of Three-valued logic[1] and Two-valued logic known as bivalent[2]. The bivalent is a  concept of boolean value where value can be true or false, but contrary to bivalent the Three-valued logic can be true, false, or (intermediate value) unknown. Now, back to NULL. In some languages NULL acts as bivalent, and in others, Three-valued logic (especially in databases).

C/C++

In “C/C++” the NULL is defined as 0 in the “stddef.h” which is included <cstddef> in case of C++ and stdlib.h in case of C.

#if defined (_STDDEF_H) || defined (__need_NULL)
#undef NULL     /* in case <stdio.h> has defined it. */
#ifdef __GNUG__
#define NULL __null
#else   /* G++ */
#ifndef __cplusplus
#define NULL ((void *)0)
#else   /* C++ */
#define NULL 0
#endif  /* C++ */
#endif  /* G++ */
#endif  /* NULL not defined and <stddef.h> or need NULL.  */
#undef  __need_NULL

The value can be tested against NULL directly by using the equality operators “==” or !=. Now take an example program and try to check the values against NULL in C.

#include <stddef.h>
#include <stdio.h>
void main()
{
    if ('0' == NULL)
        printf("NULL is '0' \n");
    if ("" == NULL)
        printf("NULL is empty string \n");
    if (' ' == NULL)
        printf("NULL is space \n");
    if (0 == NULL)
        printf("NULL is 0 \n");
}

The output of the above program will be “NULL is 0”, so it is quite evident that NULL is defined as “0” in C language.

Java

Contrary to C where NULL is 0, in Java NULL means the variable references do have value. The value can be tested against NULL by equality operators. When we print the null value, it will print the null. In Java, null is case sensitive and it must be in all small letters as “null”.

public class Test
{ 
    public static void main (String[] args) throws java.lang.Exception
    {
        System.out.println("Null is: " + null);
    }
}
Null is: null

PostgreSQL

In PostgreSQL, NULL means no value. In other words, the NULL column does not have any value. It does not equal 0, empty string, or spaces. The NULL value cannot be tested using any equality operator like “=” “!=” etc. There are some special statements to test the value against NULL, but other than that, no statement can be used to test the NULL value.

Let’s do some interesting comparisons, which will clear up the concept of NULL in PostgreSQL. In the following code snippet, we are comparing 1 with 1 and the obvious result is “t” (TRUE). This leads us to understand that the PostgreSQL equality operator gives us true when two values match. Similarly, the equality operator works for the textual value.

postgres=# SELECT 1 = 1 result;
 result 
--------
 t
(1 row)
postgres=# SELECT 'foo' = 'foo' result;
 result 
--------
 t
(1 row)

Let’s do some more experiments, comparing NULL with NULL. If NULL is a normal value, then the result should be “t”. But NULL is not a normal value, therefore, there is no result of that.

postgres=# SELECT NULL = NULL result;
 result 
--------
 
 
(1 row)

Let’s compare NULL with NULL using an in-equality operator.  The result is the same as what we got previously. That proves that we cannot compare NULL with NULL using equality and inequality operators.

postgres=# SELECT NULL != NULL result;
 result 
--------
 
 
(1 row)

Similarly, no mathematical operation can be performed on NULL. PostgreSQL produces nothing when any NULL is used as an operand.

postgres=# SELECT NULL * 10 is NULL result;
 result 
--------
 t
(1 row)

How to Use NULL

Therefore it is proved that NULL cannot be compared with any value using equality operators. Then how we can use the NULL if we cannot use any operator or mathematical operation? PostgreSQL provides special statements and functions to check and test values against NULL. There is the only way to use the NULL in PostgreSQL.

IS NULL/IS NOT NULL

postgres=# SELECT NULL is NULL result;
 result 
--------
 t
(1 row)
postgres=# SELECT NULL is NOT NULL result;
 result 
--------
 f
(1 row)

COALESCE

PostgreSQL has a function name “COALESCE” [3]. The function takes n number of arguments and returns the first, not null arguments. You can test your expression against NULL using the function.

 COALESCE (NULL, 2 , 1);

NULLIF

There is another function called “NULLIF”[3],  returns NULL if first and second arguments are equal, otherwise returns the first argument, here is the example where we are comparing 10 with 10 and we already know that these are equal so it will return NULL. In the second example, we are comparing 10 with 100 and in that case, it will return 10 the first value.

postgres=# SELECT NULLIF (10, 10);
nullif
--------
      
 
(1 row)
 
postgres=# SELECT NULLIF (10, 100);
nullif
--------
     10
(1 row)

Usage of NULL

If NULL does not have any value, then what is the advantage of NULL? Here are some examples of usage:

In case a field does not have any value, and for example, we have database fields with first/middle and last name. Does, in reality, everybody have a first/middle and last name? The answer is no, there shouldn’t be a field that cannot have any value.

postgres=# CREATE TABLE student(id INTEGER, fname TEXT, sname TEXT, lname TEXT, age INTEGER);
 
postgres=# SELECT * FROM STUDENT;
 id | fname | sname | lname | age 
----+-------+-------+-------+-----
  1 | Adams | Baker | Clark |  21
  2 | Davis |       | Evans | 22
  3 | Ghosh | Hills |       | 24
(3 rows)

Let’s select students who do have a middle name. Does that query work here? No, and the reason behind it is the same we have discussed in earlier queries.

postgres=# SELECT * FROM STUDENT WHERE sname = '';
 id | fname | sname | lname | age 
----+-------+-------+-------+-----
(0 rows)

Let’s select by using the proper statements and get the desired results.

postgres=# SELECT * FROM STUDENT WHERE sname IS NULL;
 id | fname | sname | lname | age 
----+-------+-------+-------+-----
  2 | Davis |       | Evans | 22
(1 row)

The field does not make sense, for example, because the spouse’s name of a single person or children’s details is not “KID”. Here is the example where KID in the divorced field does not make sense. We cannot put true or false, so NULL is the right value here.

postgres=# CREATE TABLE person(id INTEGER, name TEXT, type TEXT, divorced bool);
postgres=# SELECT * FROM person;
 id | name  | type | divorced 
----+-------+-------+---------
  1 | Alice | WOMAN | f
  3 | Davis | KID   | 
  2 | Bob   | MAN | t
(3 rows)

The other usage of NULL is to represent an empty string and empty numeric value. Numeric 0 has significance so it cannot be used to represent the empty numeric filed, the unknown value at a certain time.

In this example there are three students: Alice has 90 marks, Bob has 0 marks, and Davis does not have marks yet. In the case of Bob, we have inserted 0, and for Davis, we have inserted NULL. By doing this we can easily distinguish between who has 0 marks and who does not have results yet.

postgres=# SELECT * FROM students_mark;
 id | name  | marks 
----+-------+-------
  1 | Alex  | 90
  2 | Bob   | 0
  2 | Davis |      
(3 rows)
postgres=# SELECT * FROM students_mark WHERE marks IS NULL;
 id | name  | marks 
----+-------+-------
  2 | Davis |      
(1 row)
postgres=# SELECT * FROM students_mark WHERE marks = 0;
 id | name | marks 
----+------+-------
  2 | Bob  | 0
(1 row)

Conclusion

The purpose of this blog is to be clear about the fact that every language has its own meaning of NULL. Therefore, be careful when you are using NULL, otherwise, you will get erroneous results. Especially in databases (PostgreSQL), NULL has some different concepts, so be careful when writing queries involving NULL.

[1] – https://en.wikipedia.org/wiki/Three-valued_logic

[3] – https://www.postgresql.org/docs/current/functions-conditional.html


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK