PostgreSQL for MySQL DBAs Episode 11: A Survey of Indexes

First, a big thanks for the kind responses to this series, and, as requested, here is an overview of index types available in PostgreSQL. The supporting video with bonus material can be found here.

PostgreSQL has several popular index types including B-tree, Hash, GiST, SP-GiST, GIN, and BRIN. Actually, a couple of those are frameworks that allow you to take advantage of special cases someone else has engineered or allow you to make your own index handler. Each index type uses a different algorithm that is best suited to different types of queries.

B Tree

By default, the CREATE INDEX command creates B-tree (actually b+ trees) indexes. Yup, just like MySQL. B+ Trees are great for binary and range searches.

Shell

test=# CREATE TABLE staff (id) as SELECT 'Employee' || x FROM generate_series(1,500) as g(x);

SELECT 500

test=# CREATE INDEX staff_id ON staff(id);

CREATE INDEX

test=# EXPLAIN (verbose) SELECT * FROM staff WHERE id='Employee101';

QUERY PLAN

-----------------------------------------------------------------------------------

Index Only Scan using staff_id on public.staff (cost=0.27..8.29 rows=1 width=11)

Output: id

Index Cond: (staff.id = 'Employee101'::text)

(3 rows)

The above example shows a simple table created and populated with data. Then an index is created on the sole column in the table. Using EXPLAIN (with the VERBOSE option) shows us that the indexed is used to speed up the query.

Bitmap

Bitmaps are cool as they are used when an index lookup might generate multiple hits on the same heap (data) page or using multiple indexes for a single query is useful. Bitmap indexes allow heap pages to be visited only once for multiple matches, they can merge the results from several indexes with AND/OR filtering, and they are automatically enabled by the optimizer.

MySQL has had hash joins for a while now and you are probably already used to them. Hash indexes store a 32-bit hash code derived from the value of the indexed column. These values can only be compared by simple equality comparisons. The query planner will consider using a hash index whenever an indexed column is involved in a comparison using the equal operator.

GiST is an infrastructure within which many different tree-based indexing strategies can be implemented.
GIST Indexes are not a single kind of index.

Some of the contributed models:

rtree_gist, btree_gist – GiST implementation of R-tree and B-tree
intarray – index support for a one-dimensional array of int4’s
tsearch2 – a searchable (full text) data type with indexed access
ltree – data types, indexed access methods, and queries for data organized as tree-like structures
hstore – a storage for (key,value) data
cube – data type, representing multidimensional cubes

GiST indexes are also capable of optimizing “nearest-neighbor” searches, such as
SELECT * FROM places ORDER BY location point '(101,456)' LIMIT 10;

This will find the ten places closest to a given target point.

SP-GiST

SP-GiST indexes, like GiST indexes, offer an infrastructure that supports various kinds of searches. SP-GiST permits the implementation of a wide range of different non-balanced disk-based data structures, such as quadtrees, k-d trees, and radix trees (tries).
As an example, the standard distribution of PostgreSQL includes SP-GiST operator classes for two-dimensional points

GIN indexes are “inverted indexes” which are appropriate for data values that contain multiple component values, such as arrays. Think text or JSON data for this type of index. And GIN is similar to MySQL’s multi-valued indexes where a key is stored once with many row pointers.

BRIN or Block Range INdexes indexes store summaries of the values stored in consecutive physical block ranges of a table. Thus, they are most effective for columns whose values are well-correlated with the physical order of the table rows. If the data is truly random, or if there is much churn of the key values in a ‘hot’ database, the assumptions underlying BRIN may break down. Like GiST, SP-GiST and GIN, BRIN can support many different indexing strategies, and the particular operators with which a BRIN index can be used vary depending on the indexing strategy. For data types that have a linear sort order, the indexed data corresponds to the minimum and maximum values of the values in the column for each block range.

This supports indexed queries using these operators: < = >

In the following example note that the BRIN index is measured in KILObytes not MEGAbytes.

Shell

test=# create table demo as SELECT generate_series(1,1000000) as x;

SELECT 1000000

test=# create index btree_idx on demo(x);

CREATE INDEX

test=# create index brin_idx on demo using brin(x);

CREATE INDEX

test=# SELECT relname, pg_size_pretty(pg_relation_size(oid))

FROM pg_class WHERE relname LIKE ’brin_%’ OR relname = ’btree_idx’

ORDER BY relname;

relname | pg_size_pretty

-------------------+----------------

brin_idx | 24 kB

brin_random_x | 24 kB

btree_idx | 21 MB

Remember

B-tree is ideal for unique values
BRIN is ideal for the indexing of many columns
GIN is ideal for indexes with many duplicates
SP-GIST is ideal for indexes whose keys have many duplicate prefixes
GIST for everything else

Suggested Reading

I used these two sources in preparing this post. The PostgreSQL manual is an amazing resource and Bruce Momjian is a treasure (and you should be reading all his presentations):

https://www.postgresql.org/docs/current/indexes-types.html
https://momjian.us/main/writings/pgsql/indexing.pdf

The past videos for PostgreSQL for MySQL Database Administrators (DBA) can be found here: episode one, episode two, episode three, episode four, episode five, episode six, episode seven, episode eight, episode nine, and episode ten.

B Tree

Bitmap

SP-GiST

Remember

Recommend

Getting The Most Out Of Your PostgreSQL Indexes

Proposal for Global Indexes in PostgreSQL

PostgreSQL 101 for Non-Postgres DBAs (Simple Backup and Restore)

PostgreSQL for MySQL DBAs Episode 6 - Explaining EXPLAIN (And an Answer to a Bon...

PostgreSQL for MySQL DBAs Episode 7: Vacuuming Tables

PostgreSQL for MySQL DBAs Episode 8: Materialized Views

PostgreSQL For MySQL DBAs Episode 9: References

PostgreSQL for MySQL DBAs: Watch Command

PostgreSQL for MySQL DBAs Episode 12: Transactions

Top 5 MySQL Courses for Programmers and DBAs to Learn Online in 2024 - Best Of L...

About Joyk