

PostgreSQL for MySQL DBAs Episode 11: A Survey of Indexes
source link: https://www.percona.com/blog/postgresql-for-mysql-dbas-episode-11-a-survey-of-indexes/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

PostgreSQL for MySQL DBAs Episode 11: A Survey of Indexes
First, a big thanks for the kind responses to this series, and, as requested, here is an overview of index types available in PostgreSQL. The supporting video with bonus material can be found here.
PostgreSQL has several popular index types including B-tree, Hash, GiST, SP-GiST, GIN, and BRIN. Actually, a couple of those are frameworks that allow you to take advantage of special cases someone else has engineered or allow you to make your own index handler. Each index type uses a different algorithm that is best suited to different types of queries.
B Tree
By default, the CREATE INDEX command creates B-tree (actually b+ trees) indexes. Yup, just like MySQL. B+ Trees are great for binary and range searches.
test=# CREATE TABLE staff (id) as SELECT 'Employee' || x FROM generate_series(1,500) as g(x); SELECT 500 test=# CREATE INDEX staff_id ON staff(id); CREATE INDEX test=# EXPLAIN (verbose) SELECT * FROM staff WHERE id='Employee101'; QUERY PLAN ----------------------------------------------------------------------------------- Index Only Scan using staff_id on public.staff (cost=0.27..8.29 rows=1 width=11) Output: id Index Cond: (staff.id = 'Employee101'::text) (3 rows) |
The above example shows a simple table created and populated with data. Then an index is created on the sole column in the table. Using EXPLAIN (with the VERBOSE option) shows us that the indexed is used to speed up the query.
Bitmap
Bitmaps are cool as they are used when an index lookup might generate multiple hits on the same heap (data) page or using multiple indexes for a single query is useful. Bitmap indexes allow heap pages to be visited only once for multiple matches, they can merge the results from several indexes with AND/OR filtering, and they are automatically enabled by the optimizer.
MySQL has had hash joins for a while now and you are probably already used to them. Hash indexes store a 32-bit hash code derived from the value of the indexed column. These values can only be compared by simple equality comparisons. The query planner will consider using a hash index whenever an indexed column is involved in a comparison using the equal operator.
GiST is an infrastructure within which many different tree-based indexing strategies can be implemented.
GIST Indexes are not a single kind of index.
Some of the contributed models:
- rtree_gist, btree_gist – GiST implementation of R-tree and B-tree
- intarray – index support for a one-dimensional array of int4’s
- tsearch2 – a searchable (full text) data type with indexed access
- ltree – data types, indexed access methods, and queries for data organized as tree-like structures
- hstore – a storage for (key,value) data
- cube – data type, representing multidimensional cubes
GiST indexes are also capable of optimizing “nearest-neighbor” searches, such asSELECT * FROM places
ORDER BY location point '(101,456)'
LIMIT 10;
This will find the ten places closest to a given target point.
SP-GiST
SP-GiST indexes, like GiST indexes, offer an infrastructure that supports various kinds of searches. SP-GiST permits the implementation of a wide range of different non-balanced disk-based data structures, such as quadtrees, k-d trees, and radix trees (tries).
As an example, the standard distribution of PostgreSQL includes SP-GiST operator classes for two-dimensional points
GIN indexes are “inverted indexes” which are appropriate for data values that contain multiple component values, such as arrays. Think text or JSON data for this type of index. And GIN is similar to MySQL’s multi-valued indexes where a key is stored once with many row pointers.
BRIN or Block Range INdexes indexes store summaries of the values stored in consecutive physical block ranges of a table. Thus, they are most effective for columns whose values are well-correlated with the physical order of the table rows. If the data is truly random, or if there is much churn of the key values in a ‘hot’ database, the assumptions underlying BRIN may break down. Like GiST, SP-GiST and GIN, BRIN can support many different indexing strategies, and the particular operators with which a BRIN index can be used vary depending on the indexing strategy. For data types that have a linear sort order, the indexed data corresponds to the minimum and maximum values of the values in the column for each block range.
This supports indexed queries using these operators: < = >
In the following example note that the BRIN index is measured in KILObytes not MEGAbytes.
test=# create table demo as SELECT generate_series(1,1000000) as x; SELECT 1000000 test=# create index btree_idx on demo(x); CREATE INDEX test=# create index brin_idx on demo using brin(x); CREATE INDEX test=# SELECT relname, pg_size_pretty(pg_relation_size(oid)) FROM pg_class WHERE relname LIKE ’brin_%’ OR relname = ’btree_idx’ ORDER BY relname; relname | pg_size_pretty -------------------+---------------- brin_idx | 24 kB brin_random_x | 24 kB btree_idx | 21 MB |
Remember
- B-tree is ideal for unique values
- BRIN is ideal for the indexing of many columns
- GIN is ideal for indexes with many duplicates
- SP-GIST is ideal for indexes whose keys have many duplicate prefixes
- GIST for everything else
Suggested Reading
I used these two sources in preparing this post. The PostgreSQL manual is an amazing resource and Bruce Momjian is a treasure (and you should be reading all his presentations):
https://www.postgresql.org/docs/current/indexes-types.html
https://momjian.us/main/writings/pgsql/indexing.pdf
The past videos for PostgreSQL for MySQL Database Administrators (DBA) can be found here: episode one, episode two, episode three, episode four, episode five, episode six, episode seven, episode eight, episode nine, and episode ten.
Recommend
-
24
In the Postgres world, indexes are essential to efficiently navigate the table data storage (aka the “heap”). Postgres does not maintain a clustering for the heap, and the MVCC architecture leads to multiple versions of t...
-
40
A global index, by very definition, is a single index on the parent table that maps to many underlying table partitions. The par...
-
5
PostgreSQL 101 for Non-Postgres DBAs (Simple Backup and Restore) Back to the Blog It’s n...
-
8
Explaining EXPLAIN (And an Answer to a Bonus Question) Back to the Blog
-
5
PostgreSQL for MySQL DBAs Episode 7: Vacuuming Tables Back to the Blog
-
12
-
7
PostgreSQL For MySQL DBAs Episode 9: References Back to the Blog Thank you for the many kind responses to this series. There are many out there...
-
6
PostgreSQL for MySQL DBAs: Watch Command Back to the Blog Those new to the realm of PostgreSQL from other databases will find little gems sprink...
-
5
PostgreSQL for MySQL DBAs Episode 12: Transactions February 13, 2023
-
9
Top 5 MySQL Courses for Programmers and DBAs to Learn Online in 2024 Hello guys, if you are interested in learning SQL with MySQL database and looking for some awesome resources e.g. books, tutorials, and online courses then you have...
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK