search
HomeSystem TutorialLINUXPostgreSQL's journey of exploration

PostgreSQL's journey of exploration

Jan 17, 2024 am 08:15 AM
linuxlinux tutorialRed Hatlinux systemlinux commandlinux certificationred hat linuxlinux video

PostgreSQLs journey of exploration

Postgres has several index types, and every new version seems to add some new index types. Each index type is useful, but which type to use depends on (1) the type of data, sometimes (2) the underlying data in the table and (3) the type of lookup performed. In the following content, we will introduce the index types you can use in Postgres, and when you should use which index type. Before we get started, here's a list of index types we'll walk you through:

B-Tree
Generalized Inverted Index (GIN)
Generalized Inverted Seach Tree (GiST)
Space partitioned GiST (SP-GiST)
Block Range Index (BRIN)
Hash
Now let’s start with indexing.

In Postgres, B-Tree indexes are the most common indexes you use

If you have a computer science degree, then B-Tree indexing may be the first index you learn. B-tree indexes create a tree that always maintains its balance. When it looks for something based on the index, it walks the tree to find the key and returns the data you're looking for. Using an index is significantly faster than a sequential scan because it only needs to read a few pages (when you are returning only a few records) as opposed to sequentially scanning thousands of records.

If you run a standard CREATE INDEX statement, it will create a B-tree index for you. B-tree indexes are valuable on most data types, such as text, numbers, and timestamps. If you're just starting to use indexes in your database, and don't use too many of Postgres' advanced features on your database, using a standard B-Tree index is probably your best option.

GIN index for multi-valued columns

Generalized Inverted Index, generally called GIN, is mostly suitable for data types when a single column contains multiple values.

According to Postgres documentation:

"GIN is designed to handle situations where the entry being indexed is a compound value, and the query processed by the index needs to search for values ​​that occur in the compound entry. For example, this entry might be a document, and the query can search for the value contained in the document. Specify characters."

The most common data types included in this range are:

hStore
Array
Range
JSONB
One of the most satisfying things about GIN indexes is their ability to understand data stored in composite values. However, because a GIN index requires specific knowledge of the data structure for each individual type being added, not all data types are supported by the GIN index.

GiST index, for rows with overlapping values

Generalized Inverted Seach Tree (GiST) index is mostly suitable when your data overlaps with other rows of data in the same column. The best use of GiST indexes is if you declare a geometry data type and you want to know whether two polygons contain some points. In one case a particular point may be contained in a box, while at the same time, other points only exist in a polygon. Common data types indexed using GiST are:

Geometry type
Text type requiring full-text search
There are many fixed limits on the size of GiST indexes, otherwise, GiST indexes may become extremely large. At the cost of this, GiST indexes are lossy (inexact).

According to official documentation:

"GiST indexes are lossy, which means that the index may produce false matches, so it is necessary to check the real table rows to eliminate false matches. (PostgreSQL will automatically perform this action when necessary)"

This does not mean that you will get a false result, it just means that Postgres will do a small extra work to filter these false results before returning the data to you.

Special note: GIN and GiST indexes can often be used on the same data type. Usually one has good performance but takes up a lot of disk space, and vice versa. When it comes to GIN vs. GiST, there is no one-size-fits-all solution that will work in every situation, but the above rules should apply to most common situations.

SP-GiST index for larger data

The spatially partitioned GiST (SP-GiST) index adopts the spatially partitioned tree from Purdue Research. SP-GiST indexes are often used when your data has a natural clustering factor and is not a balanced tree. Phone numbers are a great example (at least US phone numbers are). They have the following format:

3-digit area code
3-digit prefix number (related to old telephone exchanges)
4-digit line number
This means that there is a natural clustering factor at the first three digits of the first set, followed by the second set of three digits, and then the numbers are evenly distributed. However, in some area codes of phone numbers, there is a higher saturation state than in others. The result can be a very unbalanced tree. Because there is a natural aggregation factor at the front and the data is not equally distributed, data like phone numbers might be a good case for SP-GiST.

BRIN index, for larger data

Block range indexes (BRIN) focus on some situations like SP-GiST, they are best used when the data has some natural ordering, and the data volume is often large. If you have a billion records in chronological order, BRIN may come in handy. If you are querying a large set of data that is naturally grouped, such as several zip codes, BRIN can help you ensure that similar zip codes are stored in close locations on disk.

When you have a very large database sorted by date or zip code, the BRIN index allows you to skip or exclude some unnecessary data very quickly. Additionally, BRIN indexes are relatively small compared to the overall data size, so when you have a large data set, BRIN indexes can perform better.

Hash index, finally not afraid of crash

Hash indexes have been present in Postgres for many years, however, until Postgres 10 was released, there was a huge caveat about their use, it was not WAL-logged. This means that if your server crashes and you can't failover to a standby or restore from an archive using something like wal-g, then you will lose that index until you rebuild it. With the release of Postgres 10, they are now WAL-logged, so you may consider using them again, but the real question is, should you?

Hash indexes sometimes provide faster lookups than B-Tree indexes, and are also fast to create. The biggest problem is that they are restricted to only "equality" comparison operations, so you can only use them for exact match lookups. This makes hash indexes much less flexible than commonly used B-Tree indexes, and you shouldn't think of them as a replacement, but as an index for special cases.

Which one should you use?

We have just introduced a lot, and it is normal if you are a little scared. If you know this before, CREATE INDEX will always create an index for you using a B-Tree, and the good news is that Postgres performs very well or very well for most databases. :) If you're considering using more Postgres features, here's a cheat sheet when you use other Postgres index types:

B-Tree - suitable for most data types and queries
GIN - for JSONB/hstore/arrays
GiST - suitable for full-text search and geometric data types
SP-GiST - suitable for large data sets that have natural aggregation factors but are unevenly distributed
BRIN - suitable for really large data sets with sequential order
Hash - good for equality operations, but usually a B-Tree index is still all you need.
If you have any questions or feedback about this article, feel free to join our slack channel.

The above is the detailed content of PostgreSQL's journey of exploration. For more information, please follow other related articles on the PHP Chinese website!

Statement
This article is reproduced at:Linux就该这么学. If there is any infringement, please contact admin@php.cn delete
What is the salary of Linux administrator?What is the salary of Linux administrator?Apr 17, 2025 am 12:24 AM

The average annual salary of Linux administrators is $75,000 to $95,000 in the United States and €40,000 to €60,000 in Europe. To increase salary, you can: 1. Continuously learn new technologies, such as cloud computing and container technology; 2. Accumulate project experience and establish Portfolio; 3. Establish a professional network and expand your network.

What is the main purpose of Linux?What is the main purpose of Linux?Apr 16, 2025 am 12:19 AM

The main uses of Linux include: 1. Server operating system, 2. Embedded system, 3. Desktop operating system, 4. Development and testing environment. Linux excels in these areas, providing stability, security and efficient development tools.

Does the internet run on Linux?Does the internet run on Linux?Apr 14, 2025 am 12:03 AM

The Internet does not rely on a single operating system, but Linux plays an important role in it. Linux is widely used in servers and network devices and is popular for its stability, security and scalability.

What are Linux operations?What are Linux operations?Apr 13, 2025 am 12:20 AM

The core of the Linux operating system is its command line interface, which can perform various operations through the command line. 1. File and directory operations use ls, cd, mkdir, rm and other commands to manage files and directories. 2. User and permission management ensures system security and resource allocation through useradd, passwd, chmod and other commands. 3. Process management uses ps, kill and other commands to monitor and control system processes. 4. Network operations include ping, ifconfig, ssh and other commands to configure and manage network connections. 5. System monitoring and maintenance use commands such as top, df, du to understand the system's operating status and resource usage.

Boost Productivity with Custom Command Shortcuts Using Linux AliasesBoost Productivity with Custom Command Shortcuts Using Linux AliasesApr 12, 2025 am 11:43 AM

Introduction Linux is a powerful operating system favored by developers, system administrators, and power users due to its flexibility and efficiency. However, frequently using long and complex commands can be tedious and er

What is Linux actually good for?What is Linux actually good for?Apr 12, 2025 am 12:20 AM

Linux is suitable for servers, development environments, and embedded systems. 1. As a server operating system, Linux is stable and efficient, and is often used to deploy high-concurrency applications. 2. As a development environment, Linux provides efficient command line tools and package management systems to improve development efficiency. 3. In embedded systems, Linux is lightweight and customizable, suitable for environments with limited resources.

Essential Tools and Frameworks for Mastering Ethical Hacking on LinuxEssential Tools and Frameworks for Mastering Ethical Hacking on LinuxApr 11, 2025 am 09:11 AM

Introduction: Securing the Digital Frontier with Linux-Based Ethical Hacking In our increasingly interconnected world, cybersecurity is paramount. Ethical hacking and penetration testing are vital for proactively identifying and mitigating vulnerabi

How to learn Linux basics?How to learn Linux basics?Apr 10, 2025 am 09:32 AM

The methods for basic Linux learning from scratch include: 1. Understand the file system and command line interface, 2. Master basic commands such as ls, cd, mkdir, 3. Learn file operations, such as creating and editing files, 4. Explore advanced usage such as pipelines and grep commands, 5. Master debugging skills and performance optimization, 6. Continuously improve skills through practice and exploration.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Chat Commands and How to Use Them
1 months agoBy尊渡假赌尊渡假赌尊渡假赌

Hot Tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

PhpStorm Mac version

PhpStorm Mac version

The latest (2018.2.1) professional PHP integrated development tool

Safe Exam Browser

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

SublimeText3 Linux new version

SublimeText3 Linux new version

SublimeText3 Linux latest version

MinGW - Minimalist GNU for Windows

MinGW - Minimalist GNU for Windows

This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.