The role of database index-SQL-php.cn


hzc

Jul 03, 2020 pm 05:19 PM


The biggest role of a database index is to speed up queries: it fundamentally reduces the number of rows that need to be scanned. An index is a data structure maintained by the database that stores all the values of a column in a table; in other words, an index is created on a column of a table.

A database index is a structure attached to table fields in order to increase query speed. I have seen many people understand the concept of an index mechanically and assume that adding indexes has only benefits and no costs. Here I would like to summarize my earlier study notes on indexes:

First, understand why an index increases speed. When the DB executes a SQL statement without an index, the default method is a full table scan: every row is checked against the search conditions, and each matching row is added to the result set. If we add an index to a field, a query first locates the rows with the wanted value in the index, which greatly reduces the number of rows that have to be examined, so the query speed can increase significantly.

So should an index always be added? Here are a few counter-examples:

1. If you need to fetch all records of the table every time, a full table scan is required anyway, so an index does not help.
2. For non-unique fields with a large number of repeated values, such as "gender", an index filters out too few rows to be worth it.
3. For tables with relatively few records, an index brings no noticeable speedup and only wastes storage space.

Indexes require storage space, and they have a serious disadvantage: on every update/insert/delete, all indexes on the affected fields must be updated as well.
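The difference between a full table scan and an index lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table contents, the column name `company_id`, and the sorted list used as a stand-in for an index are all hypothetical, and a real database index is more elaborate.

```python
import bisect

# Hypothetical table: each row is (row_id, company_id). Values are illustrative only.
rows = [(row_id, (row_id * 7) % 1000) for row_id in range(10_000)]


def full_scan(table, wanted):
    """Check every row against the condition, as a scan without an index does."""
    examined = 0
    matches = []
    for row_id, company_id in table:
        examined += 1
        if company_id == wanted:
            matches.append(row_id)
    return matches, examined


def build_index(table):
    """Sorted list of (value, row_id) pairs: a simplified stand-in for an index."""
    return sorted((company_id, row_id) for row_id, company_id in table)


def index_lookup(index, wanted):
    """Binary-search to the first entry for `wanted`, then read only matching entries."""
    pos = bisect.bisect_left(index, (wanted, -1))
    examined = 0
    matches = []
    while pos < len(index) and index[pos][0] == wanted:
        matches.append(index[pos][1])
        examined += 1
        pos += 1
    return matches, examined


index = build_index(rows)
scan_matches, scan_examined = full_scan(rows, 42)
index_matches, index_examined = index_lookup(index, 42)
print("rows examined by full scan:", scan_examined)
print("entries examined via index:", index_examined)
```

Both approaches return the same rows; the index version only touches the entries that match, at the price of building and storing the sorted structure and keeping it up to date on every write.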

So when is it appropriate to add an index? Let's look at an example given in the MySQL manual. Here is a SQL statement:

    SELECT c.companyID, c.companyName
    FROM Companies c, User u
    WHERE c.companyID = u.fk_companyID
      AND c.numEmployees >= 0
      AND c.companyName LIKE '%i%'
      AND u.groupID IN (SELECT g.groupID FROM Groups g WHERE g.groupLabel = 'Executive')

This statement involves three tables (a join of Companies and User plus a subquery on Groups) and includes several kinds of search conditions, such as a size comparison and LIKE matching. Without indexes, MySQL needs to scan 77721876 rows. After we add indexes on the companyID and groupLabel fields, the number of scanned rows is only 134. In MySQL, you can view the estimated number of scanned rows with EXPLAIN SELECT. It can be seen that for such multi-table queries with complex search conditions, the performance improvement brought by indexes far outweighs the disk space they occupy.
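The effect of an index on the query plan can be observed in a small, self-contained experiment. The sketch below uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`, not MySQL's `EXPLAIN SELECT`, so the output format differs from what MySQL prints. The table `user_groups`, its columns, and the index name `idx_group_label` are hypothetical and only loosely modeled on the Groups table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table, loosely modeled on the Groups table in the example query.
conn.execute("CREATE TABLE user_groups (group_id INTEGER PRIMARY KEY, group_label TEXT)")
conn.executemany(
    "INSERT INTO user_groups (group_id, group_label) VALUES (?, ?)",
    [(i, "Executive" if i % 100 == 0 else f"Group{i}") for i in range(1000)],
)

query = "SELECT group_id FROM user_groups WHERE group_label = 'Executive'"


def plan(connection, sql):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    return " | ".join(row[-1] for row in connection.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(conn, query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_group_label ON user_groups (group_label)")
plan_after = plan(conn, query)   # the planner can now search the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan depends on the SQLite version, but the first plan reports a scan of the table and the second one names the index.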

So how is an index implemented? Most DB vendors implement indexes on top of one data structure, the B-tree, because a B-tree is well suited to organizing dynamic lookup tables on direct-access storage devices such as disks. The definition is as follows: a B-tree of order m (m >= 3) is an m-ary tree that satisfies the following conditions:

1. Each node has the form (j, p0, k1, p1, k2, p2, ..., kj, pj), where j is the number of keywords, k1 to kj are the keywords, and p0 to pj are the child pointers

2. All leaf nodes are on the same layer, and the number of layers is equal to the height h of the tree

3. The number of keywords j contained in each non-root node satisfies ceil(m/2) - 1 <= j <= m - 1

4. If the tree is not empty, the root has at least 1 keyword. If the root is not a leaf, it has at least 2 subtrees and at most m subtrees
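The conditions above can be written down as a small checker. This is a minimal sketch, not a full B-tree implementation: the class `BTreeNode` and the function `is_valid_btree` are hypothetical names, and the bound for non-root nodes is assumed to be the usual ceil(m/2) - 1 <= j <= m - 1. The checker looks only at the structural conditions and at the key order inside a node; it does not verify that keys are correctly ordered between a node and its subtrees.

```python
from dataclasses import dataclass, field
from math import ceil
from typing import List


@dataclass
class BTreeNode:
    keys: List[str] = field(default_factory=list)              # k1..kj, kept sorted
    children: List["BTreeNode"] = field(default_factory=list)  # p0..pj, empty for a leaf

    @property
    def is_leaf(self) -> bool:
        return not self.children


def leaf_depths(node: BTreeNode, depth: int = 1) -> set:
    """Collect the set of layers on which leaves occur."""
    if node.is_leaf:
        return {depth}
    depths = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


def is_valid_btree(root: BTreeNode, m: int) -> bool:
    """Check the structural conditions for a non-empty B-tree of order m."""
    min_keys = ceil(m / 2) - 1  # assumed lower bound for non-root nodes

    def check(node: BTreeNode, is_root: bool) -> bool:
        j = len(node.keys)
        if node.keys != sorted(node.keys):
            return False
        lower = 1 if is_root else min_keys
        if not lower <= j <= m - 1:
            return False
        # An internal node with j keywords has exactly j + 1 child pointers.
        if not node.is_leaf and len(node.children) != j + 1:
            return False
        return all(check(child, False) for child in node.children)

    # All leaves must sit on the same layer.
    return check(root, True) and len(leaf_depths(root)) == 1


good = BTreeNode(["M"], [BTreeNode(["D", "H"]), BTreeNode(["R", "W"])])
uneven = BTreeNode(["M"], [BTreeNode(["D"]),
                           BTreeNode(["R"], [BTreeNode(["P"]), BTreeNode(["T"])])])
print(is_valid_btree(good, 3), is_valid_btree(uneven, 3))
```

The second tree is rejected because its leaves are on different layers, which violates condition 2.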

Look at an example of a B-tree: the 26 English letters can be organized into such a structure. (The original figure is not reproduced here.)
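One possible B-tree over the 26 letters, together with a search routine, can be sketched as follows. The layout is an assumption chosen for illustration (order 7, two layers) and is not necessarily the tree from the original figure; nodes are plain `(keys, children)` tuples and the function name `search` is hypothetical.

```python
from bisect import bisect_left

# Each node is (keys, children); a leaf has an empty children list.
# Hypothetical order-7 layout: 4 keywords in the root, 5 leaves below it.
leaves = [(list(s), []) for s in ("ABCDE", "GHIJK", "MNOP", "RSTU", "WXYZ")]
root = (list("FLQV"), leaves)


def search(node, key):
    """Return (found, nodes_visited) for a lookup that starts at `node`."""
    visited = 0
    while node is not None:
        visited += 1
        keys, children = node
        i = bisect_left(keys, key)          # position of key among this node's keywords
        if i < len(keys) and keys[i] == key:
            return True, visited
        # Descend into the subtree between keys[i-1] and keys[i], if there is one.
        node = children[i] if children else None
    return False, visited


for letter in ("Q", "S", "a"):
    print(letter, search(root, letter))
```

Every letter is found after visiting at most two nodes, one per layer, and a key that is absent is rejected after the same number of steps.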

It can be seen that searching for a letter in this B-tree only requires visiting one node per layer, so the cost grows with the height h of the tree (on the order of log n for n keywords) rather than with the number of keywords. When the amount of data is relatively large, such a structure can greatly increase the query speed.

However, there is another data structure that can perform lookups even faster than a B-tree: the hash table. It is defined as follows: let U be the set of all possible keywords and K the set of keywords actually stored, with |K| much smaller than |U|. Hashing maps U onto the subscripts of a table T[0..m-1] through a hash function h: a keyword in U is the argument, and the result h(k) is the storage address of the corresponding node. A search can thus be completed in O(1) time on average.

However, the hash table has a flaw: hash conflicts (collisions), where two keywords produce the same result through the hash function. Let m and n be the length of the hash table and the number of filled nodes respectively; n/m is the filling factor (load factor) of the hash table. The larger the factor, the greater the chance of a hash conflict.

Because of this flaw, databases do not use hash tables as the default implementation of indexes. MySQL states that, to further improve search speed, it will try to convert a disk-based B-tree index into a suitable hash index according to the pattern of the queries being executed. I think other database vendors have similar strategies. After all, in the database battlefield, search speed and management security are very important competitive points.
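The hash table described above, including collisions and the filling factor n/m, can be illustrated with a small sketch. Two things here are assumptions rather than anything the article specifies: collisions are resolved by chaining (each slot holds a list), and the hash function simply sums character codes so that collisions are easy to provoke. The class name `ChainedHashTable` is hypothetical.

```python
class ChainedHashTable:
    """Table T[0..m-1]; each slot holds a list of (key, value) pairs (chaining)."""

    def __init__(self, m):
        self.m = m                      # length of the table
        self.n = 0                      # number of stored keywords
        self.slots = [[] for _ in range(m)]

    def _h(self, key):
        # Deliberately weak hash function: anagrams always collide.
        return sum(ord(ch) for ch in key) % self.m

    def insert(self, key, value):
        bucket = self.slots[self._h(key)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:         # overwrite an existing keyword
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.n += 1

    def get(self, key):
        for existing, value in self.slots[self._h(key)]:
            if existing == key:
                return value
        return None

    def load_factor(self):
        return self.n / self.m

    def collisions(self):
        # Every entry beyond the first in a slot is the result of a collision.
        return sum(len(bucket) - 1 for bucket in self.slots if len(bucket) > 1)


table = ChainedHashTable(m=8)
for key, value in (("ab", 1), ("ba", 2), ("cd", 3)):
    table.insert(key, value)

print(table.get("ba"), table.load_factor(), table.collisions())
```

With chaining, a lookup still works after a collision, but it has to walk the list in the slot, so the O(1) cost only holds while the filling factor stays small.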
