Can database table partitioning improve insertion efficiency?

青灯夜游
2020-07-22 11:52:12

Database table partitioning can improve insertion efficiency and, more generally, the efficiency of inserts, deletes, updates, and queries on a table. How an insert works on a partitioned table: when a record is written, the partition layer opens and locks all underlying tables, determines which partition should receive the record, and then writes the record to the corresponding underlying table.

What is a partition?

Partitioning decomposes a table according to rules and stores its data in multiple locations, which can be on the same disk or on different machines. After partitioning there is still a single logical table, but its data is spread across multiple locations. The application still reads and writes using the original table name, and the database automatically routes and organizes the partitioned data.

Partitions can be divided into two types:

1. Horizontal Partitioning

This form of partitioning splits the table by rows. The resulting row sets are physically separated into different groups and can be addressed individually (a single partition) or collectively (one or more partitions). Every column defined in the table is present in every row set, so the characteristics of the table are preserved.

A simple example: a table containing ten years of invoice records can be split into ten partitions, each holding the records for one year. (Note: the specific partitioning methods are discussed later. For now, the key point is that the table must be divided by some attribute column; here that column is the year.)
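The invoice example above could be sketched in MySQL roughly as follows. The table name `invoices`, its columns, and the year range 2011 to 2020 are assumptions made for illustration; they do not come from the original article.

```sql
-- Hypothetical table: names, columns, and year range are illustrative only.
CREATE TABLE invoices (
    invoice_id   BIGINT        NOT NULL,
    invoice_date DATE          NOT NULL,
    amount       DECIMAL(12,2) NOT NULL,
    -- Every unique key must include the column used by the partitioning expression.
    PRIMARY KEY (invoice_id, invoice_date)
)
PARTITION BY RANGE (YEAR(invoice_date)) (
    PARTITION p2011 VALUES LESS THAN (2012),
    PARTITION p2012 VALUES LESS THAN (2013),
    PARTITION p2013 VALUES LESS THAN (2014),
    PARTITION p2014 VALUES LESS THAN (2015),
    PARTITION p2015 VALUES LESS THAN (2016),
    PARTITION p2016 VALUES LESS THAN (2017),
    PARTITION p2017 VALUES LESS THAN (2018),
    PARTITION p2018 VALUES LESS THAN (2019),
    PARTITION p2019 VALUES LESS THAN (2020),
    PARTITION p2020 VALUES LESS THAN (2021)
);

-- The application still writes to the single table name;
-- MySQL routes this row to partition p2015.
INSERT INTO invoices (invoice_id, invoice_date, amount)
VALUES (1, '2015-03-14', 99.50);

-- Explicit partition selection (MySQL 5.6 and later) reads one partition only.
SELECT * FROM invoices PARTITION (p2015);
```

Partitioning on `YEAR(invoice_date)` keeps one partition per year; an integer `invoice_year` column would work equally well.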

2. Vertical Partitioning

This partitioning method reduces the width of the target table by dividing it vertically, so that certain columns are placed in specific partitions and each partition contains the rows for those columns.

A simple example: a table contains large text and BLOB columns that are not accessed frequently. Moving these infrequently used text and BLOB columns into a separate partition can improve access speed while keeping the data related to the rows it belongs to.
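MySQL's built-in partitioning is horizontal only, so a vertical split is usually done by hand with two tables that share a key. The following is a minimal sketch under that assumption; `article`, `article_body`, and all column names are hypothetical.

```sql
-- Narrow, frequently accessed columns stay in the main table.
CREATE TABLE article (
    article_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title      VARCHAR(200) NOT NULL,
    created_at DATETIME     NOT NULL,
    PRIMARY KEY (article_id)
);

-- Wide, rarely accessed text and BLOB columns move to a side table
-- that shares the same primary key (a 1:1 relationship).
CREATE TABLE article_body (
    article_id INT UNSIGNED NOT NULL,
    body       LONGTEXT,
    attachment LONGBLOB,
    PRIMARY KEY (article_id),
    CONSTRAINT fk_article_body
        FOREIGN KEY (article_id) REFERENCES article (article_id)
        ON DELETE CASCADE
);

-- Common listing query: touches only the narrow table.
SELECT article_id, title
FROM article
ORDER BY created_at DESC
LIMIT 20;

-- Occasional detail query: joins the wide columns back in by key.
SELECT a.title, b.body
FROM article AS a
JOIN article_body AS b ON b.article_id = a.article_id
WHERE a.article_id = 1;
```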

How partitioned tables work

A partitioned table is implemented as multiple related underlying tables. These underlying tables are also represented by handle objects, so each partition can also be accessed directly. The storage engine manages the underlying tables of a partitioned table the same way it manages ordinary tables (all underlying tables must use the same storage engine). An index on a partitioned table is simply an identical index added to each underlying table. From the storage engine's perspective, an underlying table is no different from an ordinary table, and the engine does not need to know whether a table is an ordinary table or part of a partitioned table.

Operations on a partitioned table follow this logic:

select query:

When a partitioned table is queried, the partition layer first opens and locks all underlying tables. The optimizer then determines whether some partitions can be filtered out, and the corresponding storage engine interface is called to access the data of each remaining partition.

insert operation:

When a record is written, the partition layer first opens and locks all underlying tables, then determines which partition should receive the record, and then writes the record to the corresponding underlying table.

delete operation:

When a record is deleted, the partition layer first opens and locks all underlying tables, then determines which partition holds the record, and finally deletes the record from the corresponding underlying table.

update operation:

When a record is updated, the partition layer first opens and locks all underlying tables. MySQL determines which partition holds the record to be updated, reads the record and updates it, then determines which partition the updated record belongs in, writes it to that underlying table, and deletes the original record from the underlying table where it was previously stored.

Although every operation opens and locks all underlying tables, this does not mean that the partitioned table locks the entire table during processing. If the storage engine implements its own row-level locks, as InnoDB does, the corresponding table locks are released at the partition layer. This locking and unlocking process is similar to that of queries on an ordinary InnoDB table.
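The four operations described above can be observed on a small table. This is only a sketch: `orders_demo` and its columns are hypothetical, and the explicit `PARTITION (...)` selection assumes MySQL 5.6 or later.

```sql
-- Hypothetical three-partition table used only to illustrate routing.
CREATE TABLE orders_demo (
    order_id   INT NOT NULL,
    order_year INT NOT NULL,
    note       VARCHAR(50),
    PRIMARY KEY (order_id, order_year)
)
PARTITION BY RANGE (order_year) (
    PARTITION p2018 VALUES LESS THAN (2019),
    PARTITION p2019 VALUES LESS THAN (2020),
    PARTITION p2020 VALUES LESS THAN (2021)
);

-- insert: each row is routed to the partition matching its order_year.
INSERT INTO orders_demo VALUES (1, 2018, 'a'), (2, 2019, 'b');

-- select: the optimizer can filter partitions when the WHERE clause
-- constrains the partitioning column. In MySQL 5.7 and later the plan
-- has a "partitions" column; older versions use EXPLAIN PARTITIONS.
EXPLAIN SELECT * FROM orders_demo WHERE order_year = 2019;

-- update: changing the partitioning column moves the row, i.e. it is
-- written to p2020 and removed from p2018.
UPDATE orders_demo
SET order_year = 2020
WHERE order_id = 1 AND order_year = 2018;

-- delete: the row is located in p2019 and removed from that partition only.
DELETE FROM orders_demo WHERE order_id = 2 AND order_year = 2019;

-- The moved row can now be read directly from p2020.
SELECT * FROM orders_demo PARTITION (p2020);
```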

In the following scenario, partitioning can play a very important role:

A: The table is too large to fit entirely in memory, or only the last part of the table holds hot data while the rest is historical data

B: Partitioned table data is easier to maintain. For example, to delete a large amount of data in bulk, you can clear an entire partition. You can also run optimize, check, repair, and similar operations on an individual partition

C: The data of the partition table can be distributed on different physical devices, thereby efficiently utilizing multiple hardware devices

D: You can use partition tables to avoid some special bottlenecks, such as: mutually exclusive access of a single index in innodb, inode lock competition in ext3 file system, etc.

E: If necessary, you can also back up and restore individual partitions, which works very well with very large data sets

F: Query optimization. When the WHERE clause includes the partitioning column, only the necessary partitions are scanned, which improves query efficiency. In addition, queries with aggregate functions such as sum() and count() can in principle be processed on each partition in parallel, so that only the per-partition results need to be combined at the end.
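Scenarios B and F can be sketched as follows, assuming a hypothetical `access_log_demo` table partitioned by year. The statements are standard MySQL partition maintenance commands; the table and partition names are illustrative only.

```sql
-- Hypothetical log table partitioned by year.
CREATE TABLE access_log_demo (
    log_id    BIGINT NOT NULL,
    logged_on DATE   NOT NULL,
    bytes     INT    NOT NULL,
    PRIMARY KEY (log_id, logged_on)
)
PARTITION BY RANGE (YEAR(logged_on)) (
    PARTITION p2019 VALUES LESS THAN (2020),
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- F: the WHERE clause constrains the partitioning column, so the plan
-- should list only the partitions that can contain matching rows.
EXPLAIN
SELECT SUM(bytes)
FROM access_log_demo
WHERE logged_on >= '2020-01-01' AND logged_on < '2021-01-01';

-- B: bulk-delete historical data by emptying one partition
-- instead of running a large DELETE.
ALTER TABLE access_log_demo TRUNCATE PARTITION p2019;

-- B: maintenance commands can target an individual partition.
ALTER TABLE access_log_demo CHECK PARTITION p2020;
ALTER TABLE access_log_demo ANALYZE PARTITION p2020;
```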

MySQL partitioning always treats NULL as smaller than any non-NULL value. This matches how ORDER BY handles NULL: when sorting in ascending order, NULL always comes first. How NULL is handled then depends on the partition type.

For range partitioning, if NULL is inserted into the partitioning column, MySQL places the row in the leftmost (lowest) partition. Note that dropping a partition deletes everything in it from disk, so if the partition holding the NULL rows is dropped, those rows are deleted as well.

To use NULL with list partitioning, NULL must be explicitly included in the value list of one of the partitions; otherwise inserting NULL raises an error. Hash and key partitions handle NULL differently from range and list partitions: any partitioning expression that yields NULL is treated as if it returned 0.
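The three NULL behaviors can be checked with small throwaway tables. This is a sketch: the table names are hypothetical, and the explicit `PARTITION (...)` selection assumes MySQL 5.6 or later.

```sql
-- RANGE: a NULL partitioning value goes to the lowest partition.
CREATE TABLE null_range_demo (
    id INT NOT NULL,
    c  INT NULL
)
PARTITION BY RANGE (c) (
    PARTITION p0 VALUES LESS THAN (0),
    PARTITION p1 VALUES LESS THAN (10),
    PARTITION p2 VALUES LESS THAN MAXVALUE
);
INSERT INTO null_range_demo VALUES (1, NULL);
SELECT * FROM null_range_demo PARTITION (p0);  -- the NULL row is stored here

-- LIST: NULL is accepted only because p1 lists it explicitly.
-- Without NULL in some value list, the INSERT below would be rejected.
CREATE TABLE null_list_demo (
    id INT NOT NULL,
    c  INT NULL
)
PARTITION BY LIST (c) (
    PARTITION p0 VALUES IN (0, 1, 2),
    PARTITION p1 VALUES IN (3, 4, NULL)
);
INSERT INTO null_list_demo VALUES (1, NULL);
SELECT * FROM null_list_demo PARTITION (p1);

-- HASH: NULL is treated as 0, and 0 modulo 2 partitions selects p0
-- (p0 and p1 are the default names MySQL assigns).
CREATE TABLE null_hash_demo (
    id INT NOT NULL,
    c  INT NULL
)
PARTITION BY HASH (c) PARTITIONS 2;
INSERT INTO null_hash_demo VALUES (1, NULL);
SELECT * FROM null_hash_demo PARTITION (p0);
```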

Partition

Partitioning divides a database or its constituent elements into separate, independent parts

-- It is a method of organizing table storage in advance

MySQL supports horizontal partitioning

Specific table rows are distributed as subsets of rows

Partitions are distributed across physical storage

-- According to rules specified by the user as needed

-- Each partition is stored as its own unit

Division of data

-- The data is divided into subsets

-- The partition type and expression are part of the table definition

-- The expression can be an integer column or a function that returns an integer value

-- This value determines which partition each record is stored in, according to the definition

1. Every column used in the partition key must be part of the primary key and of every unique key; otherwise "ERROR 1503 (HY000)" is reported when the primary key or unique index is created

2. When adding a partition to a range-partitioned table, the new partition can only be appended after the current maximum value

3. The engines of all partitions must be the same

4. A range partition key can be: an integer column, a numeric expression, a date column (with RANGE COLUMNS), or a date function expression (such as year(), to_days(), to_seconds(), unix_timestamp())
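Rule 1 is the one most often hit in practice. The sketch below uses hypothetical tables `bad_demo` and `good_demo`; the failing statement is left inside a comment so that the script as a whole still runs.

```sql
/*
-- Rejected with ERROR 1503 (HY000): the primary key (id) does not
-- include created_on, the column used by the partitioning expression.
CREATE TABLE bad_demo (
    id         INT  NOT NULL,
    created_on DATE NOT NULL,
    PRIMARY KEY (id)
)
PARTITION BY RANGE (YEAR(created_on)) (
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION p2021 VALUES LESS THAN (2022)
);
*/

-- Accepted: the partitioning column is part of the primary key.
CREATE TABLE good_demo (
    id         INT  NOT NULL,
    created_on DATE NOT NULL,
    PRIMARY KEY (id, created_on)
)
PARTITION BY RANGE (YEAR(created_on)) (
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION p2021 VALUES LESS THAN (2022)
);

-- Rule 2: a new range partition must lie above the current highest bound.
ALTER TABLE good_demo
    ADD PARTITION (PARTITION p2022 VALUES LESS THAN (2023));
```

The trade-off is that `id` alone is no longer guaranteed unique by the key, so uniqueness of `id` has to be ensured some other way if the application relies on it.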

Partition Management

Add Partition

ALTER TABLE sale_data
ADD PARTITION (PARTITION p201010 VALUES LESS THAN (201011));

Delete Partition

When a partition is dropped, all data in that partition is deleted as well.

ALTER TABLE sale_data DROP PARTITION p201010;

Merging Partitions

The following SQL merges the nine partitions p201001 through p201009 into three partitions, p2010Q1 through p2010Q3:

ALTER TABLE sale_data
    REORGANIZE PARTITION
        p201001, p201002, p201003,
        p201004, p201005, p201006,
        p201007, p201008, p201009
    INTO (
        PARTITION p2010Q1 VALUES LESS THAN (201004),
        PARTITION p2010Q2 VALUES LESS THAN (201007),
        PARTITION p2010Q3 VALUES LESS THAN (201010)
    );
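The article does not show how sale_data is defined. The statements above would work against a table shaped roughly like the one below; the columns and the YYYYMM integer encoding are assumptions inferred from the partition names and bounds, not taken from the source.

```sql
-- Hypothetical definition consistent with the partition names used above.
CREATE TABLE sale_data (
    sale_id    BIGINT        NOT NULL,
    sale_month INT           NOT NULL,  -- assumed YYYYMM encoding, e.g. 201003
    amount     DECIMAL(12,2) NOT NULL,
    PRIMARY KEY (sale_id, sale_month)
)
PARTITION BY RANGE (sale_month) (
    PARTITION p201001 VALUES LESS THAN (201002),
    PARTITION p201002 VALUES LESS THAN (201003),
    PARTITION p201003 VALUES LESS THAN (201004),
    PARTITION p201004 VALUES LESS THAN (201005),
    PARTITION p201005 VALUES LESS THAN (201006),
    PARTITION p201006 VALUES LESS THAN (201007),
    PARTITION p201007 VALUES LESS THAN (201008),
    PARTITION p201008 VALUES LESS THAN (201009),
    PARTITION p201009 VALUES LESS THAN (201010)
);

-- Inspect the current partition layout before and after each ALTER TABLE.
-- TABLE_ROWS is only an estimate for InnoDB tables.
SELECT PARTITION_NAME, PARTITION_DESCRIPTION, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'sale_data'
ORDER BY PARTITION_ORDINAL_POSITION;
```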
