When the amount of database data increases to a certain amount, performance becomes an issue we must pay attention to. How to optimize it? There are only a few commonly used methods:
1. Split tables, that is, divide a large expression data into several tables, so that each table does not have much data.
Advantages: Increase concurrency and reduce lock granularity
Disadvantages: The code maintenance cost is high, and the relevant SQL needs to be modified
2. Partition, all data is still in one table, but the physical storage data is stored in different files according to certain rules, and the files can also be placed on another disk
Advantages: The amount of code maintenance is small, basically no changes are needed, and the IO throughput is improved
Disadvantages: The degree of concurrency of the table has not increased
3. Split the business. The essence of this is to split the table.
Advantages: better long-term support
Disadvantages: code logic reconstruction, heavy workload
Of course, each situation has a suitable application scenario, which needs to be selected according to the specific business. Since the table splitting and splitting business has little to do with MySQL itself and belongs to the business level, we will only talk about the method most closely related to the database: table partitioning. However, a prerequisite for using table partitioning is that your database must support it. So, how do I know whether my database supports table partitioning? Please execute the following command<🎜>
The code is as follows:
show plugins; ---Execute in mysql console
It is said that versions below 5.4 have another command, but I have not tested it
The code is as follows:
show variables like '%part%';
1. Create a partition table
If you want to take advantage of table partitioning, not only must the database version support partitioning, but the key is to build a partitioned table. This table is different from an ordinary table, and the partition must be specified when creating the table, otherwise the ordinary table cannot be changed to Partition table. So, what if we create a partitioned table? The rest is very simple, please see the table creation statement below
CREATE TABLE `T_part` ( `f_id` INT DEFAULT NULL, `f_name` VARCHAR (20) DEFAULT NULL, PRIMARY KEY (`f_id`) ) ENGINE = myisam DEFAULT CHARSET = utf8 PARTITION BY RANGE (f_id)( -----指定分区方式 PARTITION p0 VALUES less THAN (10),-- 分了两个区 PARTITION p1 VALUES less THAN (20) )The above statement creates a "T_part" table with two fields f_id and f_name, and divides the table into two partitions p0 and p1 according to the RANGE method. When f_id is less than 10, it is put into the p0 partition. When f_id is greater than 0 and less than 20 Put it into partition p1. Then which partition should the data with f_id greater than 20 be put into? You guessed it, the insert statement will report an error.
See, creating a partition table is so simple! Of course, you can add and delete partitions at any time, but be aware that when deleting a partition, all data under the current partition will be deleted.
The code is as follows:
alter table T_part add partition(partition p2 values less than (MAXVALUE)); ---Add new partition
alter table T_part DROP partition p2; ----Delete partition
2. Several ways of table partitioning
MySQL supports 5 partitioning methods: RANGE partition, LIST partition, HASH partition, LINEAR HASH partition and KEY partition. Each partition has its own usage scenarios.
A RANGE-partitioned table is partitioned in such a way that each partition contains rows whose partition expression values lie within a given continuous range. These intervals must be continuous and cannot overlap each other, and are defined using the VALUES LESS THAN operator.
The above example is the RANGE partition.
2) LIST partition:
The LIST partition in MySQL is similar to the RANGE partition in many ways. As with partitioning by RANGE, each partition must be clearly defined. The main difference between them is that the definition and selection of each partition in the LIST partition is based on the value of a column belonging to a value in a value list set, while the RANGE partition belongs to a set of continuous interval values. LIST partitioning is achieved by using "PARTITION BY LIST(expr)", where "expr" is a column value or an expression based on a column value and returns an integer value, and then through "VALUES IN (value_list)" way to define each partition, where "value_list" is a comma-separated list of integers.
CREATE TABLE `T_list` ( `f_id` INT DEFAULT NULL, `f_name` VARCHAR (20) DEFAULT NULL, PRIMARY KEY (`f_id`) ) ENGINE = myisam DEFAULT CHARSET = utf8 PARTITION by list(f_id) ( PARTITION p0 VALUES in(1,2,3), ----区间值不能重复 PARTITION p1 VALUES in(4,5,6) );3)HASH partition:
HASH分区主要用来确保数据在预先确定数目的分区中平均分布。在RANGE和LIST分区中,必须明确指定一个给定的列值或列值集合应该保存在哪个分区中;而在HASH分区中,MySQL 自动完成这些工作,你所要做的只是基于将要被哈希的列值指定一个列值或表达式,以及指定被分区的表将要被分割成的分区数量。要使用HASH分区来分割一个表,要在CREATE TABLE 语句上添加一个“PARTITION BY HASH (expr)”子句,其中“expr”是一个返回一个整数的表达式。它可以仅仅是字段类型为MySQL 整型的一列的名字。此外,你很可能需要在后面再添加一个“PARTITIONS num”子句,其中num 是一个非负的整数,它表示表将要被分割成分区的数量。
CREATE TABLE `T_hash` ( `f_id` INT DEFAULT NULL, `f_name` VARCHAR (20) DEFAULT NULL, PRIMARY KEY (`f_id`) ) ENGINE = myisam DEFAULT CHARSET = utf8 PARTITION BY HASH(f_id) ---可以指定多列 PARTITIONS 4;---分区个数
“expr”还可以是MySQL 中有效的任何函数或其他表达式,只要它们返回一个既非常数、也非随机数的整数。(换句话说,它既是变化的但又是确定的)。但是应当记住,每当插入或更新(或者可能删除)一行,这个表达式都要计算一次;这意味着非常复杂的表达式可能会引起性能问题,尤其是在执行同时影响大量行的运算(例如批量插入)的时候。最有效率的哈希函数是只对单个表列进行计算,并且它的值随列值进行一致地增大或减小,因为这考虑了在分区范围上的“修剪”。也就是说,表达式值和它所基于的列的值变化越接近,MySQL就可以越有效地使用该表达式来进行HASH分区。
4)LINEAR HASH分区:
MySQL还支持线性哈希功能,它与常规哈希的区别在于,线性哈希功能使用的一个线性的2的幂(powers-oftwo)运算法则,而常规 哈希使用的是求哈希函数值的模数。线性哈希分区和常规哈希分区在语法上的唯一区别在于,在“PARTITION BY” 子句中添加“LINEAR”关键字.
5)KEY分区:
按照KEY进行分区类似于按照HASH分区,除了HASH分区使用的用户定义的表达式,而KEY分区的 哈希函数是由MySQL 服务器提供。MySQL 簇(Cluster)使用函数MD5()来实现KEY分区;对于使用其他存储引擎的表,服务器使用其自己内部的 哈希函数,这些函数是基于与PASSWORD()一样的运算法则。
KEY分区的语法和HASH语法类似,只是把关键字改成KEY。
CREATE TABLE `T_key` ( `f_id` INT DEFAULT NULL, `f_name` VARCHAR (20) DEFAULT NULL, PRIMARY KEY (`f_id`) ) ENGINE = myisam DEFAULT CHARSET = utf8 PARTITION BY LINEAR key(f_id) PARTITIONS 3;
6)子分区:
子分区的意思就是在分区的基础上再次分区。且每个分区必须有相同个数的子分区。
CREATE TABLE `T_part` ( `f_id` INT DEFAULT NULL, `f_name` VARCHAR (20) DEFAULT NULL, PRIMARY KEY (`f_id`) ) PARTITION BY RANGE (f_id) SUBPARTITION BY HASH(F_ID) SUBPARTITIONS 2 ( PARTITION p0 VALUES less THAN (10), PARTITION p1 VALUES less THAN (20) )
上面语句的意思是,建立两个range分区,每个分区根据hash有分别有两个子分区,实际上整个表分成2×2=4个分区。当然,要详细定义每个分区属性也是可以的
CREATE TABLE `T_part` ( `f_id` INT DEFAULT NULL, `f_name` VARCHAR (20) DEFAULT NULL, PRIMARY KEY (`f_id`) ) PARTITION BY RANGE (f_id) SUBPARTITION BY HASH(F_ID) ( PARTITION p0 VALUES less THAN (10) ( SUBPARTITION s0 DATA DIRECTORY = '/disk0/data' INDEX DIRECTORY = '/disk0/idx', SUBPARTITION s1 DATA DIRECTORY = '/disk1/data' INDEX DIRECTORY = '/disk1/idx' ), PARTITION p1 VALUES less THAN (20) ( SUBPARTITION s2 DATA DIRECTORY = '/disk0/data' INDEX DIRECTORY = '/disk0/idx', SUBPARTITION s3 DATA DIRECTORY = '/disk1/data' INDEX DIRECTORY = '/disk1/idx' ) )
这样可以对每个分区指定具体存储磁盘。前提磁盘是存在的。
MySQL 中的分区在禁止空值(NULL)上没有进行处理,无论它是一个列值还是一个用户定义表达式的值。一般而言,在这种情况下MySQL 把NULL视为0。如果你希望回避这种做法,你应该在设计表时不允许空值;最可能的方法是,通过声明列“NOT NULL”来实现这一点。