When the amount of data in a database reaches a certain scale, it becomes a system performance bottleneck. To avoid this, the data needs to be split up by partitioning, sharding, or dividing it across separate databases and tables.
Sharding is an effective way to scale a database out to multiple physical nodes. Its main purpose is to break through the I/O capacity limits of a single-node database server and solve database scalability problems. The word "shard" means "fragment". If a database is thought of as a large pane of glass and the glass is broken, each small piece is a fragment of the database (a database shard). The process of breaking the whole database into pieces is called sharding.
Formally, sharding can be defined as a partitioning scheme that distributes a large database across multiple physical nodes. Each partition contains part of the database and is called a shard. The partitioning method can be arbitrary and is not limited to traditional horizontal or vertical partitioning. A shard can contain the contents of multiple tables or even multiple database instances. Each shard is placed on one database server, and a database server can hold one or more shards. The system also needs a server for query routing, which forwards each query to the shard, or the set of shards, containing the data the query accesses.
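The routing role described above can be sketched as a simple directory lookup. This is only an illustration: the shard names, server names, and the table-to-shard mapping are hypothetical, and a real router would parse the query rather than receive a list of table names.

```python
from typing import Dict, List

# Hypothetical shard map: which server holds which shard.
# One server may hold several shards, as the text describes.
SHARD_TO_SERVER: Dict[str, str] = {
    "shard_a": "db-server-1",
    "shard_b": "db-server-1",
    "shard_c": "db-server-2",
}

# Hypothetical directory: which shard holds which table.
TABLE_TO_SHARD: Dict[str, str] = {
    "users": "shard_a",
    "orders": "shard_b",
    "products": "shard_c",
}


def route(tables: List[str]) -> List[str]:
    """Return the sorted servers a query touching `tables` must be forwarded to."""
    servers = {SHARD_TO_SERVER[TABLE_TO_SHARD[table]] for table in tables}
    return sorted(servers)
```

A query that touches only `users` and `orders` goes to a single server here, while one that also touches `products` has to be forwarded to two servers and its results combined.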
MySQL's scaling options are Scale Out and Scale Up.
Scale Out (horizontal scaling) means that an application can expand horizontally. For data center applications, it generally means that when more machines are added, the application can still make good use of their resources to improve its efficiency, and so scales well.
Scale Up (vertical scaling) means that an application can expand vertically. It generally applies to a single machine: when a computing node gains more CPU cores, more storage devices, and more memory, the application can make full use of these resources to improve its efficiency, and so scales well.
MySQL sharding strategies include vertical splitting and horizontal splitting.
Vertical split: split by functional module to reduce I/O contention between tables. For example, the data is divided into an order database, a product database, a user database, and so on. The table structures in these databases differ from one another.
Horizontal split: store the rows of one table in blocks across different databases to relieve the pressure of a growing data volume in a single table. The table structures in these databases are exactly the same.
Vertical splitting in table structure design. Some common scenarios include:
Vertical splitting of large fields. Put large fields in a separate table to improve access performance on the base table. In principle, performance-critical applications should avoid large fields in the database.
Vertical splitting by usage. For example, enterprise material attributes can be split into basic attributes, sales attributes, purchasing attributes, manufacturing attributes, financial accounting attributes, and so on.
Vertical splitting by access frequency. For example, in e-commerce and Web 2.0 systems, if users have many attribute settings, the basic, frequently used attributes can be separated from the infrequently used ones.
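As a rough illustration of splitting by access frequency, the sketch below divides one logical user row into a base row and an extension row that share the same primary key. The column names and the choice of which columns count as frequently used are assumptions made up for the example.

```python
from typing import Any, Dict, Tuple

# Hypothetical set of frequently accessed columns for a user table.
HOT_COLUMNS = {"name", "email"}


def split_row(row: Dict[str, Any], key: str = "id") -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split one row into (base, extension); both keep the primary key."""
    base = {key: row[key]}
    extension = {key: row[key]}
    for column, value in row.items():
        if column == key:
            continue
        # Frequently used columns stay in the base table, the rest move out.
        target = base if column in HOT_COLUMNS else extension
        target[column] = value
    return base, extension
```

Because both halves carry the same key, the full row can still be rebuilt with a join when the infrequently used attributes are needed.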
Horizontal splitting in table structure design. Some common scenarios include:
On an online e-commerce website, the order table holds too much data, so it is split by year and month.
On a Web 2.0 website with too many registered and active online users, the user table and the tables closely related to users are split horizontally by user ID range.
For pinned posts on a forum: because of paging, every page has to display the pinned posts. The pinned posts can be split off horizontally so that fetching them does not require reading from the table of all posts.
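The first two scenarios can be sketched as small routing functions that pick a table name from a date or a user ID. The table-naming scheme and the ID boundaries below are hypothetical; they only show the shape of the lookup.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical exclusive upper bounds of each user ID range.
USER_ID_BOUNDS = [1_000_000, 2_000_000, 3_000_000]


def order_table_for(order_day: date) -> str:
    """Pick the monthly order table, e.g. orders_2012_03 (naming is assumed)."""
    return f"orders_{order_day.year:04d}_{order_day.month:02d}"


def user_table_for(user_id: int) -> str:
    """Pick the user table whose ID range contains user_id."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    index = bisect_right(USER_ID_BOUNDS, user_id)
    if index >= len(USER_ID_BOUNDS):
        raise ValueError("user_id is beyond the last configured range")
    return f"users_{index}"
```

Range-based routing keeps neighbouring IDs together, which suits range scans, but new users all land in the last range, so that table tends to take most of the write load.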
The difference between sub-tables and partitions
After a table is split into sub-tables, the data is stored in the sub-tables. The main table is just a shell, and data access happens in the individual sub-tables. Partitioning has no concept of sub-tables. It only divides the files that store the data into many small blocks. A partitioned table is still one table, and data handling is still done by that table itself.
After splitting into sub-tables, the concurrency a single table can sustain improves, and disk I/O performance improves too. Partitioning breaks through the disk I/O bottleneck; its aim is to raise the disk's read and write capacity and thereby MySQL performance.
So sub-tables and partitions have different emphases. Sub-tables focus on improving MySQL's concurrency when accessing data; partitions focus on breaking through the disk's read and write limits to improve MySQL performance.
There are many ways to split tables. Using MERGE tables is the simplest; it is about as easy as partitioning and can be transparent to program code. Other table-splitting methods are more troublesome than partitioning. Partitioning itself is relatively simple to implement: creating a partitioned table is no different from creating an ordinary table, and it is transparent to the code.
Partitioning is suitable when:
The query speed of a table has become slow enough to affect its use.
The data in the table is segmented.
Operations on the data usually involve only part of the data, not all of it.
CREATE TABLE sales (
    id INT AUTO_INCREMENT,
    amount DOUBLE NOT NULL,
    order_day DATETIME NOT NULL,
    PRIMARY KEY (id, order_day)  -- the partitioning column must be part of the primary key
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(order_day)) (
    PARTITION p_2010 VALUES LESS THAN (2010),
    PARTITION p_2011 VALUES LESS THAN (2011),
    PARTITION p_2012 VALUES LESS THAN (2012),
    PARTITION p_catchall VALUES LESS THAN MAXVALUE
);
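To see which partition a row lands in, the sketch below mimics the VALUES LESS THAN rule for the partitions named in the CREATE TABLE statement above. It is a model of the rule for illustration, not how MySQL implements it. Note that each bound is exclusive, so an order from 2010 is stored in p_2011, not p_2010.

```python
from bisect import bisect_right
from datetime import datetime

# (partition name, exclusive upper bound on YEAR(order_day)),
# copied from the CREATE TABLE statement.
PARTITIONS = [("p_2010", 2010), ("p_2011", 2011), ("p_2012", 2012)]
CATCHALL = "p_catchall"  # VALUES LESS THAN MAXVALUE


def partition_for(order_day: datetime) -> str:
    """Return the first partition whose bound is greater than the year."""
    bounds = [bound for _, bound in PARTITIONS]
    index = bisect_right(bounds, order_day.year)
    return PARTITIONS[index][0] if index < len(PARTITIONS) else CATCHALL
```

A query that filters on a range of order_day values only needs to read the partitions whose year ranges overlap the filter, which is why partitioning helps when operations touch only part of the data.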
Sub-tables are suitable when:
The query speed of a table has become slow enough to affect its use.
Frequent inserts or join queries have become slow.
Implementing sub-tables has to be coordinated with the business logic and with data migration, so it is relatively complex.
Splitting tables can solve the drop in query efficiency caused by too much data in a single table. However, it cannot bring a qualitative improvement to the database's concurrent processing capability. Under highly concurrent read and write access, once the master server can no longer bear the pressure of write operations, adding more slave servers is pointless. We therefore have to change approach and split the database itself to improve its write capacity. This is what is called sub-databases (database splitting).
Similar to the table-splitting strategy, database splitting can route data access by taking a key modulo, as shown in the figure below.
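A minimal sketch of modulo routing, assuming a numeric user ID as the key and four databases named db_0 to db_3. Both the key choice and the naming are assumptions for illustration.

```python
def database_for(user_id: int, database_count: int = 4) -> str:
    """Route a key to a database by taking it modulo the database count."""
    if database_count <= 0:
        raise ValueError("database_count must be positive")
    return f"db_{user_id % database_count}"
```

Modulo routing spreads keys evenly, but changing the number of databases changes the result for most keys, so adding a database usually means migrating existing data.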