Home >Database >Mysql Tutorial >Summary of MySQL database design
Rule 1: In general, you can choose the MyISAM storage engine. If you need transaction support, you must use the InnoDB storage engine.
Note: MyISAM storage engine B-tree index has a big limitation: the sum of the lengths of all fields participating in an index cannot exceed 1000 bytes. In addition, MyISAM data and indexes are separated, while InnoDB's data storage is ordered by cluster index, and the primary key is the default cluster index. Therefore, although MyISAM's query performance is generally higher than InnoDB's , but InnoDB’s query performance based on the primary key is very high.
Rule 2: Naming rules.
The database and table names should be as consistent as possible with the name of the business module being served
The first-class table serving the same sub-module should be as consistent as possible Use the submodule name (or part of the word) as the prefix or suffix
The table name should try to contain the words corresponding to the stored data
field The name should also be kept consistent with the actual data as much as possible
The joint index name should try to include all index key field names or abbreviations, and the order of each field name in the index name should be consistent with the index key The index order in the index is consistent, and try to include a prefix or suffix similar to idx to indicate that the object type is an index.
Other objects such as constraints should also contain the names of the tables or other objects to which they belong as much as possible to indicate their respective relationships
Rules 3: Database field type definition
For fields that often require calculation and sorting that consume CPU, you should try to choose faster fields, such as TIMESTAMP
(4 bytes, minimum value 1970-01-01 00:00:00) instead of Datetime
(8 bytes, minimum value 1001-01-01 00:00:00), by integer Instead of floating point and character type
, use varchar
for variable-length fields. Do not use char
For binary multimedia data, pipeline data (such as logs), and very large text data, do not place it in database fields
Rule 4: The business logic execution process must The table read must have initial values. Avoid negative or infinite values read out by the business, causing program failure
Rule 5: It is not necessary to adhere to the paradigm theory, moderate redundancy, and let Query minimize Join
Rule 6: Split large fields with low access frequency into data tables. Some large fields take up a lot of space and are accessed much less frequently than other fields. In this case, by splitting the large fields, there is no need to read the large fields in frequent queries, resulting in a waste of IO resources.
Rule 7: Horizontal splitting can be considered for large tables. Large tables affect query efficiency. There are many splitting methods based on business characteristics. For example, data that increases based on time can be divided based on time. Data divided by id can be split according to id% number of databases.
Rule 8: The relevant indexes required by the business are determined according to the where condition of the SQL statement constructed according to the actual design. Do not build indexes that are not needed by the business, and are not allowed to be used in joint indexes ( or primary key) contains more than one field. In particular, the field will not appear in the conditional statement at all.
Rule 9: A primary key or a unique index must be established to uniquely determine one or more fields of a record. If a record cannot be uniquely determined, a common index must be built in order to improve query efficiency
Rule 10: Some tables used by the business have very few records, or even only one record. To meet the needs of constraints, indexes or primary keys must be created.
Rule 11: For fields whose values cannot be repeated and are often used as query conditions, a unique index should be built (the primary key defaults to a unique index), and the conditions for this field in the query conditions should be placed in First position. There is no need to create a joint index related to this field.
Rule 12: For frequently queried fields whose values are not unique, you should also consider establishing a normal index. Put the field condition in the first position in the query statement and process the joint index. The method is the same.
Rule 13: When the business accesses data through a non-unique index, the density of the records returned through the index value needs to be considered. In principle, the maximum possible density cannot be higher than 0.2. If it is dense If the degree is too large, it is not suitable to create an index.
When the amount of data retrieved through this index accounts for more than 20% of all data in the table, you need to consider the cost of establishing the index. At the same time, because index scanning generates random I/O, the resulting Its efficiency is much lower than sequential I/O of full table sequential scan. The database system may not use this index when optimizing the query.
Rule 14: Databases that require joint indexes (or joint primary keys) should pay attention to the order of the indexes. The matching conditions in the SQL statement must also be consistent with the order of the index.
Note: Incorrect indexing may also lead to serious consequences.
Rule 15: Multiple field queries in the table are used as query conditions, do not contain other indexes, and the joint values of the fields are not repeated. A unique joint index can be built on these multiple fields. Assume that the index The field is (a1, a2,...an), then the query condition (a1 op val1, a2 op val2,...am op valm)m<=n
, you can use the index, query condition The position of the field in is consistent with the position of the field in the index.
Rule 16: Principles for establishing joint indexes (the following assumes that a joint index (a, b, c) is established on fields a, b, c of the database table)
The fields in the joint index should try to satisfy the order of filtered data from most to least, that is to say, the field with the biggest difference should be the first field
Try to create an index that is consistent with the condition order of the SQL statement, so that the SQL statement is based on the entire index as much as possible, and try to avoid using a part of the index (especially when the first condition is inconsistent with the first field of the index) as the query condition
Where a=1,where a>=12 and a<15,where a=1 and b<5 ,where a=1 and b=7 and c>=40 are conditions This joint index can be used; however, these statements where b=10, where c=221, where b>=12 and c=2 cannot use this joint index.
When all the database fields that need to be queried are reflected in the index, the database can directly query the index to obtain the query information without scanning the entire table (this is the so-called key-only ), which can greatly improve query efficiency.
Indices can be used when a, ab, abc are queried related to other table fields
When a, ab, abc are in order instead of b, c, bc, ac. Indexes can be used when executing Order by or group.
In the following situations, table scanning and sorting may be more effective than using joint indexes
a. The table has been organized according to the index
b. A large proportion of all the data in the queried data station.
Rule 17: When important business accesses data tables. But when the data cannot be accessed through the index, you should ensure that the number of records accessed sequentially is limited, in principle no more than 10.
Rule 18: Reasonably construct the Query statement
In the Insert statement, according to the test, the efficiency is highest when inserting 1,000 items in a batch at a time. When there are more than 1,000 items, it needs to be split. If the same insertion is performed multiple times, it should be merged into batches. Note that the length of the query statement should be smaller than the mysqld parameter max_allowed_packet
The performance order of the various logical operators in the query conditions is and, or, in, so you should try to avoid using them in the query conditions. Use in
in a large set. Always use a small result set to drive a large record set, because in mysql, there is only one Join method, Nested Join, which means that the join of mysql is through a nested loop. to achieve. Use the principle of small result sets to drive large record sets to reduce the number of nested loop loops to reduce the total amount of IO and the number of CPU operations
Try to optimize the inner loop of Nested Join.
Only take the required columns, try not to use select *
Only use the most effective filter fields, there are few filter conditions in the where clause For the best
Try to avoid complex Joins and subqueries
Mysql is not very good at concurrency. When the amount of concurrency is too high, the overall performance will decrease. The sharp decline is mainly related to the contention lock control of Mysql's internal resources. MyIsam uses table locks, and InnoDB uses row locks.
Rule 19: Optimization of application system
Use cache reasonably, for parts that change less Active data is cached into memory through the application layer cache, which improves performance by orders of magnitude.
Merge the same query repeatedly to reduce the number of IOs.
c. Principle of Minimum Transaction Relevance
The above is the detailed content of Summary of MySQL database design. For more information, please follow other related articles on the PHP Chinese website!