


[MySQL database] Interpretation of Chapter 4: Schema and data type optimization (Part 2)
4.2 Traps in MySQL schema design
Some design mistakes are specific to the way MySQL is implemented. This section looks at what they are and how to avoid them:
1. Too many columns
The MySQL storage engine API works by copying rows between the server layer and the storage engine layer in a row buffer format; the server layer then decodes the buffer contents into individual columns. Converting the row buffer into the row data structure with decoded columns is expensive. MyISAM's fixed-length row format matches the server's row format exactly, so no conversion is needed. MyISAM's variable-length row format and InnoDB's row format, however, always require conversion, and the cost of the conversion depends on the number of columns.
2. Too many joins
Entity-Attribute-Value (EAV) is a poor design pattern. MySQL limits each join to a maximum of 61 tables, yet an EAV database needs many self-joins. As a rough rule of thumb, if you want a query to execute quickly with good concurrency, keep it to about a dozen tables or fewer.
3. Overuse of ENUM
Be careful not to overuse ENUM. The alternative is a foreign key to a dictionary or lookup table that holds the actual values. In MySQL, adding a value to an ENUM column's list requires an ALTER TABLE. That is a blocking operation in MySQL 5.0 and earlier, and also in 5.1 and newer unless the new value is appended to the end of the list.
4. NULL not invented here
The general recommendation is to avoid NULL where possible and to use 0, a special value, or an empty string instead. But don't go to extremes: in some scenarios NULL is the better choice. An all-zero (impossible) date as a default, for example, can cause many problems:
CREATE TABLE …… (
    -- an all-zero (impossible) date can cause many problems
    dt DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00'
    ……
)
MySQL stores NULL values in indexes; Oracle does not.
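The lookup-table alternative to ENUM from trap 3 can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a runnable stand-in for MySQL, and the table names `order_status` and `orders` and the helper functions are hypothetical. The point it shows is that adding an allowed value becomes an ordinary INSERT into the lookup table rather than an ALTER TABLE.

```python
import sqlite3

# SQLite is a stand-in for MySQL here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE order_status (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id        INTEGER PRIMARY KEY,
        status_id INTEGER NOT NULL REFERENCES order_status(id)
    );
    INSERT INTO order_status (id, name) VALUES (1, 'new'), (2, 'paid');
""")

# Adding a new allowed value is a plain INSERT, not a schema change.
conn.execute("INSERT INTO order_status (id, name) VALUES (3, 'refunded')")
conn.execute("INSERT INTO orders (id, status_id) VALUES (100, 3)")


def status_of(order_id):
    """Resolve an order's status name through the lookup table."""
    row = conn.execute(
        "SELECT s.name FROM orders o "
        "JOIN order_status s ON s.id = o.status_id WHERE o.id = ?",
        (order_id,),
    ).fetchone()
    return row[0] if row else None


def insert_with_unknown_status():
    """Try to use a status that is not in the lookup table."""
    try:
        conn.execute("INSERT INTO orders (id, status_id) VALUES (101, 99)")
        return True
    except sqlite3.IntegrityError:
        return False  # the foreign key rejects values missing from the lookup table
```

The trade-off is one extra join per lookup, which has to be weighed against the join-count concerns described in trap 2.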
4.3 Normalization and denormalization
4.3.1 Pros and cons of a normalized schema
Advantages:
1. Normalized updates are usually faster.
2. When the data is well normalized, there is little or no duplicated data, so less data needs to be modified.
3. Normalized tables are usually smaller, so they fit better in memory and operations run faster.
4. Because there is little redundant data, fewer DISTINCT and GROUP BY queries are needed when retrieving lists of values.
Disadvantages:
Queries usually require joins, which are costly and may make some indexes unusable.
4.3.2 Pros and cons of a denormalized schema
A denormalized schema avoids joins. When the data is larger than memory, a scan without joins can be much faster than a join because it avoids random I/O.
4.4 Cache tables and summary tables
Cache tables: tables that store data which could also be retrieved from other tables, only more slowly each time. They are very effective for optimizing search and retrieval queries.
Summary tables: tables that store data aggregated with GROUP BY queries.
When using cache and summary tables, you must decide whether to maintain their data in real time or to rebuild them periodically. Periodic rebuilds save resources and produce tables with less fragmentation and sequentially organized indexes, which is more efficient.
The data must remain available while a rebuild is running. This is achieved with a "shadow table", a table built behind the real table. Once it has been built, the shadow table and the original table are swapped with an atomic rename operation.
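The periodic rebuild with a shadow table can be sketched roughly as follows. The example uses Python's built-in sqlite3 module as a stand-in for MySQL, and the tables `downloads`, `download_summary` and the `_new` / `_old` suffixes are hypothetical names chosen for illustration. SQLite has no multi-table rename, so the swap is wrapped in a transaction; in MySQL the swap would be a single `RENAME TABLE` statement, as noted in the comment.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN/COMMIT around the swap are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE downloads (file_id INTEGER, bytes INTEGER);
    INSERT INTO downloads VALUES (1, 100), (1, 250), (2, 40);
    CREATE TABLE download_summary (
        file_id INTEGER PRIMARY KEY, cnt INTEGER, total_bytes INTEGER
    );
""")


def rebuild_summary(conn):
    # 1. Build the shadow table behind the live one; readers still see the old data.
    conn.execute("DROP TABLE IF EXISTS download_summary_new")
    conn.execute(
        "CREATE TABLE download_summary_new ("
        "file_id INTEGER PRIMARY KEY, cnt INTEGER, total_bytes INTEGER)"
    )
    conn.execute(
        "INSERT INTO download_summary_new "
        "SELECT file_id, COUNT(*), SUM(bytes) FROM downloads GROUP BY file_id"
    )
    # 2. Swap the tables. In MySQL this would be one atomic statement:
    #    RENAME TABLE download_summary TO download_summary_old,
    #                 download_summary_new TO download_summary;
    conn.execute("BEGIN")
    conn.execute("DROP TABLE IF EXISTS download_summary_old")
    conn.execute("ALTER TABLE download_summary RENAME TO download_summary_old")
    conn.execute("ALTER TABLE download_summary_new RENAME TO download_summary")
    conn.execute("COMMIT")


rebuild_summary(conn)
```

Keeping the previous version around as `download_summary_old` makes it possible to switch back quickly if the new data turns out to be wrong.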
4.4.1 Materialized views
A materialized view is a table that is precomputed and stored on disk, and that can be refreshed and updated through various strategies. MySQL does not support materialized views natively, but they can be implemented with Justin Swanhart's tool Flexviews. Flexviews consists of:
- A change data capture utility that reads the server's binary log and parses the relevant row changes
- A set of stored procedures that help create and manage view definitions
- Tools for applying the changes to the materialized views in the database
Flexviews can incrementally recalculate the contents of a materialized view by extracting changes to the source tables, so it does not need to query the original data, which is efficient.
4.4.2 Counter tables
A counter table caches values such as a user's number of friends or the number of downloads of a file. It is recommended to create a separate table for the counters, which avoids query cache invalidation.
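A minimal sketch of such a counter table follows, again using Python's built-in sqlite3 module as a stand-in for MySQL. It already spreads the counter over several rows ("slots"), which is the higher-concurrency variant explained in the next paragraph. The table name `hit_counter`, the column names, and `NUM_SLOTS` are assumptions made for this illustration.

```python
import random
import sqlite3

NUM_SLOTS = 10  # hypothetical; more slots means less contention on any single row

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hit_counter (slot INTEGER PRIMARY KEY, cnt INTEGER NOT NULL)"
)
# Pre-create one row per slot so that every increment is a plain UPDATE.
conn.executemany(
    "INSERT INTO hit_counter (slot, cnt) VALUES (?, 0)",
    [(slot,) for slot in range(NUM_SLOTS)],
)


def increment(conn):
    """Add 1 to a randomly chosen slot instead of one shared row."""
    slot = random.randrange(NUM_SLOTS)
    conn.execute("UPDATE hit_counter SET cnt = cnt + 1 WHERE slot = ?", (slot,))


def total(conn):
    """The real counter value is the sum over all slots."""
    return conn.execute("SELECT SUM(cnt) FROM hit_counter").fetchone()[0]


for _ in range(1000):
    increment(conn)
```

However the increments are distributed across the slots, `total(conn)` returns the number of increments, because each update adds exactly 1 to exactly one row.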
With a single counter row, every transaction that updates the counter has to wait for that row, so updates can only run serially. For higher concurrency, the counter can be stored in multiple rows, with one randomly chosen row updated each time; to read the result, sum the rows with an aggregate query. (I had to read this two or three times, maybe I am just slow. It means that the same counter is spread over several slots, each update goes to one of them, and the final value is the sum of all slots. It is not easy to grasp at first, so read it a few more times.)
4.5 Speeding up ALTER TABLE
For most changes to a table's structure, MySQL creates an empty table with the new structure, reads all the data from the old table and inserts it into the new one, and then drops the old table.
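MySQL 5.1 and newer include support for some kinds of "online" operations that do not lock the table for the whole process. Recent versions of InnoDB (the only InnoDB in MySQL 5.5 and newer) also support building indexes by sorting, which is faster and produces a more compact index layout.
In general, most ALTER TABLE operations interrupt MySQL service. For common scenarios there are two techniques:
1. Run the ALTER TABLE on a machine that is not serving traffic, then swap it with the master that is serving traffic.
2. Use a shadow copy: create a new table with the required structure that is unrelated to the source table, then swap the two tables with a rename and a drop (see above).
Not every ALTER TABLE causes a table rebuild. In theory the table-creation step can be skipped: a column's default value is actually stored in the table's .frm file, so the file could be changed directly without touching the table itself. MySQL does not yet use this optimization, however, and every MODIFY COLUMN causes a table rebuild.
ALTER COLUMN, by contrast, changes a column's default value through the .frm file. ALTER TABLE allows columns to be modified with ALTER COLUMN, MODIFY COLUMN, and CHANGE COLUMN, and the three operations behave differently: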
alter table sakila.film alter column rental_duration set default 5;
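4.5.1 Modifying only the .frm file
MySQL sometimes rebuilds a table when it does not need to. If you are willing to take some risk, you can make certain other kinds of changes without a rebuild. The technique below may not work correctly, so back up your data first.
The following operations do not require a table rebuild:
1. Removing a column's AUTO_INCREMENT attribute
2. Adding, removing, or changing ENUM and SET constants. If you remove a constant that some rows use, queries return an empty string for those rows.
The basic technique is to create a new .frm file for the desired table structure and use it to replace the existing table's .frm file:
1. Create an empty table with the same structure and make the desired changes to it.
2. Run FLUSH TABLES WITH READ LOCK, which closes all tables in use and prevents any table from being opened.
3. Swap the .frm files.
4. Run UNLOCK TABLES to release the read lock from step 2.
Example omitted.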
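4.5.2 Building MyISAM indexes quickly
1. A common trick for loading data into a MyISAM table efficiently is to disable the indexes, load the data, and then re-enable the indexes. Because the index-building work is deferred until the data has been loaded, the indexes can be built by sorting, which is faster and leaves the index tree less fragmented and more compact.
This does not work for unique indexes, however, because DISABLE KEYS applies only to non-unique indexes. MyISAM builds unique indexes in memory and checks uniqueness for every row it loads; once the index no longer fits in available memory, the load becomes slower and slower.
2. Modern versions of InnoDB allow a similar trick: drop all non-unique indexes, add the new column, and finally re-create the dropped indexes (this relies on InnoDB's fast online index creation). Percona Server can do this automatically.
3. A hack like the ALTER TABLE one above can also speed up this operation, but it takes more work and carries risk. It is useful when loading data from a backup, for example when you already know that all the data is valid and no uniqueness check is needed:
Create a table with the desired structure but without any indexes (if you use LOAD DATA INFILE and the table is empty, MyISAM can build the indexes by sorting).
Load the data into the table to build the .MYD file.
Create another empty table with the desired structure, this time including the indexes. This creates the .frm and .MYI files.
Flush the tables with a read lock.
Rename the second table's .frm and .MYI files so that MySQL treats them as the first table's files.
Release the read lock.
Use REPAIR TABLE to rebuild the table's indexes. This builds all indexes by sorting, including the unique ones.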
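The disable/load/re-enable sequence from item 1 of this section can be written out as a short script. The sketch below is a Python helper that only builds the MySQL statements as strings; it does not connect to a server. The function name `bulk_load_statements`, the table name, and the file path are hypothetical, and the helper assumes an unqualified table name (no `schema.table` form).

```python
def bulk_load_statements(table, data_file):
    """Return the MySQL statements for: disable keys, load data, enable keys.

    Only non-unique MyISAM indexes are affected by DISABLE KEYS.
    """
    # Quote the identifier with backticks and the path as a string literal.
    quoted_table = "`" + table.replace("`", "``") + "`"
    quoted_file = "'" + data_file.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return [
        f"ALTER TABLE {quoted_table} DISABLE KEYS",
        f"LOAD DATA INFILE {quoted_file} INTO TABLE {quoted_table}",
        f"ALTER TABLE {quoted_table} ENABLE KEYS",
    ]


# Hypothetical table and file, printed as a script that could be fed to a client.
for statement in bulk_load_statements("load_data", "/tmp/rows.txt"):
    print(statement + ";")
```

The index build is deferred to the final `ENABLE KEYS` statement, which is where MyISAM can build the non-unique indexes by sorting.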
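4.6 Summary
Good schema design principles apply universally, but MySQL has its own implementation details to keep in mind. In a nutshell: it is always good to keep everything as small and simple as possible. MySQL likes simplicity (what a coincidence, so do I).
It is best to avoid BIT.
Use small, simple, appropriate data types.
Use integers for identifier columns whenever possible.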
Avoid over-design, such as schema designs that lead to extremely complex queries or tables with very many columns.
Avoid NULL as much as possible, unless the data model has a genuine need for it.
Use the same data type to store similar or related values, especially for columns used in join conditions.
Watch out for variable-length strings, which can lead to pessimistic maximum-length allocation for temporary tables and sorting.
Avoid deprecated features, such as specifying a precision for floating-point numbers or a display width for integers.
Use ENUM and SET carefully. They are convenient, but don't overuse them; sometimes they become traps.
Normalization is good, but denormalization is sometimes necessary; precomputing, caching, or generating summary tables can also bring large benefits.
ALTER TABLE is painful because in most cases it locks the table and rebuilds it entirely. This chapter presented some risky workarounds; in most scenarios, use the more conventional methods.


