
[MySQL database] Interpretation of Chapter 4: Schema and data type optimization (Part 2)

php是最好的语言 (Original)
2018-08-07 13:58:19

4.2 Traps in MySQL schema design

Some schema design mistakes are specific to the way MySQL is implemented. Here are the main ones and how to avoid them:

1. Too many columns

MySQL's storage engine API copies rows between the server layer and the storage engine layer in a row buffer format, and the server then decodes the buffer into columns. Turning the row buffer into a row data structure with decoded columns is expensive. MyISAM's fixed-length row format matches the server's row format exactly, so it needs no conversion; MyISAM's variable-length row format and InnoDB's row format always need it. The cost of the conversion depends on the number of columns.

2. Too many joins

The entity-attribute-value (EAV) pattern is a classic example of poor design. MySQL limits each join to a maximum of 61 tables, and EAV databases need many self-joins. As a rough rule of thumb, if you want queries to execute quickly with good concurrency, it is best to keep a single query to a dozen tables or fewer.
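
To see why EAV pushes queries toward many joins, here is a minimal sketch. The table `entity_attr` and the attribute names are hypothetical and only for illustration; reading just three attributes of one entity as a single row already takes a three-way self-join.

```sql
-- Hypothetical EAV table: one row per (entity, attribute) pair.
CREATE TABLE entity_attr (
    entity_id  INT UNSIGNED NOT NULL,
    attr_name  VARCHAR(32)  NOT NULL,
    attr_value VARCHAR(255),
    PRIMARY KEY (entity_id, attr_name)
) ENGINE=InnoDB;

-- Every attribute returned as a column costs one more self-join.
SELECT n.attr_value AS user_name,
       e.attr_value AS user_email,
       c.attr_value AS user_city
FROM entity_attr AS n
JOIN entity_attr AS e
  ON e.entity_id = n.entity_id AND e.attr_name = 'email'
JOIN entity_attr AS c
  ON c.entity_id = n.entity_id AND c.attr_name = 'city'
WHERE n.entity_id = 42
  AND n.attr_name = 'name';
```

With dozens of attributes per entity, the number of joins grows toward the limits mentioned above.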

3. Overusing enumerations

Beware of overusing ENUM. In most cases such a column should be an integer foreign key that points to a dictionary or lookup table holding the actual values. In MySQL, adding a value to an ENUM column's list requires an ALTER TABLE. That is a blocking operation in MySQL 5.0 and earlier, and it is still one in 5.1 and newer unless the new value is added at the end of the list.
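
The two designs can be compared in a short sketch. All table and column names here (`customer_enum`, `country`, `customer`) are hypothetical, chosen only to illustrate the lookup-table alternative.

```sql
-- ENUM version: adding a country later means an ALTER TABLE on this table.
CREATE TABLE customer_enum (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    country ENUM('CN', 'US', 'DE') NOT NULL
) ENGINE=InnoDB;

-- Lookup-table version: the list of values lives in its own table.
CREATE TABLE country (
    id   SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    code CHAR(2) NOT NULL,
    UNIQUE KEY (code)
) ENGINE=InnoDB;

CREATE TABLE customer (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    country_id SMALLINT UNSIGNED NOT NULL,
    FOREIGN KEY (country_id) REFERENCES country (id)
) ENGINE=InnoDB;

-- A new value is a plain INSERT; no schema change is needed.
INSERT INTO country (code) VALUES ('FR');
```

The trade-off is an extra join when the readable value is needed, which is usually cheap for a small lookup table.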

4. NULL not invented here

Where possible, consider using 0, a special value, or an empty string instead of NULL. Don't take this to extremes, though: in some scenarios NULL is the better choice, for example when the alternative is a magic constant like the one below:

create table ……(
    -- an all-zero (impossible) date causes many problems
    dt datetime not null default '0000-00-00 00:00:00'
    ……
)

MySQL indexes NULL values; Oracle does not.
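
As a hedged illustration of the alternative, the sketch below uses NULL to mean "no value yet" instead of an impossible date. The table `orders_example` and its columns are hypothetical.

```sql
-- Hypothetical table: NULL marks an order that has not shipped yet.
CREATE TABLE orders_example (
    id         INT UNSIGNED NOT NULL PRIMARY KEY,
    shipped_at DATETIME NULL DEFAULT NULL,
    KEY (shipped_at)
) ENGINE=InnoDB;

-- Because MySQL indexes NULLs, this lookup can use the index on shipped_at.
SELECT id FROM orders_example WHERE shipped_at IS NULL;
```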

4.3 Normalization and denormalization

4.3.1 Advantages and disadvantages of normalization

Advantages:

1. Updates to normalized data are usually faster

2. When the data is well normalized, there is little or no duplicated data, so less data has to be modified

3. Normalized tables are usually smaller, so they fit better in memory and operations on them run faster

4. With little redundant data, retrieving lists of values needs fewer DISTINCT or GROUP BY queries

Disadvantages:

Queries usually require joins, which are costly and can make some indexing strategies impossible

4.3.2 Advantages and disadvantages of denormalization

A denormalized schema avoids joins because all the data is in one table. When the data is larger than memory, even a full table scan can be much faster than a join, because it avoids random I/O.
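
The trade-off can be sketched with a small example. The tables `app_user`, `message`, and `user_message` are hypothetical. In the normalized form the filter column and the sort column live in different tables, so no single index can serve both; in the denormalized form one composite index can.

```sql
-- Normalized: filter on app_user, sort on message, joined at query time.
CREATE TABLE app_user (
    id           INT UNSIGNED NOT NULL PRIMARY KEY,
    user_name    VARCHAR(64)  NOT NULL,
    account_type VARCHAR(16)  NOT NULL
) ENGINE=InnoDB;

CREATE TABLE message (
    id           INT UNSIGNED NOT NULL PRIMARY KEY,
    user_id      INT UNSIGNED NOT NULL,
    message_text TEXT         NOT NULL,
    published    DATETIME     NOT NULL,
    KEY (user_id)
) ENGINE=InnoDB;

SELECT m.message_text, u.user_name
FROM message AS m
INNER JOIN app_user AS u ON m.user_id = u.id
WHERE u.account_type = 'premium'
ORDER BY m.published DESC
LIMIT 10;

-- Denormalized: one table, one index covering both the filter and the sort.
CREATE TABLE user_message (
    id           INT UNSIGNED NOT NULL PRIMARY KEY,
    user_name    VARCHAR(64)  NOT NULL,
    account_type VARCHAR(16)  NOT NULL,
    message_text TEXT         NOT NULL,
    published    DATETIME     NOT NULL,
    KEY (account_type, published)
) ENGINE=InnoDB;

SELECT message_text, user_name
FROM user_message
WHERE account_type = 'premium'
ORDER BY published DESC
LIMIT 10;
```

The cost of the second design is that `user_name` and `account_type` are duplicated in every message row and must be kept in sync when a user changes.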

4.4 Cache tables and summary tables

Cache tables:

Hold data that can be retrieved from other tables in the schema, but only more slowly each time. They are very effective for optimizing search and retrieval queries.

Summary tables:

Hold data aggregated with GROUP BY queries.

When using either kind, decide whether to maintain the data in real time or rebuild it periodically.

Periodic rebuilds save resources and produce a table with less fragmentation and fully sorted indexes, which is more efficient.

During a rebuild the data usually has to stay available. A "shadow table" achieves this: it is a table built behind the real table, and once it is complete you swap it with the original through an atomic rename.

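The shadow-table rebuild can be sketched as follows. The source table `page_hit` and the summary table `hits_summary` are hypothetical names; `RENAME TABLE` with several pairs performs the swap as one atomic operation.

```sql
-- Hypothetical source and summary tables.
CREATE TABLE IF NOT EXISTS page_hit (
    page_id INT UNSIGNED NOT NULL,
    hit_at  DATETIME     NOT NULL
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS hits_summary (
    page_id INT UNSIGNED NOT NULL PRIMARY KEY,
    hits    INT UNSIGNED NOT NULL
) ENGINE=InnoDB;

-- Build the shadow table behind the real one.
DROP TABLE IF EXISTS hits_summary_new, hits_summary_old;
CREATE TABLE hits_summary_new LIKE hits_summary;

INSERT INTO hits_summary_new (page_id, hits)
SELECT page_id, COUNT(*)
FROM page_hit
GROUP BY page_id;

-- Swap in a single atomic statement; readers never see a half-built table.
RENAME TABLE hits_summary     TO hits_summary_old,
             hits_summary_new TO hits_summary;
```

Keeping `hits_summary_old` until the next rebuild leaves an easy way to roll back if the new data turns out to be wrong.
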
4.4.1 Materialized views

A materialized view is a table that is precomputed and stored on disk, and that can be refreshed and updated through various strategies. MySQL does not support materialized views natively, but they can be implemented with Justin Swanhart's tool Flexviews.

Flexviews consists of:

  • A change data capture utility that reads the server's binary log and extracts the relevant row changes
  • A set of stored procedures that help create and manage view definitions
  • Tools that apply the changes to the materialized views in the database

Flexviews can recalculate the contents of a materialized view incrementally by extracting the changes made to the source tables, so it does not need to query the original data, which makes it efficient.

4.4.2 Counter tables

A counter table caches counts such as a user's number of friends or a file's number of downloads. It is recommended to keep the counter in a separate table, which also avoids invalidating the query cache.
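
A minimal sketch of a counter table, first as a single row and then spread over several rows (the higher-concurrency variant this section goes on to describe). The table names and the choice of four slots are assumptions for illustration. The random slot is computed once into a user variable, so exactly one row is updated.

```sql
-- Single-row counter: every update contends for the same row lock.
CREATE TABLE hit_counter (
    cnt INT UNSIGNED NOT NULL
) ENGINE=InnoDB;

INSERT INTO hit_counter (cnt) VALUES (0);
UPDATE hit_counter SET cnt = cnt + 1;

-- Multi-row counter: the same logical counter split over 4 slots.
CREATE TABLE hit_counter_slots (
    slot TINYINT UNSIGNED NOT NULL PRIMARY KEY,
    cnt  INT UNSIGNED     NOT NULL
) ENGINE=InnoDB;

INSERT INTO hit_counter_slots (slot, cnt)
VALUES (0, 0), (1, 0), (2, 0), (3, 0);

-- Pick one slot at random (0..3) and increment only that row.
SET @slot = FLOOR(RAND() * 4);
UPDATE hit_counter_slots SET cnt = cnt + 1 WHERE slot = @slot;

-- The counter's value is the sum over all slots.
SELECT SUM(cnt) AS total FROM hit_counter_slots;
```

More slots spread the lock contention further, at the price of a slightly more expensive read.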

A single counter row behaves like a global mutex: every transaction that updates it must take the same row lock, so those transactions can only run serially. For higher concurrency, keep the counter in multiple rows and update one randomly chosen row each time; to read the total, sum the rows with an aggregate query. (I had to read this two or three times before it made sense. It means the same counter is stored in several places, each update picks one of them, and the final value is the sum of all of them. It is not easy to follow, so read it a few more times.)

4.5 Speed up ALTER TABLE

Most table structure changes in MySQL work by creating an empty table with the new structure, reading all the data from the old table and inserting it into the new one, and then dropping the old table.

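MySQL 5.1 and newer support some kinds of "online" operations that do not lock the table for the whole process. Recent versions of InnoDB (the only InnoDB in MySQL 5.5 and newer) also support building indexes by sorting, which is faster and produces a more compact index layout.

In general, most ALTER TABLE operations interrupt service in MySQL. Two techniques cover the common scenarios:

1. Run the ALTER TABLE on a server that is not serving traffic, then swap it with the master that is.

2. Shadow copy: create a new table with the desired structure, unrelated to the source table, then swap the two tables through a rename and a drop (as described above).

Not every ALTER TABLE causes a table rebuild. In theory the table-creation step can be skipped when changing a column's default value: the default is actually stored in the table's .frm file, so it could be changed there without touching the table itself. MySQL does not yet use this optimization, however, and every MODIFY COLUMN causes a table rebuild.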


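ALTER COLUMN changes a column's default value through the .frm file alone. ALTER TABLE lets you modify columns with ALTER COLUMN, MODIFY COLUMN, and CHANGE COLUMN, and the three do different things: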

alter table sakila.film alter column rental_duration set default 5;
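
For contrast, here is a sketch of the same default change written with MODIFY COLUMN, which according to the behaviour described above rebuilds the whole table. The column definition `TINYINT(3) NOT NULL` is an assumption; MODIFY COLUMN restates the full definition, so it should match the real column.

```sql
-- Slow form: reads every row and writes it into a new table.
-- The type and NOT NULL attribute here are assumed, not taken from the article.
ALTER TABLE sakila.film
    MODIFY COLUMN rental_duration TINYINT(3) NOT NULL DEFAULT 5;
```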

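4.5.1 Modifying only the .frm file

MySQL sometimes rebuilds a table when it does not need to. If you are willing to take some risk, you can make certain other kinds of changes without a rebuild. The technique below may not work properly, so back up your data first.

The following operations do not require a table rebuild:

     1. Removing a column's AUTO_INCREMENT attribute

     2. Adding, removing, or changing ENUM and SET constants. If you remove a constant that some rows still use, queries return an empty string for those rows.

The basic technique is to create a new .frm file for the desired table structure and use it to replace the existing table's .frm file:

     1. Create an empty table with the same structure and make the required modifications to it

     2. Execute FLUSH TABLES WITH READ LOCK, which closes all tables in use and prevents any table from being opened

     3. Swap the .frm files

     4. Execute UNLOCK TABLES to release the read lock from step 2

Example omitted.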
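
Although the worked example is omitted here, the four steps can be sketched roughly as follows, using the sakila sample database. This assumes a MySQL version that still stores table definitions in .frm files (MySQL 8.0 no longer has them), and that `film.rating` is an ENUM with the five values shown; the helper table name `film_new` and the added value 'PG-14' are illustrative. It is an unsupported hack, so try it only on a backed-up test server.

```sql
-- Step 1: empty copy of the table with the desired change
-- (here: one new ENUM value appended at the end of the list).
CREATE TABLE sakila.film_new LIKE sakila.film;
ALTER TABLE sakila.film_new
    MODIFY COLUMN rating
    ENUM('G', 'PG', 'PG-13', 'R', 'NC-17', 'PG-14') DEFAULT 'G';

-- Step 2: close all open tables and block new opens.
-- Keep this session open while the files are swapped.
FLUSH TABLES WITH READ LOCK;

-- Step 3: from a shell, inside the sakila data directory (not SQL):
--   mv film.frm     film_tmp.frm
--   mv film_new.frm film.frm
--   mv film_tmp.frm film_new.frm

-- Step 4: release the lock taken in step 2.
UNLOCK TABLES;

-- Check the result, then drop the helper table.
SHOW COLUMNS FROM sakila.film LIKE 'rating';
DROP TABLE sakila.film_new;
```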

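4.5.2 Building MyISAM indexes quickly

1. A common trick for loading data efficiently into a MyISAM table is to disable keys, load the data, and then re-enable keys. Index building is deferred until the data is loaded, at which point the indexes can be built by sorting, which is fast and leaves the index trees less fragmented and more compact.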


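This does not work for unique indexes, however, because DISABLE KEYS does not apply to them. MyISAM builds unique indexes in memory and checks uniqueness for every row it loads, so once the index size exceeds available memory the load gets slower and slower.

2. Modern versions of InnoDB have a similar trick: drop all nonunique indexes, add the new column, and then re-create the dropped indexes. This relies on InnoDB's fast online index creation. Percona Server can do these steps automatically.

3. As with the ALTER TABLE hacks described earlier, you can speed this operation up further if you are willing to do more work and accept some risk. This is useful when loading data from backups, for example when you already know all the data is valid and there is no need for uniqueness checks.

  • Create a table with the desired structure but without any indexes (if you use LOAD DATA INFILE and the target table is empty, MyISAM can build the indexes by sorting)

  • Load the data into the table to build the .MYD file

  • Create another empty table with the desired structure, this time including the indexes; this creates the .frm and .MYI files

  • Acquire a read lock and flush the tables

  • Rename the second table's .frm and .MYI files so that MySQL treats them as the first table's files

  • Release the read lock

  • Use REPAIR TABLE to rebuild the table's indexes; this builds all indexes by sorting, including unique indexes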
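
Items 1 and 3 above can be sketched as follows. The tables in the `test` schema are hypothetical, and the file swap in the second half happens in the shell, outside SQL. Like the .frm technique earlier, this is an unsupported hack, so treat it as an outline to try on a test server with backups.

```sql
-- Item 1: defer nonunique index builds until after the load.
CREATE TABLE test.load_data (
    id  INT         NOT NULL,
    val VARCHAR(32) NOT NULL,
    KEY (val)
) ENGINE=MyISAM;

ALTER TABLE test.load_data DISABLE KEYS;
-- ... load the data here ...
ALTER TABLE test.load_data ENABLE KEYS;

-- Item 3: load without indexes, then attach index definitions and repair.
CREATE TABLE test.t_data (
    id  INT         NOT NULL,
    val VARCHAR(32) NOT NULL
) ENGINE=MyISAM;
-- ... load the data into test.t_data; this builds t_data.MYD ...

-- Same structure, this time with the indexes (creates .frm and .MYI files).
CREATE TABLE test.t_index (
    id  INT         NOT NULL,
    val VARCHAR(32) NOT NULL,
    UNIQUE KEY (id),
    KEY (val)
) ENGINE=MyISAM;

FLUSH TABLES WITH READ LOCK;
-- From a shell, in the test data directory (not SQL): rename
-- t_index.frm and t_index.MYI to t_data.frm and t_data.MYI.
UNLOCK TABLES;

-- Build every index by sorting, including the unique one.
REPAIR TABLE test.t_data;
```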

4.6 Summary

Good schema design principles are universal, but MySQL has its own implementation details to watch for. In short, it is always good to keep everything as small and simple as possible; MySQL likes simplicity (nice, so do I).

  1. BIT is best avoided

  2. Use small, simple, appropriate data types

  3. Use integers for identifier columns where possible

  4. Avoid extremes in design, such as a schema that forces extremely complex queries or tables with a great many columns

  5. Avoid NULL as much as possible, unless it is the right way to model the real data

  6. Use the same data type for similar or related values, especially columns used in join conditions

  7. Watch out for variable-length strings, which can lead to pessimistic maximum-length allocation for temporary tables and sorting

  8. Avoid deprecated features such as specifying a precision for floating-point numbers or a display width for integers

  9. Use ENUM and SET carefully: they are convenient, but they can be abused and sometimes become traps

  10. Normalization is good, but denormalization is sometimes necessary; precomputing, caching, or generating summary tables can also be a big win

  11. ALTER TABLE is painful because in most cases it locks and rebuilds the whole table. This chapter shows some risky workarounds, but most scenarios should use other, more conventional methods
