
MySQL Series: InnoDB Engine Analysis, File IO

WBOY

Jun 01, 2016 pm 01:04 PM


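As a database storage engine, InnoDB inevitably does a lot of file work. Everything that must be persisted goes through file operations: table files, redo log files, transaction log files, backup and archive files, and so on. InnoDB puts considerable effort into file IO, in two main areas: an implementation of asynchronous IO, and an implementation of file management and IO scheduling. In MySQL 5.6, InnoDB also added a DIRECT IO implementation. All of this exists to improve IO performance. The file IO code is concentrated in two families of files, os_file.* and fil0fil.*. os_file.* implements basic file operations, asynchronous IO, and simulated asynchronous IO. fil0fil.* manages file IO systematically and organizes files into spaces. The sections below cover each in turn.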

1. System file IO

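File operations are critical in InnoDB, so it wraps the basic ones: opening and closing files, reading and writing, and accessing file attributes. On Linux, reads and writes use pread/pwrite by default. If the system does not support these two functions, InnoDB combines lseek with read and write to get the same effect. The InnoDB file operation functions are:
os_file_create_simple creates or opens a file
os_file_create creates or opens a file; on failure it retries until it succeeds
os_file_close closes an open file
os_file_get_size gets the size of a file
os_file_set_size sets the size of a file and fills its contents with zeros
os_file_flush fsyncs written data to disk
os_file_read reads data from a file
os_file_write writes data to a file

Beyond these basic operations, InnoDB also implements an asynchronous IO model. On Windows it uses the IOCP model (see material available online for details). On Linux it uses aio, in one of two ways: through the system's native aio mechanism, or through aio simulated with multiple threads and signaling. The simulated variant is the focus here. To implement aio, InnoDB defines a slot and a slot array, with the following data structures: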

typedef struct os_aio_slot_struct
{
     ibool       is_read;          /* whether this is a read operation */
     ulint       pos;              /* index of this slot in the slot array */
     ibool       reserved;         /* whether this slot is in use */
     ulint       len;              /* length of the block to read or write */
     byte*       buf;              /* data buffer to operate on */
     ulint       type;             /* operation type: OS_FILE_READ or OS_FILE_WRITE */
     ulint       offset;           /* file offset of the operation, low 32 bits */
     ulint       offset_high;      /* file offset of the operation, high 32 bits */
     os_file_t   file;             /* file handle */
     char*       name;             /* file name */
     ibool       io_already_done;  /* used in simulated aio mode, TODO */
     void*       message1;
     void*       message2;
#ifdef POSIX_ASYNC_IO
     struct aiocb control;         /* POSIX control block */
#endif
}os_aio_slot_t;

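typedef struct os_aio_array_struct
{
     os_mutex_t     mutex;       /* mutex protecting the slot array */
     os_event_t     not_full;    /* signals that a request can be inserted; generally set once aio has processed a slot and the array has a free slot again */
     os_event_t     is_empty;    /* signals that the array has been emptied; generally set once aio has processed the slots and none remain in use */

     ulint          n_slots;     /* total number of slots */
     ulint          n_segments;  /* number of segments; each segment generally covers n slots, n = n_slots / n_segments, and one segment is the scope of a single aio operation */
     ulint          n_reserved;  /* number of slots currently in use */
     os_aio_slot_t* slots;       /* the slot array */

     os_event_t*    events;      /* slot event array; its purpose is not yet clear to the author */
}os_aio_array_t;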
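
To make the slot mechanism more concrete, here is a minimal single-threaded sketch in C of how a slot array might be used for simulated aio: a request is recorded in a free slot, and a handler later carries it out synchronously with pread or pwrite. All names prefixed with demo_ and the array sizes are hypothetical and are not InnoDB's; the real code also protects the array with a mutex and waits on not_full when no slot is free, which this sketch leaves out.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_N_SLOTS    8   /* hypothetical sizes, chosen for illustration */
#define DEMO_N_SEGMENTS 2

typedef struct {
    int      reserved;        /* slot is in use */
    int      is_read;         /* 1 for a read, 0 for a write */
    int      io_already_done; /* set once the handler has run the request */
    int      fd;              /* file descriptor to operate on */
    void    *buf;             /* data buffer */
    size_t   len;             /* number of bytes to transfer */
    uint32_t offset;          /* file offset, low 32 bits */
    uint32_t offset_high;     /* file offset, high 32 bits */
} demo_slot_t;

typedef struct {
    size_t      n_slots;
    size_t      n_segments;
    size_t      n_reserved;
    demo_slot_t slots[DEMO_N_SLOTS];
} demo_array_t;

static void demo_array_init(demo_array_t *a) {
    memset(a, 0, sizeof *a);
    a->n_slots = DEMO_N_SLOTS;
    a->n_segments = DEMO_N_SEGMENTS;
}

/* Segment that owns a slot: each segment covers n_slots / n_segments slots. */
static size_t demo_segment_of(const demo_array_t *a, size_t pos) {
    return pos / (a->n_slots / a->n_segments);
}

/* Record a request in the first free slot. Returns the slot index,
   or -1 when the array is full. */
static int demo_reserve(demo_array_t *a, int is_read, int fd,
                        void *buf, size_t len, uint64_t off) {
    for (size_t i = 0; i < a->n_slots; i++) {
        demo_slot_t *s = &a->slots[i];
        if (!s->reserved) {
            s->reserved = 1;
            s->is_read = is_read;
            s->io_already_done = 0;
            s->fd = fd;
            s->buf = buf;
            s->len = len;
            s->offset = (uint32_t)(off & 0xFFFFFFFFu);
            s->offset_high = (uint32_t)(off >> 32);
            a->n_reserved++;
            return (int)i;
        }
    }
    return -1;
}

/* Simulated handler: run the request synchronously, then free the slot. */
static ssize_t demo_handle(demo_array_t *a, int pos) {
    demo_slot_t *s = &a->slots[pos];
    assert(s->reserved);
    off_t off = (off_t)(((uint64_t)s->offset_high << 32) | s->offset);
    ssize_t n = s->is_read ? pread(s->fd, s->buf, s->len, off)
                           : pwrite(s->fd, s->buf, s->len, off);
    s->io_already_done = 1;
    s->reserved = 0;
    a->n_reserved--;
    return n;
}
```

A caller would initialize the array with demo_array_init, queue a request with demo_reserve, and let a handler thread call demo_handle on the returned index. Splitting the offset into two 32-bit halves mirrors the offset and offset_high members of the slot structure.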

Memory structure diagram:

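2. In-memory structures for file management

InnoDB defines three file types: tablespace files (ibdata*), redo log files (ib_logfile*), and archive files (ib_arch_log*). While running, InnoDB usually has many files open at once, which calls for systematic management and control of files. For this, InnoDB defines a set of in-memory management structures built on fil_system_t, fil_space_t, and fil_node_t. Each file corresponds to one fil_node_t, which is the smallest unit of storage. Several fil_node_t objects belonging to the same module form a fil_space_t, and all spaces together form the fil_system_t. The InnoDB engine has exactly one fil_system_t object.

fil_system_t manages the global file resources, for example the number of open files, the signaling that controls opening files, and the management and indexing of fil_space_t objects. Its definition is: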

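typedef struct fil_system_struct
{
     mutex_t        mutex;           /* lock protecting the file system */
     hash_table_t*  spaces;          /* hash table of spaces for fast lookup, generally by space id */
     ulint          n_open_pending;  /* number of fil_nodes that currently have read or write IO in progress */
     ulint          max_n_open;      /* maximum number of files allowed to be open */
     os_event_t     can_open;        /* signals that a new file can be opened */

     UT_LIST_BASE_NODE_T(fil_node_t) LRU;         /* recently opened and used files, used to quickly find a fil_node to close */
     UT_LIST_BASE_NODE_T(fil_node_t) space_list;  /* list of file space objects */
}fil_system_t;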

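Two members deserve attention: the space hash table and the LRU list. Why index spaces with a hash table? Because a real database system can have a very large number of fil_space_t objects, and a hash table locates the one to operate on quickly. The LRU list holds the fil_node objects that were most recently opened and used. To avoid frequently closing and reopening files, it keeps a fixed number (500) of recently opened files, which improves efficiency.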
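
The lookup by space id can be pictured as a chained hash table. The following C sketch is only an illustration under assumed names: demo_space_t, demo_system_t, the table size, and the modulo fold function are hypothetical and do not reproduce InnoDB's hash_table_t.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_N_CELLS 16  /* hypothetical table size */

typedef struct demo_space {
    unsigned long      id;         /* space id, the lookup key */
    const char        *name;
    struct demo_space *hash_next;  /* next space in the same hash cell */
} demo_space_t;

typedef struct {
    demo_space_t *cells[DEMO_N_CELLS];
} demo_system_t;

/* Map a space id to a hash cell. */
static size_t demo_fold(unsigned long id) {
    return (size_t)(id % DEMO_N_CELLS);
}

/* Add a space at the head of its cell's chain. */
static void demo_space_insert(demo_system_t *sys, demo_space_t *sp) {
    assert(sys != NULL && sp != NULL);
    size_t c = demo_fold(sp->id);
    sp->hash_next = sys->cells[c];
    sys->cells[c] = sp;
}

/* Find a space by id; returns NULL if it is not registered. */
static demo_space_t *demo_space_get(const demo_system_t *sys, unsigned long id) {
    for (demo_space_t *sp = sys->cells[demo_fold(id)]; sp != NULL; sp = sp->hash_next) {
        if (sp->id == id) {
            return sp;
        }
    }
    return NULL;
}
```

With many spaces registered, a lookup only walks the short chain in one cell instead of scanning the whole space list, which is the benefit the paragraph above describes.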

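fil_space_t manages the fil_node objects of one module. Upper-layer modules do not refer to files by file name but by space_id. In other words, all file operations are performed per space. fil_space supports three types:
FIL_TABLESPACE tablespace space
FIL_LOG redo log space
FIL_ARCHI_LOG archive log space

fil_space_t is defined as follows: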

struct fil_space_struct
{
     char*         name;                /* space name */
     ulint         id;                  /* space id */
     ulint         purpose;             /* type of the space: tablespace, log file, or archive file */
     ulint         size;                /* number of pages in the space */
     ulint         n_reserved_extents;  /* number of reserved extents */
     hash_node_t   hash;                /* hash chain node */
     rw_lock_t     latch;               /* lock protecting space operations, for multi-threaded concurrency */
     ibuf_data_t*  ibuf_data;           /* insert buffer for this space */
     ulint         magic_n;             /* magic number for validation */

     UT_LIST_BASE_NODE_T(fil_node_t) chain;
     UT_LIST_NODE_T(fil_space_t)     space_list;
};

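A fil_space usually consists of a group of files. The redo log, for example, generally uses 3 files that form one group space for redo logging. Through its latch member, a space supports concurrent access from multiple threads. InnoDB file operations are controlled mainly through spaces, using these functions:
fil_space_create creates a fil_space
fil_space_free destroys a fil_space
fil_space_truncate_start removes fil_nodes from a space; the total data length removed is trunc_len
fil_node_create creates a fil_node and adds it to the corresponding space
fil_space_get_size gets the size of a space, counted in pages
fil_io performs an io operation on the given space
fil_aio_wait waits for an asynchronous io operation and updates the space state according to its completion status
fil_flush flushes the data of the given space to disk

fil_node_t manages a single file: mainly its open state, its file handle, its page count, and its modification state.

Its structure is defined as follows: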

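struct fil_node_struct
{
     char*      name;         /* file path name */
     ibool      open;         /* whether the file is open */
     os_file_t  handle;       /* file handle */
     ulint      size;         /* number of pages in the file; one page is 16 KB */
     ulint      n_pending;    /* number of pending read or write IO operations */
     ibool      is_modified;  /* whether dirty pages exist; flush uses this flag to decide what to write to disk */
     ulint      magic_n;      /* magic number for validation */
     UT_LIST_NODE_T(fil_node_t) chain;
     UT_LIST_NODE_T(fil_node_t) LRU;
};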
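
As a rough illustration of how nodes and spaces fit together, the C sketch below appends nodes to a space's chain and keeps the space size in pages, in the spirit of fil_node_create and fil_space_get_size. The demo_ names and the simplified structures are hypothetical, and the byte conversion simply assumes the 16 KB page size mentioned in the structure comment.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE (16UL * 1024UL)  /* 16 KB page, as noted above */

typedef struct demo_node {
    const char       *name;  /* file path name */
    unsigned long     size;  /* number of pages in the file */
    int               open;  /* whether the file is open */
    struct demo_node *next;  /* next node in the space's chain */
} demo_node_t;

typedef struct {
    unsigned long id;     /* space id */
    unsigned long size;   /* total pages across all nodes */
    demo_node_t  *first;  /* head of the node chain */
    demo_node_t  *last;   /* tail of the node chain */
} demo_space_t;

/* Initialize a node and append it to the space's chain. */
static void demo_node_create(demo_space_t *sp, demo_node_t *node,
                             const char *name, unsigned long size_in_pages) {
    assert(sp != NULL && node != NULL);
    node->name = name;
    node->size = size_in_pages;
    node->open = 0;
    node->next = NULL;
    if (sp->last != NULL) {
        sp->last->next = node;
    } else {
        sp->first = node;
    }
    sp->last = node;
    sp->size += size_in_pages;  /* the space size is the sum of its nodes */
}

/* Size of the space in bytes, derived from its page count. */
static unsigned long long demo_space_bytes(const demo_space_t *sp) {
    return (unsigned long long)sp->size * DEMO_PAGE_SIZE;
}
```

For example, a space built from one file of 640 pages and one of 64 pages would report 704 pages in total.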

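Note that when fil_flush is called from outside, a fil_node is flushed only if both of these conditions hold:
the file is open: open = TRUE
the file's in-memory and on-disk data differ: is_modified = TRUE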
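
The two conditions can be expressed as a small predicate applied to each node of a space. This C sketch uses hypothetical demo_ names and only simulates the flush by clearing the flag; the assumption is that the real code would call os_file_flush on the node's handle at that point.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_node {
    const char       *name;
    int               open;         /* whether the file is open */
    int               is_modified;  /* whether unflushed changes exist */
    struct demo_node *next;         /* next node in the space's chain */
} demo_node_t;

/* A node needs flushing only when it is open and has unflushed changes. */
static int demo_needs_flush(const demo_node_t *node) {
    assert(node != NULL);
    return node->open && node->is_modified;
}

/* Walk the chain and flush every qualifying node.
   Returns how many nodes were flushed. */
static int demo_flush_chain(demo_node_t *first) {
    int flushed = 0;
    for (demo_node_t *node = first; node != NULL; node = node->next) {
        if (demo_needs_flush(node)) {
            /* Assumption: the real code would fsync the file here. */
            node->is_modified = 0;
            flushed++;
        }
    }
    return flushed;
}
```

A closed file with is_modified set is skipped in this sketch, as is an open file with nothing to write.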

With the three definitions in hand, how do they relate to one another? Rather than describe it in words, see the memory structure diagram below:

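With their relationships understood, how does an io operation actually proceed? In this model, submitting and running an io operation is fairly involved. The detailed flow, which the flowchart for this section depicts, is:
1. An external module submits a fil_io. Basic checks on the io operation type and the file open mode come first.
2. The count of in-progress io operations is checked next. If it is greater than three quarters of the maximum number of open files, all aio handler threads are woken to process io, and the caller sleeps and waits.
3. If the number of in-progress io operations equals the maximum number of open files, all aio handler threads are woken to process io, and the caller waits for the can_open signal of fil_system_t.
4. If neither 2 nor 3 applies, the space and node that should handle the io are located and the node's file is opened. Opening checks the open-file limit: if the number of open files with io in progress plus the number of open files in the LRU is >= the maximum number of open files, the last fil_node in the LRU is taken out and its file is closed. The file of the fil_node for the new io operation is then opened.
5. Once the fil_node's file is open, os_aio is called to submit the io operation, and the caller waits for it to complete.
6. When the io operation completes, its fil_node is placed at the head of the LRU, the state of fil_system/fil_space/fil_node is updated, and finally a can_open signal is raised on fil_system.
7. A thread waiting on can_open receives the signal and jumps to step 4 to submit its own io operation.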
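
The admission checks in steps 2 to 4 can be sketched as two small decision functions in C. The demo_ names and the enum are hypothetical, and the sketch tests the equality case of step 3 before the three-quarters case of step 2 so that both outcomes stay distinguishable; that ordering is an assumption of this sketch, not something the description above specifies.

```c
#include <assert.h>

typedef enum {
    DEMO_SUBMIT,                 /* proceed to step 4 and open the file */
    DEMO_WAKE_AND_SLEEP,         /* step 2: wake aio threads, then sleep */
    DEMO_WAKE_AND_WAIT_CAN_OPEN  /* step 3: wake aio threads, wait on can_open */
} demo_action_t;

/* Decide what a caller of fil_io does, given the number of io operations
   in progress and the maximum number of open files. */
static demo_action_t demo_admit(unsigned long n_open_pending,
                                unsigned long max_n_open) {
    assert(max_n_open > 0);
    if (n_open_pending == max_n_open) {
        return DEMO_WAKE_AND_WAIT_CAN_OPEN;
    }
    if (n_open_pending > max_n_open * 3 / 4) {
        return DEMO_WAKE_AND_SLEEP;
    }
    return DEMO_SUBMIT;
}

/* Step 4: before opening a new file, must the last file in the LRU
   be closed to stay within the open-file limit? */
static int demo_must_close_lru(unsigned long n_open_pending,
                               unsigned long lru_len,
                               unsigned long max_n_open) {
    return n_open_pending + lru_len >= max_n_open;
}
```

With a limit of 500 open files, the three-quarters threshold in this sketch is 375, so a caller that sees 376 operations in progress would wake the aio threads and sleep, while one that sees 500 would wait for can_open.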

3. Summary

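Overall, InnoDB file IO covers a lot of ground and may not be fully understood in a short time. When reading the source code, writing some basic unit tests helps. Understanding the file IO module is well worth the effort, because it directly affects how well you understand InnoDB's log system and tablespace system. The file IO module has also changed substantially, especially with the introduction of Direct IO. Many databases use Direct IO: besides InnoDB, Oracle and Taobao's OceanBase use it as well. There is plenty of material on Direct IO online, which you can study alongside InnoDB in MySQL 5.6.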
