Home >Backend Development >PHP Tutorial >A brief analysis of nginx HTTP processing flow

A brief analysis of nginx HTTP processing flow

不言
不言forward
2018-10-16 11:51:313728browse

This article brings you a brief analysis of the nginx HTTP processing process. It has certain reference value. Friends in need can refer to it. I hope it will be helpful to you.

1. Initialize the server

The server command is used to configure the virtual server. We usually configure multiple virtual servers on one machine to listen to different port numbers. , mapped to different file directories; nginx parses user configuration, creates sockets on all ports and starts listening.

nginx parsing configuration files are shared and processed by each module. Each module registers and processes the configuration it cares about, which is implemented through the field ngx_command_t *commands of the module structure ngx_module_t;

For example, ngx_http_module is A core module, its commands field is defined as follows:

struct ngx_command_s {
    ngx_str_t             name;
    ngx_uint_t            type;
    char               *(*set)(ngx_conf_t *cf, ngx_command_t *cmd, void *conf);
};
 
static ngx_command_t  ngx_http_commands[] = {
 
    { ngx_string("http"),
      NGX_MAIN_CONF|NGX_CONF_BLOCK|NGX_CONF_NOARGS,
      ngx_http_block,
     },
};
  • name command name, which can be searched according to the name when parsing the configuration file;

  • type Instruction type, NGX_CONF_NOARGS, this configuration has no parameters, NGX_CONF_BLOCK, this configuration is a configuration block, NGX_MAIN_CONF indicates which bits the configuration can appear in (NGX_MAIN_CONF, NGX_HTTP_SRV_CONF, NGX_HTTP_LOC_CONF);

  • set command processing function pointer ;

You can see that the processing function for parsing http instructions is ngx_http_block, which is implemented as follows:

static char * ngx_http_block(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    //解析main配置
    //解析server配置
    //解析location配置
 
    //初始化HTTP处理流程所需的handler
 
    //初始化listening
    if (ngx_http_optimize_servers(cf, cmcf, cmcf->ports) != NGX_OK) {
        return NGX_CONF_ERROR;
    }
}

ngx_http_optimize_servers method loops through all configuration ports, creates an ngx_listening_t object, and adds it Go to conf->cycle->listening (subsequent operations will traverse this array, create sockets and listen). The main operation of the method is as follows:

A brief analysis of nginx HTTP processing flow

Notice that the handler of ngx_listening_t is set to ngx_http_init_connection. When a socket connection request is received, this handler will be called.

So when will the monitoring be started? Global search keyword cycle->listening can be found. The main method calls ngx_init_cycle, which completes most of the server initialization work, including starting listening (ngx_open_listening_sockets).

Assuming that nginx uses epoll to handle all socket events, when will the listening events be added to epoll? Global search keyword cycle->listening can be found. The ngx_event_core_module module is the core module of event processing. When initializing this module, the ngx_event_process_init function will be executed, in which the listening events will be added to epoll.

static ngx_int_t ngx_event_process_init(ngx_cycle_t *cycle)
{
    ls = cycle->listening.elts;
    for (i = 0; i listening.nelts; i++) {
        //设置读事件处理handler
        rev->handler = ngx_event_accept;
         
        ngx_add_event(rev, NGX_READ_EVENT, 0);
    }
}

Notice that the processing function for receiving the client socket connection request event is ngx_event_accept.

2.HTTP request analysis

2.1 Basic structure

The structure ngx_connection_t stores socket connection related information; nginx creates several ngx_connection_t objects in advance, Stored in the global variable ngx_cycle->free_connections, it is called a connection pool; when a new socket is generated, it will try to obtain an idle connection from the connection pool. If the acquisition fails, the socket will be closed directly.

The instruction worker_connections is used to configure the maximum number of connections in the connection pool. It is configured in the events instruction block and parsed by ngx_event_core_module.

vents {
   use epoll;
   worker_connections  60000;
}

When nginx serves as an HTTP server, the maximum number of clients maxClient=worker_processesworker_connections/2; when nginx serves as a reverse proxy server, the maximum number of clients maxClient=worker_processesworker_connections/4 . Its worker_processes is the number of worker processes configured by the user.

The structure ngx_connection_t is defined as follows:

struct ngx_connection_s {
    //空闲连接池中,data指向下一个连接,形成链表;取出来使用时,data指向请求结构体ngx_http_request_s
    void               *data;
    //读写事件结构体,两个关键字段:handler处理函数、timer定时器
    ngx_event_t        *read;
    ngx_event_t        *write;
 
    ngx_socket_t        fd;   //socket fd
 
    ngx_recv_pt         recv; //socket接收数据函数指针
    ngx_send_pt         send; //socket发送数据函数指针
 
    ngx_buf_t          *buffer; //输入缓冲区
 
    struct sockaddr    *sockaddr; //客户端地址
    socklen_t           socklen;
 
    ngx_listening_t    *listening; //监听的ngx_listening_t对象
 
    struct sockaddr    *local_sockaddr; //本地地址
    socklen_t           local_socklen;
 
    …………
}

The structure ngx_http_request_t stores all the information required for the entire HTTP request processing process. There are many fields. Here is only a brief description:

struct ngx_http_request_s {
 
    ngx_connection_t                 *connection;
 
    //读写事件处理handler
    ngx_http_event_handler_pt         read_event_handler;
    ngx_http_event_handler_pt         write_event_handler;
 
    //请求头缓冲区
    ngx_buf_t                        *header_in;
 
    //解析后的请求头
    ngx_http_headers_in_t             headers_in;
     
    //请求体结构体
    ngx_http_request_body_t          *request_body;
 
    //请求行
    ngx_str_t                         request_line;
    //解析后请求行若干字段
    ngx_uint_t                        method;
    ngx_uint_t                        http_version;
    ngx_str_t                         uri;
    ngx_str_t                         args;
 
    …………
}

Request line and request body parsing is relatively simple. Here we focus on the parsing of request headers. The parsed request header information is stored in the ngx_http_headers_in_t structure.

All HTTP headers are defined in the ngx_http_request.c file and are stored in the ngx_http_headers_in array. Each element of the array is an ngx_http_header_t structure, which mainly contains three fields, the header name and the header parsed field. The offset stored in ngx_http_headers_in_t, the processing function for parsing headers.

ngx_http_header_t  ngx_http_headers_in[] = {
    { ngx_string("Host"), offsetof(ngx_http_headers_in_t, host),
                 ngx_http_process_host },
 
    { ngx_string("Connection"), offsetof(ngx_http_headers_in_t, connection),
                 ngx_http_process_connection },
    …………
}
 
typedef struct {
    ngx_str_t                         name;
    ngx_uint_t                        offset;
    ngx_http_header_handler_pt        handler;
} ngx_http_header_t;

When parsing the request header, search the request header ngx_http_header_t object from the ngx_http_headers_in array, call the processing function handler, and store it in the corresponding field of r->headers_in. Taking parsing of the Connection header as an example, ngx_http_process_connection is implemented as follows:

static ngx_int_t ngx_http_process_connection(ngx_http_request_t *r, ngx_table_elt_t *h, ngx_uint_t offset)
{
    if (ngx_strcasestrn(h->value.data, "close", 5 - 1)) {
        r->headers_in.connection_type = NGX_HTTP_CONNECTION_CLOSE;
 
    } else if (ngx_strcasestrn(h->value.data, "keep-alive", 10 - 1)) {
        r->headers_in.connection_type = NGX_HTTP_CONNECTION_KEEP_ALIVE;
    }
 
    return NGX_OK;
}

The input parameter offset has no effect here. Notice that the second input parameter ngx_table_elt_t stores the key-value pair information of the current request header:

typedef struct {
    ngx_uint_t        hash;  //请求头key的hash值
    ngx_str_t         key;
    ngx_str_t         value;
    u_char           *lowcase_key;  //请求头key转为小写字符串(可以看到HTTP请求头解析时key不区分大小写)
} ngx_table_elt_t;

Think about another problem. When looking for the ngx_http_header_t object corresponding to the request header from the ngx_http_headers_in array, you need to traverse each element. Both require string comparison, which is inefficient. Therefore, nginx converts the ngx_http_headers_in array into a hash table. The key of the hash table is the key of the request header. The method ngx_http_init_headers_in_hash implements the conversion of the array to the hash table. The converted hash table is stored in the cmcf->headers_in_hash field.

A brief analysis of nginx HTTP processing flow

2.2 解析HTTP请求

第1节提到,在创建socket启动监听时,会添加可读事件到epoll,事件处理函数为ngx_event_accept,用于接收socket连接,分配connection连接,并调用ngx_listening_t对象的处理函数(ngx_http_init_connection)。

void ngx_event_accept(ngx_event_t *ev)
{
    s = accept4(lc->fd, (struct sockaddr *) sa, &socklen, SOCK_NONBLOCK);
 
    //客户端socket连接成功时,都需要分配connection连接,如果分配失败则会直接关闭此socket。
    //而每个worker进程连接池的最大连接数目是固定的,当不存在空闲连接时,此worker进程accept的所有socket都会被拒绝;
    //多个worker进程通过竞争执行epoll_wait;而当ngx_accept_disabled大于0时,会直接放弃此次竞争,同时ngx_accept_disabled减1。
    //以此实现,当worker进程的空闲连接过少时,减少其竞争epoll_wait次数
    ngx_accept_disabled = ngx_cycle->connection_n / 8 - ngx_cycle->free_connection_n;
 
    c = ngx_get_connection(s, ev->log);
 
    ls->handler(c);
}

socket连接成功后,nginx会等待客户端发送HTTP请求,默认会有60秒的超时时间,即60秒内没有接收到客户端请求时,断开此连接,打印错误日志。函数ngx_http_init_connection用于设置读事件处理函数,以及超时定时器。

void ngx_http_init_connection(ngx_connection_t *c)
{
    c->read = ngx_http_wait_request_handler;
    c->write->handler = ngx_http_empty_handler;
 
    ngx_add_timer(rev, c->listening->post_accept_timeout);
}

全局搜索post_accept_timeout字段,可以查找到设置此超时时间的配置指令,client_header_timeout,其可以在http、server指令块中配置。

函数ngx_http_wait_request_handler为解析HTTP请求的入口函数,实现如下:

static void ngx_http_wait_request_handler(ngx_event_t *rev)
{
    //读事件已经超时
    if (rev->timedout) {
        ngx_log_error(NGX_LOG_INFO, c->log, NGX_ETIMEDOUT, "client timed out");
        ngx_http_close_connection(c);
        return;
    }
 
    size = cscf->client_header_buffer_size;   //client_header_buffer_size指令用于配置接收请求头缓冲区大小
    b = c->buffer;
 
    n = c->recv(c, b->last, size);
 
    //创建请求对象ngx_http_request_t,HTTP请求整个处理过程都有用;
    c->data = ngx_http_create_request(c);
 
    rev->handler = ngx_http_process_request_line; //设置读事件处理函数(此次请求行可能没有读取完)
    ngx_http_process_request_line(rev);
}

函数ngx_http_create_request创建并初始化ngx_http_request_t对象,注意这赋值语句r->header_in =c->buffer。

解析请求行与请求头的代码较为繁琐,终点在于读取socket数据,解析字符串,这里不做详述。HTTP请求解析过程主要函数调用如下图所示:

A brief analysis of nginx HTTP processing flow

注意,解析完成请求行与请求头,nginx就开始处理HTTP请求,并没有等到解析完请求体再处理。处理请求入口为ngx_http_process_request。

3.处理HTTP请求

3.1 HTTP请求处理的11个阶段

nginx将HTTP请求处理流程分为11个阶段,绝大多数HTTP模块都会将自己的handler添加到某个阶段(将handler添加到全局唯一的数组phases中),注意其中有4个阶段不能添加自定义handler,nginx处理HTTP请求时会挨个调用每个阶段的handler;

typedef enum {
    NGX_HTTP_POST_READ_PHASE = 0, //第一个阶段,目前只有realip模块会注册handler,但是该模块默认不会运行(nginx作为代理服务器时有用,后端以此获取客户端原始ip)
  
    NGX_HTTP_SERVER_REWRITE_PHASE,  //server块中配置了rewrite指令,重写url
  
    NGX_HTTP_FIND_CONFIG_PHASE,   //查找匹配的location配置;不能自定义handler;
    NGX_HTTP_REWRITE_PHASE,       //location块中配置了rewrite指令,重写url
    NGX_HTTP_POST_REWRITE_PHASE,  //检查是否发生了url重写,如果有,重新回到FIND_CONFIG阶段;不能自定义handler;
  
    NGX_HTTP_PREACCESS_PHASE,     //访问控制,比如限流模块会注册handler到此阶段
  
    NGX_HTTP_ACCESS_PHASE,        //访问权限控制,比如基于ip黑白名单的权限控制,基于用户名密码的权限控制等
    NGX_HTTP_POST_ACCESS_PHASE,   //根据访问权限控制阶段做相应处理;不能自定义handler;
  
    NGX_HTTP_TRY_FILES_PHASE,     //只有配置了try_files指令,才会有此阶段;不能自定义handler;
    NGX_HTTP_CONTENT_PHASE,       //内容产生阶段,返回响应给客户端
  
    NGX_HTTP_LOG_PHASE            //日志记录
} ngx_http_phases;

nginx使用结构体ngx_module_s表示一个模块,其中字段ctx,是一个指向模块上下文结构体的指针(上下文结构体的字段都是一些函数指针);nginx的HTTP模块上下文结构体大多都有字段postconfiguration,负责注册本模块的handler到某个处理阶段。11个阶段在解析完成http配置块指令后初始化。

static char * ngx_http_block(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    //解析http配置块
 
    //初始化11个阶段的phases数组,注意多个模块可能注册到同一个阶段,因此phases是一个二维数组
    if (ngx_http_init_phases(cf, cmcf) != NGX_OK) {
        return NGX_CONF_ERROR;
    }
 
    //遍历索引HTTP模块,注册handler
    for (m = 0; ngx_modules[m]; m++) {
        if (ngx_modules[m]->type != NGX_HTTP_MODULE) {
            continue;
        }
 
        module = ngx_modules[m]->ctx;
 
        if (module->postconfiguration) {
            if (module->postconfiguration(cf) != NGX_OK) {
                return NGX_CONF_ERROR;
            }
        }
    }
 
    //将二维数组转换为一维数组,从而遍历执行数组所有handler
    if (ngx_http_init_phase_handlers(cf, cmcf) != NGX_OK) {
        return NGX_CONF_ERROR;
    }
}

以限流模块ngx_http_limit_req_module模块为例,postconfiguration方法简单实现如下:

static ngx_int_t ngx_http_limit_req_init(ngx_conf_t *cf)
{
    h = ngx_array_push(&cmcf->phases[NGX_HTTP_PREACCESS_PHASE].handlers);
     
    *h = ngx_http_limit_req_handler;  //ngx_http_limit_req_module模块的限流方法;nginx处理HTTP请求时,都会调用此方法判断应该继续执行还是拒绝请求
  
    return NGX_OK;
}

GDB调试,断点到ngx_http_block方法执行所有HTTP模块注册handler之后,打印phases数组

p cmcf->phases[*].handlers
p *(ngx_http_handler_pt*)cmcf->phases[*].handlers.elts

11个阶段注册的handler如下图所示:

A brief analysis of nginx HTTP processing flow

3.2 11个阶段初始化

上面提到HTTP的11个处理阶段handler存储在phases数组,但由于多个模块可能注册handler到同一个阶段,使得phases是一个二维数组,因此需要转换为一维数组,转换后存储在cmcf->phase_engine字段,phase_engine的类型为ngx_http_phase_engine_t,定义如下:

typedef struct {
    ngx_http_phase_handler_t  *handlers;   //一维数组,存储所有handler
    ngx_uint_t                 server_rewrite_index;  //记录NGX_HTTP_SERVER_REWRITE_PHASE阶段handler的索引值
    ngx_uint_t                 location_rewrite_index; //记录NGX_HTTP_REWRITE_PHASE阶段handler的索引值
} ngx_http_phase_engine_t;
 
struct ngx_http_phase_handler_t {
    ngx_http_phase_handler_pt  checker;  //执行handler之前的校验函数
    ngx_http_handler_pt        handler;
    ngx_uint_t                 next;   //下一个待执行handler的索引(通过next实现handler跳转执行)
};
 
//cheker函数指针类型定义
typedef ngx_int_t (*ngx_http_phase_handler_pt)(ngx_http_request_t *r, ngx_http_phase_handler_t *ph);
//handler函数指针类型定义
typedef ngx_int_t (*ngx_http_handler_pt)(ngx_http_request_t *r);

数组转换函数ngx_http_init_phase_handlers实现如下:

static ngx_int_t ngx_http_init_phase_handlers(ngx_conf_t *cf, ngx_http_core_main_conf_t *cmcf)
{
    use_rewrite = cmcf->phases[NGX_HTTP_REWRITE_PHASE].handlers.nelts ? 1 : 0;
    use_access = cmcf->phases[NGX_HTTP_ACCESS_PHASE].handlers.nelts ? 1 : 0;
     
    n = use_rewrite + use_access + cmcf->try_files + 1 /* find config phase */; //至少有4个阶段,这4个阶段是上面说的不能注册handler的4个阶段
     
    //计算handler数目,分配空间
    for (i = 0; i phases[i].handlers.nelts;
    }
    ph = ngx_pcalloc(cf->pool, n * sizeof(ngx_http_phase_handler_t) + sizeof(void *));
 
    //遍历二维数组
    for (i = 0; i phases[i].handlers.elts;
 
        switch (i) {
 
        case NGX_HTTP_SERVER_REWRITE_PHASE:
            if (cmcf->phase_engine.server_rewrite_index == (ngx_uint_t) -1) {
                cmcf->phase_engine.server_rewrite_index = n;   //记录NGX_HTTP_SERVER_REWRITE_PHASE阶段handler的索引值
            }
            checker = ngx_http_core_rewrite_phase;
            break;
 
        case NGX_HTTP_FIND_CONFIG_PHASE:
            find_config_index = n;   //记录NGX_HTTP_FIND_CONFIG_PHASE阶段的索引,NGX_HTTP_POST_REWRITE_PHASE阶段可能会跳转回此阶段
            ph->checker = ngx_http_core_find_config_phase;
            n++;
            ph++;
            continue;   //进入下一个阶段NGX_HTTP_REWRITE_PHASE
  
        case NGX_HTTP_REWRITE_PHASE:
            if (cmcf->phase_engine.location_rewrite_index == (ngx_uint_t) -1) {
                cmcf->phase_engine.location_rewrite_index = n;   //记录NGX_HTTP_REWRITE_PHASE阶段handler的索引值
            }
            checker = ngx_http_core_rewrite_phase; 
            break;
 
        case NGX_HTTP_POST_REWRITE_PHASE:
            if (use_rewrite) {
                ph->checker = ngx_http_core_post_rewrite_phase;
                ph->next = find_config_index;
                n++;
                ph++;
            }
            continue;  //进入下一个阶段NGX_HTTP_ACCESS_PHASE
 
        case NGX_HTTP_ACCESS_PHASE:
            checker = ngx_http_core_access_phase;
            n++;
            break;
 
        case NGX_HTTP_POST_ACCESS_PHASE:
            if (use_access) {
                ph->checker = ngx_http_core_post_access_phase;
                ph->next = n;
                ph++;
            }
            continue;  //进入下一个阶段
 
        case NGX_HTTP_TRY_FILES_PHASE:
            if (cmcf->try_files) {
                ph->checker = ngx_http_core_try_files_phase;
                n++;
                ph++;
            }
            continue;
 
        case NGX_HTTP_CONTENT_PHASE:
            checker = ngx_http_core_content_phase;
            break;
 
        default:
            checker = ngx_http_core_generic_phase;
        }
 
        //n为下一个阶段第一个handler的索引
        n += cmcf->phases[i].handlers.nelts;
 
        //遍历当前阶段的所有handler
        for (j = cmcf->phases[i].handlers.nelts - 1; j >=0; j--) {
            ph->checker = checker;
            ph->handler = h[j];
            ph->next = n;
            ph++;
        }
    }
}

GDB打印出转换后的数组如下图所示,第一列是cheker字段,第二列是handler字段,箭头表示next跳转;图中有个返回的箭头,即NGX_HTTP_POST_REWRITE_PHASE阶段可能返回到NGX_HTTP_FIND_CONFIG_PHASE;原因在于只要NGX_HTTP_REWRITE_PHASE阶段产生了url重写,就需要重新查找匹配location。

A brief analysis of nginx HTTP processing flow

3.3 处理HTTP请求

2.2节提到HTTP请求的处理入口函数是ngx_http_process_request,其主要调用ngx_http_core_run_phases实现11个阶段的执行流程;

ngx_http_core_run_phases遍历预先设置好的cmcf->phase_engine.handlers数组,调用其checker函数,逻辑如下:

void ngx_http_core_run_phases(ngx_http_request_t *r)
{
    ph = cmcf->phase_engine.handlers;
 
    //phase_handler初始为0,表示待处理handler的索引;cheker内部会根据ph->next字段修改phase_handler
    while (ph[r->phase_handler].checker) {
 
        rc = ph[r->phase_handler].checker(r, &ph[r->phase_handler]);
 
        if (rc == NGX_OK) {
            return;
        }
    }
}

checker内部就是调用handler,并设置下一步要执行handler的索引;比如说ngx_http_core_generic_phase实现如下:

ngx_int_t ngx_http_core_generic_phase(ngx_http_request_t *r, ngx_http_phase_handler_t *ph)
{
    ngx_log_debug1(NGX_LOG_DEBUG_HTTP, r->connection->log, 0, "rewrite phase: %ui", r->phase_handler);
    rc = ph->handler(r);
    if (rc == NGX_OK) {
        r->phase_handler = ph->next;
        return NGX_AGAIN;
    }
}

3.4 内容产生阶段

内容产生阶段NGX_HTTP_CONTENT_PHASE是HTTP请求处理的第10个阶段,一般情况有3个模块注册handler到此阶段:ngx_http_static_module、ngx_http_autoindex_module和ngx_http_index_module。

但是当我们配置了proxy_pass和fastcgi_pass时,情况会有所不同;

使用proxy_pass配置上游时,ngx_http_proxy_module模块会设置其处理函数到配置类conf;使用fastcgi_pass配置时,ngx_http_fastcgi_module会设置其处理函数到配置类conf。例如:

static char * ngx_http_fastcgi_pass(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    ngx_http_core_loc_conf_t   *clcf;
    clcf = ngx_http_conf_get_module_loc_conf(cf, ngx_http_core_module);
 
    clcf->handler = ngx_http_fastcgi_handler;
}

阶段NGX_HTTP_FIND_CONFIG_PHASE查找匹配的location,并获取此ngx_http_core_loc_conf_t对象,将其handler赋值给ngx_http_request_t对象的content_handler字段(内容产生处理函数)。

而在执行内容产生阶段的checker函数时,会执行content_handler指向的函数;查看ngx_http_core_content_phase函数实现(内容产生阶段的checker函数):

ngx_int_t ngx_http_core_content_phase(ngx_http_request_t *r,
    ngx_http_phase_handler_t *ph)
{
    if (r->content_handler) {  //如果请求对象的content_handler字段不为空,则调用
        r->write_event_handler = ngx_http_request_empty_handler;
        ngx_http_finalize_request(r, r->content_handler(r));
        return NGX_OK;
    }
 
    ngx_log_debug1(NGX_LOG_DEBUG_HTTP, r->connection->log, 0, "content phase: %ui", r->phase_handler);
 
    rc = ph->handler(r);  //否则执行内容产生阶段handler
}

总结

nginx处理HTTP请求的流程较为复杂,因此本文只是简单提供了一条线索:分析了nginx服务器启动监听的过程,HTTP请求的解析过程,11个阶段的初始化与调用过程。至于HTTP解析处理的详细流程,还需要读者去探索。

The above is the detailed content of A brief analysis of nginx HTTP processing flow. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:segmentfault.com. If there is any infringement, please contact admin@php.cn delete