
Detailed explanation of Nginx load balancing and reverse proxy extension functions

小云云 (original)
2018-03-27 14:25:44

This article introduces features of NGINX Plus, covering changes across releases such as R5, R6, R7, and R9, and more advanced uses of Nginx as a reverse proxy and load balancer. Topics include: HTTP load balancing, HTTP long connections (keepalives), TCP and UDP load balancing, upstream connection limits, the Least Time balancing algorithm, session persistence, active health checks, DNS re-resolution, access control, client connection limits, client bandwidth limits, unbuffered file uploads, SSL/TLS optimization, cache optimization, the API, and configuration best practices.

What is NGINX Plus?

As the name suggests, NGINX Plus is an enhanced, extended version of Nginx. Open source Nginx is free, whereas NGINX Plus is a paid commercial product. NGINX Plus can serve as a load balancer, a web server, and a content cache. On top of the features of open source Nginx, it adds many proprietary features aimed at production environments, including session persistence, live configuration updates through an API, and active health checks.

NGINX Plus version update

NGINX Plus R5 and later support load balancing of TCP-based applications (such as MySQL), so load balancing is no longer limited to HTTP, which greatly widens the scope of Nginx as a load balancer. R6 expanded TCP load balancing considerably, adding health checks, dynamic configuration updates, SSL termination, and more. By R7, TCP load balancing was largely on par with HTTP load balancing. R9 added UDP support. With these updates, NGINX Plus has moved well beyond web applications and become a more general-purpose load balancer. Protocols sit at the foundation: the more protocols supported, the wider the range of applications. From HTTP/SMTP at first, to TCP, and then to UDP, NGINX Plus has grown steadily more capable.

Both open source Nginx and NGINX Plus support load balancing of HTTP, TCP, and UDP applications. NGINX Plus, however, adds paid enterprise-grade features, including session persistence, active health checks, and dynamic configuration updates.

HTTP Load Balancing

NGINX Plus has made many functional optimizations for HTTP load balancing, such as HTTP upgrade, long connection optimization, content compression and response caching. The implementation of HTTP load balancing in NGINX Plus is also very simple:

http {
    upstream my_upstream {
        server server1.example.com;
        server server2.example.com;
    }
    server {
        listen 80;
        location / {
            proxy_set_header Host $host;
            proxy_pass http://my_upstream;
        }
    }
}

The proxy_set_header directive sets the Host header, and proxy_pass forwards the request to the upstream group my_upstream.

HTTP long connection

HTTP long connections (HTTP keepalives) here refer to the persistent connections that NGINX Plus keeps open to upstream servers. For a persistent connection between the client and NGINX Plus, the client needs to use HTTP/1.1 or HTTP/2.

HTTP uses the underlying TCP protocol to send requests and receive responses. HTTP/1.1 and HTTP/2 support persistent or reused TCP connections, which avoids the overhead of repeatedly creating and tearing down TCP connections.

[Figure: HTTP keepalive connections between the client and NGINX Plus]

For keepalive connections, NGINX acts as a full reverse proxy, without ambiguity: it manages all keepalive connections from clients to Nginx, and it also manages the keepalive connections from Nginx to the upstream servers. The two sets are completely independent.

[Figure: keepalive connections managed by Nginx]

NGINX keeps idle connections to the upstream servers in a cache instead of closing them right away. When a request arrives, NGINX first reuses a connection from this cache of open idle connections rather than creating a new one; only if the cache is empty does it open a new connection. This keeps the number of connections to the upstream at the minimum necessary, which reduces latency between NGINX and the upstream servers and reduces the use of ephemeral ports, so NGINX can handle high concurrency. Combined with other load balancing techniques, this is sometimes called connection pooling or connection reuse.

To configure the idle keepalive connection cache, you need three directives: proxy_http_version, proxy_set_header, and keepalive.

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;          # upstream keepalive requires HTTP/1.1
        proxy_set_header Connection "";  # clear the header so "Connection: close" is not sent upstream
    }
}

upstream backend {
    server webserver1;
    server webserver2;
    # keep at most 20 idle connections to the upstream servers in each worker's cache
    # (20 is a connection count, not a timeout)
    keepalive 20;
}

Load balancing of TCP and UDP

Beyond HTTP, NGINX Plus can directly load balance applications based on TCP and UDP: TCP-based applications such as MySQL, and UDP-based applications such as DNS and RADIUS. For TCP, NGINX Plus accepts the client's TCP connection and then opens its own TCP connection to the upstream server.

stream {
    upstream my_upstream {
        server server1.example.com:1234;
        server server2.example.com:2345;
    }
    server {
        listen 1123;             # TCP by default; write "listen 1123 udp;" for UDP
        proxy_pass my_upstream;  # note: no http:// prefix here
    }
}

TCP support first appeared in NGINX Plus R5, and R6 and R7 mainly refined it. By R7, TCP load balancing was powerful enough to rival HTTP load balancing, and R9 added UDP support. This is only a first impression; TCP load balancing is covered in more detail later.
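
Since DNS is named above as a typical UDP-based application, a minimal sketch of UDP load balancing for DNS might look like the following. The upstream name dns_servers and the backend addresses are hypothetical, and the timeout value is only an illustration.

```nginx
stream {
    upstream dns_servers {
        server 192.0.2.10:53;   # hypothetical DNS backend
        server 192.0.2.11:53;   # hypothetical DNS backend
    }

    server {
        listen 53 udp;          # the udp parameter makes this a UDP listener
        proxy_pass dns_servers;
        proxy_responses 1;      # expect one response datagram per client datagram
        proxy_timeout 1s;       # assumed value: end idle sessions quickly
    }
}
```

Setting proxy_responses tells NGINX how many datagrams to expect in reply, so a session can be released as soon as the single DNS answer has been relayed.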

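Upstream connection limits

You can also limit the number of connections used for load balancing. Connections here means the HTTP/TCP/UDP connections (sessions, in the case of UDP) that NGINX Plus opens to upstream servers. With connection limiting, once the number of HTTP/TCP connections or UDP sessions to an upstream server exceeds a set value, NGINX Plus stops creating new connections or sessions to it. Excess client requests can either wait in a queue or be left unserved. This is done with the max_conns parameter and the queue directive: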

upstream backend {
    zone backends 64k;
    queue 750 timeout=30s;
    server webserver1 max_conns=250;
    server webserver2 max_conns=150;
}

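The server directives say that webserver1 handles at most 250 connections and webserver2 at most 150; anything beyond that can wait in the queue. Both the number of queued connections and the waiting time are limited. As soon as the connection counts of webserver1 and webserver2 drop below their respective maximums, connections waiting in the queue are passed on to them. queue 750 timeout=30s means that up to 750 connections can wait in the queue in total, each for at most 30s.

Limiting connections is very useful: it gives clients a sustainable, predictable service, and no server goes down because it is overloaded. The load a server can carry can generally be measured by testing, so using that upper bound as the max_conns value keeps the server relatively safe.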

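Least Time balancing algorithm

NGINX Plus R6 added a new balancing algorithm, Least Time, which also takes response time into account and can be seen as an extension of Least Connections.

The algorithm considers both the current number of connections and the average response time of each node in the pool. The goal is for the current request to go to the server that is responding faster and has fewer connections right now, rather than one that is slower and busier.

When the server nodes in the pool have clearly different response latencies, this algorithm performs better than the others (round-robin, ip-hash, least connections). A typical scenario is two data centers in different regions: the local data center has much lower latency than the remote one, so looking only at current connection counts is not enough, and response latency has to be taken into account as well. Least Time tends to prefer the local data center. This is only a preference, though, and does not replace Nginx's basic failover: however fast the local data center responds, if it goes down, NGINX Plus can switch to the remote data center right away.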


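Least Time can be measured in two ways: the time from sending the request until the response header is received from the upstream server, or the time from sending the request until the full response body is received. These are represented by header_time and response_time respectively.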
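
As a minimal sketch, the algorithm is enabled with the least_time directive inside the upstream block. In NGINX Plus the two measurement modes are selected with the parameters header and last_byte; the server names below reuse the earlier examples and are placeholders.

```nginx
upstream backend {
    zone backends 64k;
    least_time header;    # measure time to the response header; use last_byte for the full response
    server webserver1;
    server webserver2;
}
```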

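Session persistence

Besides using the ip-hash balancing algorithm, session persistence can be achieved in a more general way, which is implemented in NGINX Plus.

NGINX Plus can identify user sessions, so it can tell clients apart and send all requests from the same client to the same upstream server. This is very useful when the application keeps user state, because it prevents the load balancer from sending a request to a different server according to some algorithm. It is also useful for clustered servers that share user information.

Session persistence requires that every request from the same client goes to the same server, while load balancing requires an algorithm to pick the next server from the pool. Can these two conflicting approaches coexist? They can. NGINX Plus decides between them as follows:

  • If the request matches a session persistence rule, the upstream server is chosen according to that rule;

  • If nothing matches, or the matched server is unavailable, the upstream server is chosen by the load balancing algorithm.

To provide session persistence, NGINX Plus offers several rules: sticky cookie, sticky learn, and sticky route.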

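sticky cookie rule

With the sticky cookie rule, when a client's first request is routed to an upstream server and the response comes back from it, NGINX Plus adds a session cookie to the response that identifies that upstream server. When later requests arrive, NGINX Plus reads the cookie, works out which server it refers to, and sends the request to that same server.

Use the sticky cookie directive, configured as follows: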

upstream backend {
    server webserver1;
    server webserver2;

    sticky cookie srv_id expires=1h domain=.example.com path=/;
}

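The cookie is named srv_id and is used to remember which server was chosen; it expires after 1h, its domain is .example.com, and its path is /.
In the first response, NGINX Plus inserts a cookie named srv_id that records which upstream server handled the first request. Later requests carry this cookie, NGINX Plus checks it, and sends them to the same server. This keeps the session persistent.

sticky route rule

This is similar to the sticky cookie rule; only the way the upstream server is remembered differs.
When the client makes its first request, the server that receives it assigns the client a route. Every later request from this client must carry the route information, either in a cookie or in the URI. It is then compared with the route parameter of the server directives to decide which server to pick. If the designated server cannot handle the request, the load balancing algorithm picks the next server.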

map $cookie_jsessionid $route_cookie {
    ~.+\.(?P<route>\w+)$ $route;
}

map $request_uri $route_uri {
    ~jsessionid=.+\.(?P<route>\w+)$ $route;
}

upstream backend {
    server backend1.example.com route=a;
    server backend2.example.com route=b;
    # select first non-empty variable; it should contain either 'a' or 'b'
    sticky route $route_cookie $route_uri;
}

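Here the route is taken from the JSESSIONID cookie: if it contains a, server backend1 is chosen; if it contains b, backend2 is chosen. If it contains neither, the same selection is attempted on $request_uri, and so on.

Whichever method is used for session persistence, if the selected server is unavailable, the next server in the list is chosen by the load balancing algorithm (such as round-robin) to handle the request.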
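
The sticky learn rule is named above without an example. A minimal sketch, assuming the application itself sets a session cookie, could look like this; the cookie name examplecookie and the zone name client_sessions are placeholders. NGINX Plus learns the session from the upstream's response cookie (create) and looks it up in later client requests (lookup).

```nginx
upstream backend {
    server backend1.example.com;
    server backend2.example.com;

    sticky learn
        create=$upstream_cookie_examplecookie   # session is created from the cookie set by the upstream
        lookup=$cookie_examplecookie            # later requests are matched on the client's cookie
        zone=client_sessions:1m                 # shared memory zone that stores the learned sessions
        timeout=1h;                             # assumed value: forget sessions idle for an hour
}
```

Unlike sticky cookie, this variant does not add a cookie of its own, so it suits applications that already manage a session cookie.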

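Active health checks

As mentioned earlier, Nginx has two main jobs: scaling out, by adding more servers to handle higher concurrency, and detecting failed servers so they can be taken out of rotation promptly. How a failed server is defined therefore matters a great deal, and that is the subject of this section: active health checks. This is a paid feature available only in NGINX Plus.

Open source NGINX provides simple health checking and automatic failover, but its definition of a failed upstream server is very basic. NGINX Plus provides customizable, comprehensive criteria for this, and it can also ease new server nodes into the cluster gradually. This lets NGINX Plus detect a wider variety of server errors, which significantly improves the reliability of HTTP/TCP/UDP applications.
The directives used here include health_check and match: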

upstream my_upstream {
    zone my_upstream 64k;
    server server1.example.com slow_start=30s;
}

server {
    # ...
    location /health {
        internal;
        health_check interval=5s uri=/test.php match=statusok;
        proxy_set_header HOST www.example.com;
        proxy_pass http://my_upstream;
    }
}

match statusok {
    # rules applied to the response from /test.php
    status 200;
    header Content-Type = text/html;
    body ~ "Server[0-9]+ is alive";
}

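In health_check, interval=5s means a check runs every 5s, and uri=/test.php means the check is made against /test.php. NGINX Plus issues the request to this URI automatically. The URI is up to you, and the script behind it carries out the actual checks, for example whether MySQL or Redis is working, and responds accordingly. The match block then applies rules to the response from /test.php, which can be matched on status, header, and body. If all checks pass, the server is healthy and is marked active. If any one match fails, for example Content-Type = text/json or status = 201, the check fails and the server is considered unhealthy and marked inactive.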

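DNS re-resolution

When NGINX Plus starts, it resolves DNS names and caches the resolved names and IP addresses permanently. In some situations the names need to be resolved again, which can be done with the following directives: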

resolver 127.0.0.11 valid=10s;

upstream service1 {
    zone service1 64k;
    server www.example.com service=http resolve;
}

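127.0.0.11 is the address of the DNS server (it is the default address of Docker's embedded DNS server). In this example, NGINX Plus asks the DNS server to re-resolve the name every 10s.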

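Access control

NGINX Plus Release 7 mainly improved the security of TCP load balancing, for example with access controls and DDoS protection.
You can now allow or deny access to TCP servers that are reverse proxied or load balanced, simply by configuring an IP address or an IP range: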

server {
    # ...
    proxy_set_header Host www.example.cn;
    proxy_pass http://test;
    deny 72.46.166.10;
    deny 73.46.156.0/24;
    allow all;
}

The first deny directive blocks a single IP address and the second blocks an IP range; everything else is allowed. Denied IP addresses receive a 403 error.

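Client connection limits

Besides limiting connections to upstream servers, you can also limit client connections, a feature implemented in NGINX Plus R7. You can limit the number of connections a client makes to TCP applications proxied by NGINX Plus, so that the application is not flooded with TCP requests. Some parts of your application may be slower than others. For example, a request to one part of the application may trigger a large number of MySQL queries or fork a large number of worker processes. An attacker can exploit this by generating thousands of requests, overloading your server until it goes down.

With connection limiting, you can use the limit_conn my_limit_conn directive to cap the number of connections a single client (IP address) can open, which minimizes the risk of such attacks.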

stream {
    limit_conn_zone $binary_remote_addr zone=my_limit_conn:10m;
    # ...
    server {
        limit_conn my_limit_conn 1;
        # ...
    }
}

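This directive limits each IP address to one connection at a time.

Client bandwidth limits

R7 also added the ability to limit the maximum upload and download bandwidth of each client connection (bandwidth limiting).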

server {
    # ...
    proxy_download_rate 100k;
    proxy_upload_rate  50k;
}

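With this configuration, a client can download at no more than 100 kbytes/s and upload at no more than 50 kbytes/s per connection. Because a client can open multiple connections, limiting its total upload/download speed also requires limiting the number of connections per client.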
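
A minimal sketch of combining the two limits in a stream block is shown here. The zone name per_client, the upstream name tcp_backend, the port, and the limit of 2 connections are all assumptions chosen for illustration.

```nginx
stream {
    # shared memory zone keyed by client address (hypothetical zone name)
    limit_conn_zone $binary_remote_addr zone=per_client:10m;

    upstream tcp_backend {
        server server1.example.com:1234;
    }

    server {
        listen 1123;
        limit_conn per_client 2;     # at most 2 concurrent connections per client IP
        proxy_download_rate 100k;    # download cap for each connection
        proxy_upload_rate   50k;     # upload cap for each connection
        proxy_pass tcp_backend;
    }
}
```

Under these assumed values, one client IP can download at no more than 2 x 100 = 200 kbytes/s in total and upload at no more than 2 x 50 = 100 kbytes/s.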

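Unbuffered file uploads

This feature was added in R6. In R6 and later you can use unbuffered uploads, which means NGINX Plus can pass through large HTTP requests such as uploads. With unbuffered uploads, the data is forwarded as soon as it arrives, instead of buffering the entire upload first and only then forwarding it to the upstream server.

By default, Nginx puts received upload data into a buffer, to avoid tying up the resources of backend scripts based on worker processes. For event-driven backends such as Node.js, this buffering is rarely necessary. The change improves server responsiveness for large file uploads, because the application can respond as soon as it receives data, which makes upload progress bars real-time and accurate. It also reduces disk I/O.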
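
As a minimal sketch, unbuffered uploads are switched on with the proxy_request_buffering directive. The /upload location and the upstream name backend are placeholders.

```nginx
server {
    listen 80;

    location /upload {
        proxy_http_version 1.1;          # needed so chunked request bodies are streamed as well
        proxy_request_buffering off;     # pass the request body upstream as it arrives
        proxy_pass http://backend;
    }
}
```

One trade-off to keep in mind: once part of the body has been sent, NGINX cannot retry the request on another upstream server.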

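SSL/TLS optimization

In R6, NGINX Plus can present a client certificate when communicating with upstream HTTPS or uwsgi servers. This greatly improves security, especially when communicating with secured services over an unprotected network. R6 supports SSL/TLS client authentication for IMAP, POP3, and SMTP.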
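
For the HTTPS case, a minimal sketch of presenting a client certificate to the upstream uses proxy_ssl_certificate and proxy_ssl_certificate_key. The file paths, the location, and the upstream name backend are hypothetical.

```nginx
location /secure/ {
    proxy_pass https://backend;

    # certificate and key that NGINX presents to the upstream (hypothetical paths)
    proxy_ssl_certificate     /etc/nginx/ssl/client.crt;
    proxy_ssl_certificate_key /etc/nginx/ssl/client.key;
}
```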

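Cache optimization

The proxy_cache directive now supports variables. This simple improvement means you can define several disk-based caches and choose among them freely based on request data. It is very useful when you plan to build a very large content cache and spread it across different disks.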
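
One way to sketch this is with a map that picks the cache zone per request. The mount points, zone names, the /video/ prefix, and the upstream name backend are all assumptions for illustration; the fragment belongs in the http context.

```nginx
# two caches on different disks (hypothetical mount points)
proxy_cache_path /mnt/disk1/cache keys_zone=cache_disk1:100m;
proxy_cache_path /mnt/disk2/cache keys_zone=cache_disk2:100m;

# choose a cache zone from the request URI
map $request_uri $cache_zone {
    ~^/video/  cache_disk2;
    default    cache_disk1;
}

server {
    listen 80;
    location / {
        proxy_cache $cache_zone;    # the zone name comes from a variable
        proxy_pass http://backend;
    }
}
```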

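API

Some directives of the upstream module can be changed not only by editing the configuration by hand but also through a RESTful API, with changes taking effect immediately. With this feature, a number of NGINX Plus settings can be changed through the API, which greatly improves usability. This is also a paid feature: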

upstream backend {
    zone backends 64k;
    server 10.10.10.2:220 max_conns=250;
    server 10.10.10.4:220 max_conns=150;
}

server {
    listen 80;
    server_name www.example.org;

    location /api {
        api write=on;
    }
}

With the API, you can use curl to change the configuration dynamically, for example a POST request to add a node to the cluster:

$ curl -iX POST -d '{"server":"192.168.78.66:80","weight":"200","max_conns":"150"}' http://localhost:80/api/1/http/upstreams/backend/servers/

This is equivalent to adding the following configuration:

upstream backend {
    zone backends 64k;
    server 10.10.10.2:220 max_conns=250;
    server 10.10.10.4:220 max_conns=150;
    # added through the API
    server 192.168.78.66:80 weight=200 max_conns=150;
}

To modify a node's configuration, use the node's natural order in the pool (starting from 0) as its unique ID, and then operate on it with the PATCH or DELETE method:

$ curl -iX PATCH -d '{"server":"192.168.78.55:80","weight":"500","max_conns":"350"}' http://localhost:80/api/1/http/upstreams/backend/servers/2

This command changes the third server in the pool above, server 192.168.78.66:80 weight=200 max_conns=150;, to:

server 192.168.78.55:80 weight=500 max_conns=350;

To return information about all nodes, use:

$ curl -s http://localhost:80/api/1/http/upstreams/backend/servers/

The response is a JSON string.

[
    {
        "backup": false,
        "down": false,
        "fail_timeout": "10s",
        "id": 0,
        "max_conns": 250,
        "max_fails": 1,
        "route": "",
        "server": "10.10.10.2:220",
        "slow_start": "0s",
        "weight": 1
    },
    {
        "backup": false,
        "down": false,
        "fail_timeout": "10s",
        "id": 1,
        "max_conns": 150,
        "max_fails": 1,
        "route": "",
        "server": "10.10.10.4:220",
        "slow_start": "0s",
        "weight": 1
    },
    {
        "backup": false,
        "down": false,
        "fail_timeout": "10s",
        "id": 2,
        "max_conns": 150,
        "max_fails": 1,
        "route": "",
        "server": "192.168.78.66:80",
        "slow_start": "0s",
        "weight": 200
    }
]

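Configuration best practices

It is good practice to create separate directories and files for the configuration of different applications and then combine them with the include directive. The standard NGINX Plus layout puts each application's configuration files under the conf.d directory: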

http {
    include /etc/nginx/conf.d/*.conf;
}
stream {
    include /etc/nginx/stream.d/*.conf;
}

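The http and stream contexts each use a different directory: everything under http configures HTTP traffic, and everything under stream configures TCP/UDP traffic. There is no single standard; what matters is that the layout is easy for developers to recognize and modify.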
