Small and medium-sized website structure analysis and optimization
First look at the website architecture diagram:
The architecture above is widely used by medium and large websites. This article analyzes the mainstream technologies and solutions used at each layer, to help readers who are new to website operations and maintenance understand website architecture and form their own set of architectural concepts.
The first layer: CDN
In China, the network is split mainly between China Telecom in the south and China Unicom in the north, which causes significant latency for cross-region access. For a website with a certain volume of traffic, adding a CDN (Content Delivery Network) layer effectively mitigates this, and it is also the best option for accelerating a website. A CDN caches website pages on nodes distributed across the country, and users fetch data from the nearest data center, which greatly shortens the network path. Building your own CDN is not recommended. Why? Put bluntly, operations should not try to take on everything itself. Deploying a CDN architecture is not complicated, but many factors affect how well it works, and later management and maintenance are complex. Achieving the expected results is hard, so it is a thankless job, and in the end the boss may still conclude that you are not capable. It is better to use a company that specializes in CDN: the cost is modest, it can withstand traffic attacks, the results are good, and operations has far less to worry about.
The second layer: Reverse proxy (web page cache)
If the CDN has not cached the requested data, it sends the request to this layer. Caching is configured on the (local) proxy server, which checks its local cache for the data the CDN asked for. If the data is there, it is returned directly to the CDN. If not, the request goes to the back-end load balancer, which forwards it to a WEB server; the WEB server returns the data to the proxy server, and the proxy server sends the result to the CDN. Proxy servers generally cache static content that does not change often, such as images, js, css, and html. Mainstream caching software includes Squid, Varnish, and Nginx.
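The lookup flow described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name CachingProxy and the backend function are hypothetical, the cache is a plain dictionary, and expiry is ignored.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy reverse proxy: serve from the local cache, else ask the backend."""

    def __init__(self, fetch_from_backend: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_backend      # hypothetical backend call
        self._cache: Dict[str, bytes] = {}    # URL path -> cached body
        self.backend_hits = 0                 # how often the backend was asked

    def handle(self, path: str) -> bytes:
        if path in self._cache:               # cache hit: answer the CDN directly
            return self._cache[path]
        body = self._fetch(path)              # cache miss: go to the backend
        self.backend_hits += 1
        self._cache[path] = body              # keep a copy for later requests
        return body


proxy = CachingProxy(lambda path: f"content of {path}".encode())
first = proxy.handle("/static/logo.png")
second = proxy.handle("/static/logo.png")
print(first == second, proxy.backend_hits)
```

The second request for the same path is answered from the cache, so the backend is contacted only once.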
The third layer: load balancing
Websites with heavy traffic use load balancing because it is the most effective way to remove the performance bottleneck of a single server. The reverse proxy forwards the request to the load balancer, which hands it to a back-end WEB server according to a scheduling algorithm (round robin, least load, etc.). When the WEB server has finished processing, the data is returned directly to the reverse proxy server. Load balancing distributes requests sensibly across multiple back-end WEB servers, which reduces the concurrent load on any single server and keeps the service available. Mainstream load balancing software includes LVS, HAProxy, and Nginx.
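As a rough illustration of the two scheduling ideas mentioned above, the following Python sketch implements round robin and a least-connections variant of load-based selection. The server names and connection counts are made up, and real balancers such as LVS or HAProxy implement these algorithms with weights and health checks that are omitted here.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield back-end servers in a fixed rotating order."""
    return cycle(servers)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the server that currently has the fewest active connections."""
    return min(active, key=lambda name: active[name])


backends = ["web1", "web2", "web3"]   # hypothetical server names
rr = round_robin(backends)
picked = [next(rr) for _ in range(5)]
print(picked)

chosen = least_connections({"web1": 12, "web2": 3, "web3": 7})
print(chosen)
```

Round robin ignores server load entirely, while the least-connections choice depends on the current state of each back end.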
The fourth layer: WEB service
The WEB service processes user requests, and its processing efficiency directly affects access speed. To keep this layer from slowing access down, it should be tuned so that the WEB service runs in the best possible condition. Common WEB servers include Apache and Nginx.
Apache optimization:
1). mod_deflate compression module
Check whether it is loaded:
# apachectl -M | grep deflate
If not installed, use apxs to compile it:
# /usr/local/apache/bin/apxs -c -i -a <Apache source directory>/modules/filters/mod_deflate.c
deflate configuration parameters:
<IfModule mod_deflate.c>
    # Compression level (1-9); higher values compress more but use more CPU
    DeflateCompressionLevel 6
    # Enable compression
    SetOutputFilter DEFLATE
    # MIME types to compress
    AddOutputFilterByType DEFLATE text/html text/plain text/xml
    AddOutputFilterByType DEFLATE text/css application/javascript application/xml
</IfModule>
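To get a feel for the trade-off behind the compression level, the following Python sketch compresses the same data at levels 1, 6, and 9 with the standard zlib module, which implements the same DEFLATE algorithm. The sample page is made-up repetitive markup, so the exact sizes are only indicative.

```python
import zlib

# Repetitive text stands in for a typical HTML page (made-up sample data).
page = b"<div class='item'><a href='/product'>Product name</a></div>\n" * 500

fast = zlib.compress(page, 1)      # level 1: least CPU time
default = zlib.compress(page, 6)   # level 6: the value used in the configuration
best = zlib.compress(page, 9)      # level 9: most CPU time, usually the smallest output

for name, data in (("level 1", fast), ("level 6", default), ("level 9", best)):
    print(f"{name}: {len(data)} bytes ({len(data) / len(page):.1%} of original)")

# Decompression restores the original bytes regardless of the level used.
restored = zlib.decompress(best)
```

On real pages the gain from the highest levels is often small compared with the extra CPU time, which is why a middle value is a common choice.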
2). mod_expires cache module
Check whether it is loaded:
# apachectl -M | grep expires
If it is not installed, use apxs to compile it:
# /usr/local/apache/bin/apxs -c -i -a <Apache source directory>/modules/metadata/mod_expires.c
Enable the module in httpd.conf: LoadModule expires_module modules/mod_expires.so
The cache mechanism can be configured at three levels: global, directory, and virtual host.
Global configuration, add at the end of the configuration file:
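<IfModule mod_expires.c>
    # Enable expiry control; once a cached copy expires, the client fetches a new one from the server
    ExpiresActive on
    # By default, documents of any type expire 1 day after access
    ExpiresDefault "access plus 1 days"
    ExpiresByType text/html "access plus 12 months"
    # Cache JPEG images for 12 months
    ExpiresByType image/jpeg "access plus 12 months"
</IfModule>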
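mod_expires works by setting the Expires header and the max-age value of the Cache-Control header on responses. The Python sketch below shows what those two headers look like for an "access plus 1 days" rule. The function name expiry_headers and the access time are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Dict


def expiry_headers(access_time: datetime, ttl: timedelta) -> Dict[str, str]:
    """Build the caching headers for a response that expires `ttl` after access."""
    expires_at = access_time + ttl
    return {
        "Expires": format_datetime(expires_at, usegmt=True),       # HTTP date format
        "Cache-Control": f"max-age={int(ttl.total_seconds())}",    # lifetime in seconds
    }


accessed = datetime(2015, 1, 1, tzinfo=timezone.utc)    # example access time
headers = expiry_headers(accessed, timedelta(days=1))   # "access plus 1 days"
print(headers)
```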
3). Working mode selection and optimization
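Apache has two common working modes (MPMs, multi-processing modules): worker and prefork. Which one is the default depends on the version and platform (on Unix, Apache 2.2 defaults to prefork). worker is a hybrid MPM that uses multiple processes, each running multiple threads. Threads handle the requests, so it can serve more requests and raise concurrency, and its system resource overhead is lower than that of a process-based MPM. Because threads share the memory space of their process, a process crash takes down all of its threads. prefork is a non-threaded MPM in which each process handles one connection; its processes consume more system resources than worker, but it is more stable. Run apache2 -l to see the current working mode, and use the --with-mpm option at compile time to choose one. Pick the mode that fits your workload, then raise the related parameters appropriately to increase processing capacity.
Configuration parameters: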
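<IfModule prefork.c>
    # Number of httpd processes started initially
    StartServers 8
    # Minimum number of idle processes
    MinSpareServers 5
    # Maximum number of idle processes; Apache kills the surplus above this value
    MaxSpareServers 20
    # Upper limit on the number of server processes
    ServerLimit 256
    # Maximum number of requests served at the same time; further requests wait in a queue
    MaxClients 256
    # Number of requests a child process handles before it is recycled
    MaxRequestsPerChild 4000
</IfModule>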
Nginx optimization:
1). gzip compression module
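http {
    ……
    gzip on;
    gzip_min_length 1k;       # minimum response size to compress (default 20 bytes); compressing responses under 1k may be counterproductive
    gzip_buffers 4 16k;       # number and size of the buffers used for compression: 4 buffers of 16k
    gzip_http_version 1.0;    # minimum HTTP version of a request for which the response is compressed
    gzip_comp_level 2;        # compression level: 1 is fastest with the lowest ratio, 9 is slowest with the highest ratio
    gzip_types text/plain application/x-javascript text/css application/xml;    # MIME types to compress; already-compressed formats such as JPEG are left out
    gzip_vary on;             # add "Vary: Accept-Encoding" so caches keep compressed and uncompressed copies apart
}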
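The reason for setting a minimum length can be checked with Python's standard gzip module: the gzip format adds a fixed header and trailer, so a very small body grows when compressed. The two sample bodies below are made up for illustration.

```python
import gzip

tiny = b"ok"                               # 2-byte response body
large = b"<li>list item</li>\n" * 200      # repetitive markup, a few KB

tiny_gz = gzip.compress(tiny)
large_gz = gzip.compress(large)

print(len(tiny), "->", len(tiny_gz))       # grows: header and trailer overhead dominates
print(len(large), "->", len(large_gz))     # shrinks considerably
```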
2). expires cache module
server {
    # File suffixes to cache
    location ~ .*\.(gif|jpg|png|bmp|swf)$
    {
        expires 30d;    # use the expires module to cache on the client for 30 days
    }
    location ~ .*\.(jsp|js|css)$
    {
        expires 1d;
    }
}
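3). fastcgi optimization
nginx cannot run or parse dynamic programs such as PHP by itself. It passes such requests over the FastCGI protocol to php-fpm processes, which execute the PHP scripts. In other words, a user request first reaches nginx, nginx hands the dynamic part to php-fpm via FastCGI, and php-fpm parses the PHP script. It is therefore worth tuning the parameters of both fastcgi and php-fpm.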
http {
    ……
    # FastCGI cache: file path, directory levels, key zone name and size, and inactive removal time
    fastcgi_cache_path /usr/local/nginx/fastcgi_cache levels=1:2 keys_zone=TEST:10m inactive=5m;
    fastcgi_connect_timeout 300;          # timeout for connecting to the back-end FastCGI server
    fastcgi_send_timeout 300;             # timeout for sending a request to FastCGI
    fastcgi_read_timeout 300;             # timeout for receiving a FastCGI response
    fastcgi_buffer_size 64k;              # buffer size for reading the first part of the FastCGI response
    fastcgi_buffers 4 64k;                # number and size of the local buffers used for the FastCGI response
    fastcgi_busy_buffers_size 128k;
    fastcgi_temp_file_write_size 128k;    # size of the data blocks written to the temporary file; by default twice the fastcgi_buffers size
    fastcgi_cache TEST;                   # enable fastcgi_cache using the zone named TEST
    fastcgi_cache_valid 200 302 1h;       # cache 200 and 302 responses for 1 hour
    fastcgi_cache_valid 301 1d;           # cache 301 responses for 1 day
    fastcgi_cache_valid any 1m;           # cache all other responses for 1 minute
}
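php-fpm.conf configuration parameters:
; How child processes are managed (static or dynamic)
pm = dynamic
; Maximum number of child processes alive at the same time
pm.max_children = 5
; Number of processes created at startup
pm.start_servers = 2
; Minimum number of idle php-fpm processes
pm.min_spare_servers = 1
; Maximum number of idle php-fpm processes
pm.max_spare_servers = 3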
4). proxy_cache local cache module
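http {
    ……
    # Temporary directory for cache files
    proxy_temp_path /usr/local/nginx/proxy_cache/temp;
    # Actual cache directory. levels defines the directory hierarchy: 1:2 means
    # first-level directory names have 1 character and second-level names have 2.
    # keys_zone names the zone that stores metadata and allocates 10 MB of memory to it.
    # inactive=1d removes entries that have not been accessed for 1 day (default 10 minutes).
    # max_size is the maximum disk space allocated to the cache.
    proxy_cache_path /usr/local/nginx/proxy_cache/cache levels=1:2 keys_zone=one:10m inactive=1d max_size=1g;
    server {
        listen 80;
        server_name 192.168.1.10;
        location / {
            proxy_cache one;                   # use the cache zone defined above
            #proxy_cache_valid 200 304 12h;    # cache time can be set per HTTP status code
            proxy_cache_valid any 10m;         # cached entries are valid for 10 minutes
        }
        # Purge the cache for a URL, restricted to the listed networks
        # (requires the third-party module "ngx_cache_purge").
        # To purge, visit http://192.168.1.10/purge/<file name>
        location ~ /purge(/.*) {
            allow 127.0.0.1;
            allow 192.168.1.0/24;
            deny all;
            proxy_cache_purge one $host$1$is_args$args;
        }
    }
}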
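To make the levels=1:2 layout concrete: nginx names each cache file after the MD5 hash of the cache key and takes the directory names from the end of that hash. The Python sketch below reproduces this naming under that assumption. The helper cache_file_path and the key string are hypothetical, and the real key depends on the proxy_cache_key setting.

```python
import hashlib
from pathlib import PurePosixPath


def cache_file_path(cache_dir: str, key: str) -> PurePosixPath:
    """Compute where a levels=1:2 cache would store the entry for `key`."""
    digest = hashlib.md5(key.encode()).hexdigest()
    level1 = digest[-1]        # first-level directory: last hex character
    level2 = digest[-3:-1]     # second-level directory: the two characters before it
    return PurePosixPath(cache_dir) / level1 / level2 / digest


# The key below is an example value, not what a given nginx setup necessarily uses.
path = cache_file_path("/usr/local/nginx/proxy_cache/cache", "192.168.1.10/index.html")
print(path)
```

Spreading files over two directory levels keeps any single directory from holding too many entries.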
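Summary:
Enabling the compression module saves some bandwidth at the cost of extra CPU work on the WEB server. In the architecture shown above, however, enabling compression on the WEB server achieves little, because its responses travel to the upper layer over the LAN. For an architecture in which the WEB server faces users directly, compression should still be enabled. The WEB server does not need the expires module either: with the reverse proxy server and the CDN in front of it, its responses do not reach the user's browser, so enabling the module there has no effect.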
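If nginx is used as the reverse proxy, the expires module can be enabled there to cache static files in the user's browser. When the browser makes a request, it first checks whether the requested data is in its local cache. If it is, the browser checks whether the copy has expired, and if not, it uses the cached data directly, even if the resource on the server has changed. The expiry time should therefore be set sensibly according to the needs of the business.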
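The browser-side decision described above can be sketched as follows. This is a simplified model rather than how any particular browser is implemented: the names browser_get and CachedEntry are hypothetical, time is passed in as a plain number, and revalidation requests are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CachedEntry:
    body: str
    expires_at: float    # time after which the cached copy is stale


def browser_get(cache: Dict[str, CachedEntry], url: str, now: float,
                fetch: Callable[[str], str], ttl: float) -> str:
    """Return the cached copy while it is fresh; otherwise refetch and recache."""
    entry = cache.get(url)
    if entry is not None and now < entry.expires_at:
        return entry.body                       # fresh: the server is not contacted
    body = fetch(url)                           # missing or expired: ask the server
    cache[url] = CachedEntry(body, now + ttl)
    return body


server = {"/app.css": "v1"}                     # hypothetical server-side content
cache: Dict[str, CachedEntry] = {}

a = browser_get(cache, "/app.css", now=0, fetch=server.__getitem__, ttl=60)
server["/app.css"] = "v2"                       # the resource changes on the server
b = browser_get(cache, "/app.css", now=30, fetch=server.__getitem__, ttl=60)  # still fresh
c = browser_get(cache, "/app.css", now=90, fetch=server.__getitem__, ttl=60)  # expired
print(a, b, c)
```

The second call still returns the old version, which is exactly the staleness risk that a long expiry time carries.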
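5). Using a PHP cache to improve code execution efficiency
Without a cache, PHP compiles a page's code every time that page is requested, so the repeated compilation adds to server load. With a cache, the compiled result is stored in shared memory, and later requests use the already-compiled code directly, which avoids recompilation and speeds up execution. Using a cache is therefore well worthwhile for PHP websites. Mainstream PHP caches include eAccelerator and XCache.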
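The idea of compiling once and reusing the result can be illustrated with a loose analogy in Python, whose built-in compile function also separates compilation from execution. This is not how eAccelerator or XCache are implemented: the dictionary stands in for shared memory, the script name and source are made up, and real caches also check whether the file has changed.

```python
from types import CodeType
from typing import Dict

_compiled: Dict[str, CodeType] = {}    # stands in for the shared-memory cache
compile_count = 0                      # how many times compilation actually ran


def run_script(name: str, source: str) -> dict:
    """Execute `source`, compiling it only the first time `name` is seen."""
    global compile_count
    code = _compiled.get(name)
    if code is None:
        code = compile(source, name, "exec")    # the repeated work being avoided
        _compiled[name] = code
        compile_count += 1
    scope: dict = {}
    exec(code, scope)                           # run the already-compiled code
    return scope


for _ in range(3):
    result = run_script("page.php", "total = sum(range(10))")
print(result["total"], compile_count)
```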
The fifth layer: Dynamic and static separation
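Dynamic and static separation, as the name suggests, means handling dynamic pages and static pages on different servers. For example, with nginx as the web server, fastcgi can be deployed on a separate server dedicated to parsing PHP dynamic pages, while static pages are handled by nginx by default, with a suitable caching policy. As another example, a shopping site has a large number of images, so consider adding a group of file servers that handles both image requests and image uploads. NFS is the mainstream choice for file servers but is a single point of failure; DRBD + HeartBeat + NFS can be deployed for high availability. If a single server comes under too much load, consider a distributed file system such as GlusterFS or MooseFS.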
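A minimal sketch of the routing decision behind this separation, written in Python for illustration only. The suffix list and the group names "static" and "php-backend" are assumptions; in practice the decision is made by location rules in the web server configuration.

```python
from pathlib import PurePosixPath

STATIC_SUFFIXES = {".gif", ".jpg", ".png", ".css", ".js", ".html"}   # illustrative list


def route(path: str) -> str:
    """Name the server group that should handle a request path."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in STATIC_SUFFIXES:
        return "static"          # served by nginx or the file servers
    return "php-backend"         # .php and everything else is treated as dynamic


routes = {p: route(p) for p in ("/index.php", "/img/banner.JPG", "/css/site.css", "/cart")}
for p, target in routes.items():
    print(p, "->", target)
```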
The sixth layer: Database cache
Caching technology keeps hot data in memory. If the requested data is in the cache, it is returned directly; otherwise it is fetched from the database and written to the cache, which improves read performance and reduces pressure on the database. Caching can be local or distributed. A local cache stores data in the memory or files of the local server. A distributed cache stores data in memory spread across multiple servers, so it can hold massive amounts of data and scales well. Mainstream distributed cache systems include Memcached and Redis. Memcached is stable and fast, with a QPS of about 80,000. If you need data persistence, choose Redis, whose performance is no lower than that of Memcached.
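The read path described above, combined with spreading keys over several cache nodes, can be sketched in Python as follows. The class ShardedCache is hypothetical: dictionaries stand in for the cache servers and the database, and simple modulo sharding is used where real clients often use consistent hashing.

```python
import zlib
from typing import Dict, List, Optional


class ShardedCache:
    """Cache-aside reads over several in-memory nodes chosen by key hash."""

    def __init__(self, node_count: int, database: Dict[str, str]) -> None:
        self.nodes: List[Dict[str, str]] = [{} for _ in range(node_count)]
        self.database = database            # stands in for the real database
        self.db_reads = 0                   # how often the database was queried

    def _node_for(self, key: str) -> Dict[str, str]:
        # The same key always maps to the same node.
        return self.nodes[zlib.crc32(key.encode()) % len(self.nodes)]

    def get(self, key: str) -> Optional[str]:
        node = self._node_for(key)
        if key in node:                     # hit: served from memory
            return node[key]
        self.db_reads += 1
        value = self.database.get(key)      # miss: read the database
        if value is not None:
            node[key] = value               # update the cache for next time
        return value


store = ShardedCache(3, {"user:1": "alice", "user:2": "bob"})
results = [store.get("user:1"), store.get("user:1"), store.get("user:2")]
print(results, store.db_reads)
```

Only the first read of each key reaches the database, which is where the reduction in database pressure comes from.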
The seventh layer: Database
This layer plays a leading role in the entire website architecture and directly determines the user experience. Optimizing the architecture at this layer is correspondingly more complex.