
Gzip compression issues in HTTP

亚连 (original)
2018-06-12 14:39:25

Gzip is a widely used compression format, especially on Linux, and is built on the DEFLATE algorithm. This article covers gzip compression in HTTP transfers: how the client and server agree on it, who should do the compressing, and which files are worth compressing.

Preface

The benefits of faster page loads are self-evident. Besides saving bandwidth and improving the browsing experience, gzip has another potential benefit: it works well with search engine crawlers. Google, for example, can read gzip files directly, which lets it crawl pages faster than it otherwise would. In Google Webmaster Tools you can see that a sitemap.xml.gz file can be submitted directly as a Sitemap.

These benefits are not limited to static content. PHP pages and other dynamically generated content can also be compressed with Apache's compression module. Combined with other performance tuning and suitable server-side caching rules, this can greatly improve a site's performance. For PHP programs deployed on Linux servers, we therefore recommend enabling gzip compression if the server supports it.

Why enable gzip

When we email someone a file, we compress it first, and the recipient decompresses it after receiving it. This is routine for us. The point of compressing is to shrink the file being transferred and speed up the transfer, and enabling gzip in HTTP transfers serves the same purpose. Yet articles introducing gzip usually tie it to some server-side configuration (nginx) or build-tool plugin (webpack) and list a pile of settings, leaving readers in a fog and still unsure why or how to use it.

HTTP and gzip

Let's work through these questions.

How the client and server communicate about gzip

When we send compressed files to other people, the files usually carry an extension such as .rar or .zip. The recipient picks a decompression method based on that extension. In an HTTP transfer the browser plays the role of the decompressor, but how does it know what format the data is in and how to decode it?

HTTP/1.0 lets the server attach a Content-Encoding field to the data it sends. This field describes how the data was compressed.

Content-Encoding: gzip
Content-Encoding: compress
Content-Encoding: deflate

After receiving the response, the client checks this field and decodes the body accordingly. When making a request, the client can use the Accept-Encoding field to state which compression methods it accepts.

Accept-Encoding: gzip, deflate
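
The exchange above can be sketched with Node's built-in zlib module. This is a minimal illustration, not a full server: acceptsGzip, buildResponse and decodeBody are hypothetical helper names, and the header check is simplified (it ignores q-values such as "gzip;q=0").

```javascript
const zlib = require('zlib');

// Hypothetical helper: true when the Accept-Encoding header lists gzip.
// Simplified on purpose: q-values are ignored.
function acceptsGzip(acceptEncoding) {
  return String(acceptEncoding || '')
    .split(',')
    .map((token) => token.trim().split(';')[0].toLowerCase())
    .includes('gzip');
}

// Server side: build headers and body for a given Accept-Encoding value.
function buildResponse(acceptEncoding, text) {
  const raw = Buffer.from(text, 'utf8');
  if (!acceptsGzip(acceptEncoding)) {
    return { headers: { 'Content-Type': 'text/plain' }, body: raw };
  }
  return {
    headers: {
      'Content-Type': 'text/plain',
      'Content-Encoding': 'gzip', // tells the client how to decode the body
      'Vary': 'Accept-Encoding',  // caches must key on the request header
    },
    body: zlib.gzipSync(raw),
  };
}

// Client side: decode the body according to Content-Encoding.
function decodeBody(response) {
  const encoded = response.headers['Content-Encoding'] === 'gzip';
  const bytes = encoded ? zlib.gunzipSync(response.body) : response.body;
  return bytes.toString('utf8');
}
```

A client that does not send gzip in Accept-Encoding simply gets the raw bytes back, with no Content-Encoding header.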

These request and response headers can be seen in the browser's developer tools.

Compatibility

For a front-end developer, anything that involves the browser raises the question of support. HTTP/1.0 was published in May 1996, and the good news is that compatibility is basically a non-issue: almost all browsers support gzip. One thing worth mentioning is that early versions of IE6 had a bug that broke gzip handling. IE6 fixed this in Windows XP SP2, and very few users still run the affected version.

Who will compress the files

At first glance this can only be done on the server. Most of what we see online are articles about enabling gzip in nginx. But SPA applications are now popular on the front end, and frameworks such as React and Vue come with their own scaffolding, which generally uses webpack as the bundler. There, plugins such as compression-webpack-plugin can gzip the generated files and emit the corresponding compressed files. When building an application we may also place a Node layer between the backend services and the front-end files for API authentication and file forwarding, and the Express framework familiar from Node.js has a compression middleware that can enable gzip too. With so many options it is easy to be dazzled: which one should do the compressing, and how?

Compression when the server responds to the request

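Compressing in nginx and compressing with middleware in a Node framework are essentially the same thing. When we click a page and send a request, the server finds the corresponding file, compresses it, returns the compressed content (caching can of course reduce how often compression runs), and sets the Content-Encoding header mentioned above. Some applications are built without an upstream proxy layer. If the server side is a single Node layer, it can compress files directly with its own compression plugin. If an nginx forwarding layer sits upstream, it is better to let nginx handle this, because such servers have functionality built specifically for it (much of it written in C) and can make better use of caching and reduce overhead.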

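Here is part of an nginx configuration that enables gzip compression:

# enable gzip
gzip on;
# minimum size for compression; responses smaller than this are not compressed
gzip_min_length 1k;
# gzip compression level, 1-9; higher values compress better but use more CPU time (discussed in more detail below)
gzip_comp_level 2;
# MIME types to compress. JavaScript appears under several types; the values can be found in the mime.types file.
gzip_types text/plain application/javascript application/x-javascript text/css application/xml text/javascript;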
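
For a server that consists of a single Node layer, the same decisions (minimum size, level, compressible types, plus a cache to avoid recompressing) might look roughly like the sketch below. The constants mirror the nginx values above; compressForResponse and the in-memory Map cache are hypothetical names for illustration, not part of any library.

```javascript
const zlib = require('zlib');

const MIN_LENGTH = 1024; // mirrors gzip_min_length 1k
const LEVEL = 2;         // mirrors gzip_comp_level 2
const COMPRESSIBLE = new Set([
  'text/plain', 'text/css', 'text/javascript',
  'application/javascript', 'application/xml',
]);

// Hypothetical in-memory cache: key -> compressed Buffer.
const cache = new Map();

// Returns { body, encoded }: the bytes to send and whether they are gzipped.
function compressForResponse(key, body, contentType) {
  if (!COMPRESSIBLE.has(contentType) || body.length < MIN_LENGTH) {
    return { body, encoded: false };
  }
  if (!cache.has(key)) {
    // Compress once per key; later requests reuse the cached result.
    cache.set(key, zlib.gzipSync(body, { level: LEVEL }));
  }
  return { body: cache.get(key), encoded: true };
}
```

When encoded is true, the caller would also set Content-Encoding: gzip on the response. A real cache would additionally need invalidation when the underlying file changes, which this sketch leaves out.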

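Compression at build time

If the server can already do this, why does webpack offer a compression plugin for bundling front-end applications? Look at the gzip_comp_level 2 setting in the nginx configuration above. A higher level (nginx accepts 1-9) compresses better but costs more CPU and time. We compress files not only to reduce their size but also to reduce transfer time. If the level is set very high, the server has to spend a long time compressing on every request before it can respond: server overhead rises sharply and the client is left waiting. But the files of a modern SPA are all generated by a build. What if we generate highly compressed files during the build, store them on the server as static assets, and return the compressed content directly when a request arrives?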
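
The size and time trade-off between levels can be measured with Node's zlib module. The sketch below is only illustrative: compareLevels is a hypothetical helper, and the sample input is synthetic JavaScript-like text rather than a real bundle, so actual numbers will differ per project and machine.

```javascript
const zlib = require('zlib');

// Compresses the same input at several levels and reports size and elapsed time.
function compareLevels(input, levels) {
  return levels.map((level) => {
    const start = process.hrtime.bigint();
    const output = zlib.gzipSync(input, { level });
    const micros = Number((process.hrtime.bigint() - start) / 1000n);
    return { level, bytes: output.length, micros };
  });
}

// Hypothetical sample: repetitive JavaScript-like text standing in for a bundle.
const sample = Buffer.from(
  Array.from({ length: 5000 }, (_, i) =>
    `function f${i}(a, b) { return a + b * ${i}; }\n`).join('')
);

const results = compareLevels(sample, [1, 6, 9]);
console.table(results);
```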

This is exactly what webpack's compression-webpack-plugin does. It is simple to set up: just add the plugin to the configuration. A basic configuration looks like this:

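const CompressionWebpackPlugin = require('compression-webpack-plugin');

// webpackConfig is the project's existing webpack configuration object
webpackConfig.plugins.push(
 new CompressionWebpackPlugin({
  // output name for each compressed asset (later major versions of the plugin call this option `filename`)
  asset: '[path].gz[query]',
  algorithm: 'gzip',
  test: /\.(js|css)$/, // only compress JS and CSS files
  threshold: 10240, // skip assets smaller than 10240 bytes
  minRatio: 0.8 // keep the result only if compressed size / original size is below 0.8
 })
);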

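After webpack finishes, it emits a compressed file with the .gz suffix alongside each bundled file.

What compression level does the plugin use? Its source shows that the default level is 9: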

...
const zlib = require('zlib');
this.options.algorithm = zlib[this.options.algorithm];
...
this.options.compressionOptions = {
 level: options.level || 9,
 flush: options.flush
 ...
}

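As the excerpt shows, compression uses the zlib library. zlib's default level is 6, and the highest is 9, best compression (zlib.Z_BEST_COMPRESSION). Because we only run a build when releasing the project, spending extra time on the highest level at build time costs us essentially nothing, and the server no longer has to compress anything: it only needs to find the already-compressed file and return it.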
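
The idea of compressing once at build time can also be sketched without webpack, using only Node's built-in modules. This is not the plugin's implementation: precompressDir is a hypothetical helper that borrows the .js/.css filter and the 10240-byte threshold from the configuration above.

```javascript
const fs = require('fs');
const path = require('path');
const zlib = require('zlib');

// Writes "<file>.gz" next to each .js/.css file in dir, at the highest gzip level.
// Files smaller than `threshold` bytes are skipped.
function precompressDir(dir, threshold = 10240) {
  const written = [];
  for (const name of fs.readdirSync(dir)) {
    if (!/\.(js|css)$/.test(name)) continue;
    const file = path.join(dir, name);
    const source = fs.readFileSync(file);
    if (source.length < threshold) continue;
    const gz = zlib.gzipSync(source, { level: zlib.constants.Z_BEST_COMPRESSION });
    fs.writeFileSync(file + '.gz', gz);
    written.push(name + '.gz');
  }
  return written;
}
```

Calling precompressDir on a build output directory after the build finishes would leave index.js.gz next to index.js, ready to be served as a static file.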

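How the server finds these files

Solving this at the application level is fairly simple. Suppose the build above produces compressed versions of index.css and index.js. A simple server-side approach is to match those two requests and return the corresponding compressed files. With Node's Express, for example: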

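...
app.get(['/index.js', '/index.css'], function (req, res, next) {
 // generateType (not shown here) must return the Content-Type for the requested file.
 // Call it before rewriting the URL, because req.path is derived from req.url.
 res.set('Content-Type', generateType(req.path))
 res.set('Content-Encoding', 'gzip')
 // point the request at the precompressed file so the next handler serves it
 req.url = req.url + '.gz'
 next()
})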

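With that we can return gzip-compressed data for these requests. The approach above is too restrictive to be practical, but many libraries already handle this need: Express has the express-static-gzip plugin, and Koa's koa-static checks for gzip files by default. The basic principle is to check whether a file with the .gz suffix exists for the request and return different content depending on the result.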
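
That basic principle can be sketched in a few lines with Node's fs module. This is an illustration of the lookup only, not how express-static-gzip or koa-static are actually implemented: resolveAsset is a hypothetical function, and it performs no path validation, which a real server would need.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical resolver: picks the file to send for a request path.
// Returns the precompressed sibling only if the client accepts gzip and it exists.
// Note: no protection against "../" in requestPath; illustration only.
function resolveAsset(rootDir, requestPath, acceptEncoding) {
  const file = path.join(rootDir, requestPath);
  const gzFile = file + '.gz';
  const clientAcceptsGzip = /\bgzip\b/i.test(acceptEncoding || '');
  if (clientAcceptsGzip && fs.existsSync(gzFile)) {
    return {
      file: gzFile,
      headers: { 'Content-Encoding': 'gzip', 'Vary': 'Accept-Encoding' },
    };
  }
  // Fall back to the uncompressed file.
  return { file, headers: { 'Vary': 'Accept-Encoding' } };
}
```

The Content-Type still has to be derived from the original request path, not from the .gz file name, as in the Express example above.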

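Which files can be gzip-compressed

gzip can compress any file, but that does not mean we should compress every file. The code we write (CSS, JS and the like) compresses very well, but images and similar files gain little because they already have compression built in, and recompressing files that are already compressed (such as .zip archives) may even make them larger. Files that are already very small are not worth compressing either.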
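
The difference is easy to observe with Node's zlib module. In the sketch below, gzipRatio is a hypothetical helper, and random bytes are used as a stand-in for data that is already compressed (an assumption made for the demo, since random data has no redundancy left for gzip to remove).

```javascript
const crypto = require('crypto');
const zlib = require('zlib');

// Ratio of compressed size to original size; values above 1 mean gzip made it bigger.
function gzipRatio(buffer) {
  return zlib.gzipSync(buffer).length / buffer.length;
}

// Repetitive CSS-like text, similar in spirit to hand-written source code.
const text = Buffer.from('.button { color: red; margin: 0 auto; }\n'.repeat(500));
// Random bytes stand in for already-compressed data (images, .zip archives).
const alreadyDense = crypto.randomBytes(text.length);

console.log('css-like text:', gzipRatio(text).toFixed(3));
console.log('dense data:   ', gzipRatio(alreadyDense).toFixed(3));
```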

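In practice

If gzip can be enabled, it should be. Whether to compress on the fly for each request or to generate compressed files at build time depends on your own situation.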

