A brief analysis of how to use Puppeteer library to generate posters in Node (implementation plan sharing)

青灯夜游
2022-01-18 19:26:44

How do you generate posters with Node? This article shows how to do it with Node and Puppeteer.

In my previous article I described the many compatibility problems I ran into a few days ago while using html2canvas, which almost made me give up. Then, following advice from readers in the comment section, I found a poster generation approach that is simple to operate and highly reusable: generating posters with Node and Puppeteer.

The main design idea is: the client calls a poster generation interface; the interface opens the URL it is given with Puppeteer and returns a screenshot of the specified element.

Advantages of generating posters with Puppeteer rather than Canvas:

  • There are no browser or platform compatibility issues.
  • The code is highly reusable and can be used to generate posters for h5, mini programs, and apps.
  • There is more room for optimization. Because poster generation now happens behind an interface, server-side techniques such as adding servers or adding a cache can be used to improve response time.

Introduction to Puppeteer

Puppeteer is a Node.js library that provides a high-level API to control Chromium or Chrome over the DevTools Protocol. Puppeteer runs in headless mode (without a visible browser window) by default, but it can run in "headed" mode by setting headless: false. Most things you would do manually in a browser can be done with Puppeteer. Here are some examples:

  • Generate page PDF or screenshot.
  • Crawl SPA (Single Page Application) and generate pre-rendered content (i.e. "SSR" (Server Side Rendering)).
  • Automatically submit forms, perform UI testing, keyboard input, etc.
  • Create an automated testing environment that is constantly updated. Execute tests directly in the latest version of Chrome using the latest JavaScript and browser features.
  • Capture the timeline trace of the website to help analyze performance issues.
  • Test browser extensions.

Solution implementation

1. Write a simple interface

Express is a simple and flexible Node.js web application framework. Use Express to write a simple Node service that defines an interface, receives the configuration items needed for the screenshot, and passes them to Puppeteer.

const express = require('express')
const createError = require("http-errors")
const app = express()
// Middleware: parse JSON request bodies
app.use(express.json())
app.post('/api/getShareImg', (req, res) => {
    // Business logic
})
// Error handling
app.use(function(req, res, next) {
    next(createError(404));
});
app.use(function(err, req, res, next) {
    let result = {
        code: 0,
        msg: err.message,
        err: err.stack
    }
    res.status(err.status || 500).json(result)
})
// Start the service, listening on port 7000
const server = app.listen(7000, '0.0.0.0', () => {
    const host = server.address().address;
    const port = server.address().port;
    console.log('app start listening at http://%s:%s', host, port);
});

2. Create a screenshot module

Open a browser => open a tab => take a screenshot => close the browser

const puppeteer = require("puppeteer");

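module.exports = async (opt) => {
    // Launch a browser; it is closed in `finally` even if a step below throws
    const browser = await puppeteer.launch();
    try {
        // Open a tab and navigate to the poster page
        const page = await browser.newPage();
        await page.goto(opt.url, {
            waitUntil: ['networkidle0']
        });
        await page.setViewport({
            width: opt.width,
            height: opt.height,
        });
        // Select the element to capture; page.$ resolves to null when nothing matches
        const ele = await page.$(opt.ele);
        if (!ele) {
            throw new Error('Element not found: ' + opt.ele);
        }
        const base64 = await ele.screenshot({
            fullPage: false,
            omitBackground: true,
            encoding: 'base64'
        });
        return 'data:image/png;base64,' + base64
    } finally {
        await browser.close();
    }
};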
  • puppeteer.launch([options]): Launch a browser
  • browser.newPage(): Create a tab page
  • page.goto(url[, options]): Navigate to a page
  • page.setViewport(viewport): Set the viewport size of the page
  • page.$(selector): Element selection
  • elementHandle.screenshot([options]): Screenshot. The encoding option specifies whether the return value is a base64 string or a Buffer
  • browser.close(): Close the browser and tab page
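
To connect the two pieces, the route from step 1 can pass the request body to the screenshot module from step 2. The following is a minimal sketch, assuming the body carries the same `url`, `width`, `height`, and `ele` fields the module reads. The helper name `makeShareImgHandler`, the validation rules, and the `{ code: 1, data }` success shape are illustrative assumptions, not part of the original code.

```javascript
// Hypothetical helper: builds an Express-style handler around a screenshot function.
// `screenshot` is expected to behave like the module above: (opt) => Promise<string>.
function makeShareImgHandler(screenshot) {
    return async (req, res, next) => {
        const { url, width, height, ele } = req.body || {};
        // Reject incomplete requests before any browser is launched
        if (!url || !ele || !Number.isInteger(width) || !Number.isInteger(height)) {
            return res.status(400).json({ code: 0, msg: 'url, width, height and ele are required' });
        }
        try {
            const data = await screenshot({ url, width, height, ele });
            res.json({ code: 1, data }); // assumed success shape
        } catch (err) {
            next(err); // hand the error to the error middleware from step 1
        }
    };
}

// Usage (assuming the module above is saved as ./screenshot.js):
// app.post('/api/getShareImg', makeShareImgHandler(require('./screenshot')));
```

Passing the screenshot function in as an argument keeps the handler independent of Puppeteer, so it can be exercised with a stub.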

3. Optimization

1. Request time optimization

The waitUntil option of page.goto(url[, options]) specifies the condition under which navigation is considered finished. The default is when the load event fires. The possible values are:

 await page.goto(url, {
     waitUntil: [
         'load', // the page "load" event has fired
         'domcontentloaded', // the page "DOMContentLoaded" event has fired
         'networkidle0', // no network connections for at least 500ms
         'networkidle2' // no more than 2 network connections for at least 500ms
     ]
 });

If you wait for the page with networkidle0, the interface response time gets longer, because networkidle0 requires the network to stay idle for 500ms. In real business scenarios this wait is often unnecessary, so you can wrap a delay helper and customize the waiting time. For example, our poster page only renders a background image and a QR code image; by the time the page fires load, both are already loaded, so no extra wait is needed and you can pass 0 to skip it.

 const waitTime = (n) => new Promise((r) => setTimeout(r, n));
 // Some code omitted
 await page.goto(opt.url);
 await waitTime(opt.waitTime || 0);

If this approach is not enough and the page itself needs to tell Puppeteer when it is ready, you can use page.waitForSelector(selector[, options]) to wait for a specific element to appear on the page. For example: when the page finishes an operation it inserts an element with id="end", and Puppeteer waits for that element to appear.

 await page.waitForSelector("#end")

Similar methods include:

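  • page.waitForXPath(xpath[, options]): Wait for an element matching the XPath to appear in the page.
  • page.waitForSelector(selector[, options]): Wait for an element matching the selector to appear in the page. If a matching element already exists when the method is called, it returns immediately.
  • page.waitForResponse(urlOrPredicate[, options]): Wait for the specified response to finish.
  • page.waitForRequest(urlOrPredicate[, options]): Wait for the specified request to be issued.
  • page.waitForFunction(pageFunction[, options[, ...args]]): Wait until the function, evaluated in the page, returns a truthy value.
  • page.waitFor(selectorOrFunctionOrTimeout[, options[, ...args]]): This method dispatches to the methods above depending on its first argument. For example, when given a string it checks whether the string is an XPath or a selector and then behaves like waitForXPath or waitForSelector. Note that page.waitFor is deprecated in newer Puppeteer versions.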
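
The waiting options above can be combined into one helper, so that each request picks the cheapest strategy that works for its page. This is only a rough sketch: the option names `waitSelector` and `timeout` and the 5000ms default are assumptions introduced here (only `waitTime` appears in the code above), and `page` is expected to be a Puppeteer page.

```javascript
const waitTime = (n) => new Promise((r) => setTimeout(r, n));

// Hypothetical helper: wait until the poster page is ready for a screenshot.
// - opt.waitSelector (assumed name): wait for an element such as "#end"
// - opt.waitTime: otherwise fall back to a fixed delay in ms (0 skips the wait)
async function waitForReady(page, opt = {}) {
    if (opt.waitSelector) {
        // Rejects if the element does not appear within the timeout
        await page.waitForSelector(opt.waitSelector, { timeout: opt.timeout || 5000 });
        return 'selector';
    }
    await waitTime(opt.waitTime || 0);
    return 'delay';
}

// Usage inside the screenshot module:
// await page.goto(opt.url);
// await waitForReady(page, opt);
```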

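2. Launch option optimization

Chromium enables many features at startup that are not needed here; some of them can be disabled through launch arguments.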

    const browser = await puppeteer.launch({
        headless: true,
        slowMo: 0,
        args: [
            '--no-zygote',
            '--no-sandbox',
            '--disable-gpu',
            '--no-first-run',
            '--single-process',
            '--disable-extensions',
            "--disable-xss-auditor",
            '--disable-dev-shm-usage',
            '--disable-popup-blocking',
            '--disable-setuid-sandbox',
            '--disable-accelerated-2d-canvas',
            '--enable-features=NetworkService',
        ]
    });

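3. Reuse the browser

Every call to the interface launches a browser and closes it after the screenshot, which wastes resources, and launching a browser also takes time. In addition, if too many browsers are launched at the same time, the program throws exceptions. So I use a connection pool: launch several browsers, create a tab in one of them to open the page, and after the screenshot close only the tab and keep the browser. The next request creates a tab directly, which achieves browser reuse. A browser is closed once it has been used a certain number of times or has not been used for a period of time. Someone has already adapted the generic-pool connection pool for this purpose, so I used that directly.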

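const puppeteer = require('puppeteer')
const genericPool = require('generic-pool')

const initPuppeteerPool = () => {
 // Drain and clear the previous pool, if any. Keep a local reference,
 // because global.pp is reassigned before the drain promise resolves.
 const oldPool = global.pp
 if (oldPool) oldPool.drain().then(() => oldPool.clear())
 const opt = {
   max: 4, // maximum number of puppeteer instances
   min: 1, // minimum number of puppeteer instances kept alive in the pool
   testOnBorrow: true, // validate an instance before handing it to the caller
   autostart: false, // whether to create instances when the pool is initialized
   idleTimeoutMillis: 1000 * 60 * 60, // close an instance that has been idle for 60 minutes
   evictionRunIntervalMillis: 1000 * 60 * 3, // check for idle instances every 3 minutes
   maxUses: 2048, // custom option: maximum number of times each instance may be reused
   validator: () => Promise.resolve(true)
 }
 const factory = {
   create: () =>
     puppeteer.launch({
       // launch options: see item 2 above
     }).then(instance => {
       instance.useCount = 0;
       return instance;
     }),
   // Return the promise so the pool can wait until the browser has closed
   destroy: instance => instance.close(),
   validate: instance => {
     return opt.validator(instance).then(valid => Promise.resolve(valid && (opt.maxUses <= 0 || instance.useCount < opt.maxUses)));
   }
 };
 const pool = genericPool.createPool(factory, opt)
 const genericAcquire = pool.acquire.bind(pool)
 // Override the pool's acquire method to count how many times each instance is used
 pool.acquire = () =>
   genericAcquire().then(instance => {
     instance.useCount += 1
     return instance
   })

 pool.use = fn => {
   let resource
   return pool
     .acquire()
     .then(r => {
       resource = r
       return resource
     })
     .then(fn)
     .then(
       result => {
         // Release the instance whether or not the caller's function succeeded
         pool.release(resource)
         return result
       },
       err => {
         pool.release(resource)
         throw err
       }
     )
 }
 return pool;
}
global.pp = initPuppeteerPool()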
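
With the pool in place, the screenshot module no longer needs to launch and close a browser itself. The following is a sketch of how it might borrow a browser through `pool.use`, assuming `pool` is the object returned by `initPuppeteerPool()`. The function name `screenshotWithPool` is hypothetical.

```javascript
// Hypothetical variant of the screenshot module that reuses pooled browsers.
// Only the tab is closed after each request; pool.use returns the browser to the pool.
function screenshotWithPool(pool, opt) {
    return pool.use(async (browser) => {
        const page = await browser.newPage();
        try {
            await page.setViewport({ width: opt.width, height: opt.height });
            await page.goto(opt.url);
            const ele = await page.$(opt.ele);
            if (!ele) throw new Error('Element not found: ' + opt.ele);
            const base64 = await ele.screenshot({ omitBackground: true, encoding: 'base64' });
            return 'data:image/png;base64,' + base64;
        } finally {
            // Close the tab even on failure so tabs do not pile up in a reused browser
            await page.close();
        }
    });
}

// Usage: const img = await screenshotWithPool(global.pp, { url, width, height, ele });
```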

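4. Avoid regenerating the same image

When the interface is called repeatedly with the same set of parameters, a browser process takes a screenshot every time. A cache can optimize these repeated requests. Pass in a unique key as an identifier (for example user id + activity id) and store the image base64 in Redis or in memory. When the interface is requested, first check whether the image has already been generated and cached; if so, return it from the cache. Otherwise, go through the poster generation flow.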
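
The lookup-then-generate flow can be sketched as a small wrapper. This is an in-memory illustration only: the name `withPosterCache` is hypothetical, the cache is a plain `Map` with no expiry, and a Redis-backed version would need asynchronous reads and writes instead.

```javascript
// Hypothetical cache wrapper. `generate` is any function (opt) => Promise<string>,
// for example the screenshot module; `cache` needs Map-like has/get/set/delete.
function withPosterCache(generate, cache = new Map()) {
    return async (key, opt) => {
        if (cache.has(key)) return cache.get(key);
        // Store the pending promise so concurrent requests with the same key share one run
        const pending = generate(opt);
        cache.set(key, pending);
        try {
            return await pending;
        } catch (err) {
            cache.delete(key); // do not cache failures
            throw err;
        }
    };
}

// Usage (the key format is an assumption, e.g. user id + activity id):
// const getPoster = withPosterCache(require('./screenshot'));
// const img = await getPoster(`${userId}:${activityId}`, opt);
```

Caching the promise rather than the finished string means two simultaneous requests for the same key trigger only one screenshot.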

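Conclusion

This solution is now in trial operation in our project. For a front-end developer like me it is remarkably friendly: no more drawing on canvas step by step in mini programs, no more worrying about cross-origin resources, and no more compatibility problems with the WeChat browser and the various built-in browsers. The time saved let me write this article. What I am still somewhat concerned about is performance. Because generation is only triggered by the share action, concurrency is low, and no performance problems have surfaced so far. If you have experience here, I would welcome pointers on what can be further optimized or guarded against.

Code

The complete code is on GitHub: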

https://github.com/yuwuwu/markdown-code/tree/master/puppeteer%E6%88%AA%E5%9B%BE

Statement:
This article is reproduced from juejin.cn. If there is any infringement, please contact admin@php.cn to have it removed.