Detailed explanation of complex rich text parsing in WeChat mini programs
I have recently been writing a crawler that parses web pages for use in a WeChat Mini Program. Text and images are easy to parse, and the mini program has corresponding text and image tags to present them. More complex content such as tables is harder: both server-side parsing and mini program rendering take a lot of effort, and it is difficult to cover every case. So I figured that converting the HTML for a table into an image would be a reasonable workaround.
Here we use the node-webshot module, a thin wrapper around PhantomJS that makes it easy to save web pages as screenshots.
First install Node.js and PhantomJS, then create a new js file and load the node-webshot module:
const webshot = require('webshot');
Define options:
const options = {
  // Browser window size
  screenSize: { width: 755, height: 25 },
  // Region of the page document to capture
  shotSize: { height: 'all' },
  // Type of input: an HTML string rather than a URL
  siteType: 'html'
};
Set the browser window width to suit the web page; the window height can be very small. The height of the captured document area must be set to 'all', and its width defaults to the window width, so the whole table is captured at the smallest possible size.
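If tables of different widths need to be captured, the options can be produced by a small function instead of a fixed object. The following is only a sketch: `buildShotOptions` is a hypothetical helper, not part of node-webshot, and it assumes that the window width is the only value that changes between screenshots.

```javascript
// Hypothetical helper (not part of node-webshot): builds the options
// object described above for a given browser window width.
function buildShotOptions(windowWidth) {
  if (!Number.isInteger(windowWidth) || windowWidth <= 0) {
    throw new RangeError('windowWidth must be a positive integer');
  }
  return {
    // Small window height; the capture height below is what matters
    screenSize: { width: windowWidth, height: 25 },
    // Capture the full document height; width defaults to the window width
    shotSize: { height: 'all' },
    // The input passed to webshot is an HTML string
    siteType: 'html'
  };
}

const narrowOptions = buildShotOptions(375);
console.log(narrowOptions.screenSize.width); // 375
```

The returned object has the same shape as the `options` constant, so it can be passed to webshot in its place.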
Next, define the HTML string to render:
let html = "target rich text html code, eg: <table>...</table>";
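A crawler usually yields only a fragment such as a single table, so it can help to wrap it in a minimal standalone document before rendering. The sketch below is an assumption about how one might do that: `wrapFragment` is a hypothetical helper, and the charset declaration, zero margin and white background are illustrative choices rather than anything node-webshot requires.

```javascript
// Hypothetical helper: wraps an HTML fragment in a minimal standalone
// document. The charset, margin and background are illustrative choices.
function wrapFragment(fragment) {
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head>',
    '<meta charset="utf-8">',
    '<style>body { margin: 0; background: #fff; }</style>',
    '</head>',
    '<body>',
    fragment,
    '</body>',
    '</html>'
  ].join('\n');
}

const page = wrapFragment('<table><tr><td>cell</td></tr></table>');
console.log(page.startsWith('<!DOCTYPE html>')); // true
```

The wrapped string can be used wherever the `html` variable is used, since it is still a plain HTML string.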
Finally, call webshot to render the HTML string and save the screenshot:
webshot(html, 'demo.png', options, (err) => {
  if (err) console.log(`Webshot error: ${err.message}`);
});
Once the table has been saved as an image, the mini program can show it with its image tag for presentation, which poses no difficulty.
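When a page contains several tables, it may be convenient to wrap the callback-style call in a Promise and render the tables one after another. This is a sketch under assumptions: `renderHtmlToImage` and `renderTables` are hypothetical names, the `table-N.png` file naming is made up for illustration, and the `shoot` parameter exists only so that a stand-in function with the same `(html, file, options, callback)` shape can be substituted for the real module.

```javascript
// Sketch: wrap the callback-style webshot call in a Promise.
// `shoot` is optional; when omitted, the real module is loaded lazily.
function renderHtmlToImage(html, outFile, options, shoot) {
  const takeShot = shoot || require('webshot');
  return new Promise((resolve, reject) => {
    takeShot(html, outFile, options, (err) => {
      if (err) reject(err);
      else resolve(outFile);
    });
  });
}

// Render each table in order and return the list of output file names.
// The table-N.png naming scheme is an illustrative assumption.
async function renderTables(tables, options, shoot) {
  const files = [];
  for (let i = 0; i < tables.length; i++) {
    const file = `table-${i}.png`;
    files.push(await renderHtmlToImage(tables[i], file, options, shoot));
  }
  return files;
}
```

Called as `renderTables([html], options)`, this uses the installed webshot module. Rendering sequentially keeps only one PhantomJS process running at a time, which is a design choice rather than a requirement.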