
Node.js implements WeChat applet to capture web content

不言
2018-10-20 17:17:24

This article describes how to use Node.js to capture web content for a WeChat mini program. It may be a useful reference for readers working on something similar.

Recently I have been studying the cloud development feature of WeChat mini programs. Its biggest advantage is that front-end developers do not need to set up a server: you can use the cloud capabilities to write a WeChat mini program from scratch, which saves the cost of buying a server. For individuals who want to practice building a mini program end to end, from front end to back end, it is a good choice. It is possible to launch a WeChat mini program in a single day.

Advantages of cloud development

Cloud development gives developers complete cloud support and downplays the concepts of back end and operations. There is no need to set up a server; core business logic is built on the APIs provided by the platform, which allows rapid launch and iteration. This capability is also compatible with the cloud services developers already use; the two are not mutually exclusive.

Cloud development currently provides three basic capabilities:

  1. Cloud functions: code that runs in the cloud. Authentication is handled natively through WeChat's private protocol, so developers only need to write their own business logic.

  2. Database: a JSON database that can be operated from the mini program front end and read and written from cloud functions.

  3. Storage: upload and download cloud files directly from the mini program front end, and manage them visually in the cloud development console.

That covers the basics of cloud development; interested readers can study it further. Official documentation: https://developers.weixin.qq....

Web content capture

The mini program is a quiz, so where to get the questions is a problem. Searching online and pasting in the questions one by one would work, but with such repetitive work I would probably give up after about 10 entries. So I thought of scraping them from web pages, which was a good excuse to pick up the Node.js I had learned before.

Must-have tools:

  1. cheerio. A package that works like jQuery on the server side. It is mainly used to parse and filter the captured content.

  2. The fs module of Node.js. This is a built-in module for reading and writing files; here it is used to write the parsed data to a JSON file.

  3. axios (optional). Used to fetch a website's HTML pages. Because the data I wanted is rendered only after clicking a button on the page, it cannot be captured by requesting the URL directly. I had no choice but to copy the desired content, save it as a string, and parse that string.

Next, run npm init to initialize a Node.js project and press Enter at each prompt to generate a package.json file.
Then run npm install --save axios cheerio to install the cheerio and axios packages.

The key is to use cheerio to get jQuery-like functionality. Just pass the captured content to cheerio.load(questions), and then you can use jQuery-style operations to select DOM elements and assemble the data you want.

Finally, use fs.writeFile to save the data to a JSON file, and you're done.
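The save step on its own can be sketched as follows. This is a minimal sketch, not the author's code: the helper name saveAsJson, the temporary file name questions-demo.json, and the sample record are all hypothetical. It uses the promise-based variant fs.promises.writeFile and reads the file back to confirm that the JSON round-trips.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Serialize `data` as pretty-printed JSON and write it to `filePath`.
// Returns a promise that resolves once the file has been written.
// (saveAsJson is a hypothetical helper name, not from the article.)
function saveAsJson(filePath, data) {
  const text = JSON.stringify(data, null, 2);
  return fs.promises.writeFile(filePath, text, "utf8");
}

// Usage: write a made-up record to a temporary file, then read it back.
const demoPath = path.join(os.tmpdir(), "questions-demo.json");
const demoData = [
  { question: "example question", options: ["a", "b"], answer: 0 },
];

saveAsJson(demoPath, demoData)
  .then(() => fs.promises.readFile(demoPath, "utf8"))
  .then((text) => console.log(JSON.parse(text).length, "record(s) saved"))
  .catch((err) => console.error("save failed:", err));
```

Returning the promise lets the caller decide what to do after the write finishes, instead of nesting further work inside a callback.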

The complete code is as follows:

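const axios = require("axios"); // optional; not used below because the markup was copied by hand
const cheerio = require("cheerio");
const fs = require("fs");

// My HTML structure is roughly as follows; there are many such entries.
// NOTE: the original tags were lost when this article was republished.
// The <ul>/<li> structure and the class names "question" and "option"
// are placeholders (assumptions), not the real markup.
const questions = `
<ul>
  <li>
    <div class="question">举头望明月,__________。</div>
    <div class="option">回首白云低</div>
    <div class="option">低头思故乡</div>
    <div class="option">当春乃发生</div>
    <div class="option">红掌拨清波</div>
  </li>
  <li>
    <div class="question">__________,却话巴山夜雨时。</div>
    <div class="option">何当共剪西窗烛</div>
    <div class="option">在天愿做比翼鸟</div>
    <div class="option">世味年来薄似纱</div>
    <div class="option">两岸青山相对出</div>
  </li>
  <!-- .......... more entries .......... -->
</ul>`;

const $ = cheerio.load(questions);
var arr = [];

// The loop body was also lost; this is one plausible reconstruction
// that collects each question with its list of options.
const items = $("li");
for (var i = 0; i < items.length; i++) {
  const item = items.eq(i);
  arr.push({
    question: item.find(".question").text().trim(),
    options: item
      .find(".option")
      .map((j, el) => $(el).text().trim())
      .get(),
  });
}

// "questions.json" is a placeholder; the original file name was lost.
fs.writeFile("questions.json", JSON.stringify(arr), (err) => {
  if (err) throw err;
  console.log("JSON file saved successfully!");
});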

The data is saved as a JSON file so that it can be uploaded to the cloud through that file.

Notes

When importing a JSON file into the WeChat mini program cloud database, pay attention to the data format. I kept getting a format error, and later found that the expected data is not a JSON array but something like JSON Lines: each record object is separated by a newline (\n) rather than a comma. The JSON file written by Node.js therefore needs a little processing before it can be uploaded successfully.
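That processing step can be sketched as follows. This is a minimal sketch, assuming the file written by Node.js holds a single JSON array of record objects. The function names toJsonLines and convertFile and the sample records are hypothetical; the article does not show the author's own conversion code.

```javascript
const fs = require("fs");

// Convert an array of record objects into JSON Lines text:
// one compact JSON object per line, separated by "\n", with no commas
// between records and no enclosing [ ].
// (toJsonLines is a hypothetical helper name, not from the article.)
function toJsonLines(records) {
  if (!Array.isArray(records)) {
    throw new TypeError("expected an array of records");
  }
  return records.map((record) => JSON.stringify(record)).join("\n");
}

// Read a JSON array file and write it back out as JSON Lines.
// Both paths are supplied by the caller; nothing runs until this is called.
function convertFile(inputPath, outputPath) {
  const records = JSON.parse(fs.readFileSync(inputPath, "utf8"));
  fs.writeFileSync(outputPath, toJsonLines(records), "utf8");
}

// Example with two made-up records:
const sample = [
  { question: "Q1", answer: 1 },
  { question: "Q2", answer: 0 },
];
console.log(toJsonLines(sample));
```

Because JSON.stringify without an indent argument escapes any newline inside a string value, each record is guaranteed to stay on a single line.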


Statement:
This article is reproduced from segmentfault.com. If there is any infringement, please contact admin@php.cn to have it removed.