
javascript - Node.js crawler: how to handle pagination and choose the page language

Website: http://www.everlight.com/news...
Two questions:
1. How do I get the URL of each page?
2. When I open a news item, for example http://www.everlight.com/news..., an English operating system gets the English news and a Chinese operating system gets the Chinese news.
How can I make my Node.js crawler always retrieve the English news?

淡淡烟草味 · 2794 days ago · 650

Replies (3)

  • 巴扎黑, 2017-05-16 13:44:31

    Question closed...

    When the form is posted, several key values are sent in hidden fields. Supplying those fields in your own request should solve the problem.
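
    A minimal sketch of collecting those hidden fields from the page HTML before posting. The sample markup and the `__VIEWSTATE` field are assumptions based on how ASP.NET WebForms pages usually look, not copied from the site, and `extractHiddenFields` is a hypothetical helper name. A regex keeps the example dependency-free; reading the fields from a parsed DOM would be more robust.

    ```javascript
    // Collect name/value pairs of <input type="hidden"> elements from an HTML string.
    function extractHiddenFields(html) {
      const fields = {};
      const inputTags = html.match(/<input\b[^>]*>/gi) || [];
      for (const tag of inputTags) {
        // Only hidden inputs carry the postback state we need to send back.
        if (!/type\s*=\s*["']hidden["']/i.test(tag)) continue;
        const name = /\bname\s*=\s*["']([^"']*)["']/i.exec(tag);
        const value = /\bvalue\s*=\s*["']([^"']*)["']/i.exec(tag);
        if (name) fields[name[1]] = value ? value[1] : "";
      }
      return fields;
    }

    // Hypothetical sample markup, shaped like an ASP.NET WebForms page.
    const sampleHtml = `
      <form method="post" action="./news.aspx" id="aspnetForm">
        <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
        <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123==" />
        <input type="text" name="keyword" value="led" />
      </form>`;

    console.log(extractHiddenFields(sampleHtml));
    ```

    The returned object can then be merged into the body of the POST request.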

  • 世界只因有你, 2017-05-16 13:44:31

    There is a language switch in the upper right corner. If you look at the page source, clicking it calls this function:

    // theForm is the page's <form> element, defined earlier in the page's script
    function __doPostBack(eventTarget, eventArgument) {
        if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
            theForm.__EVENTTARGET.value = eventTarget;
            theForm.__EVENTARGUMENT.value = eventArgument;
            theForm.submit();
        }
    }

    In other words, clicking the switch just submits the form, and the form is sent back to the same page with POST.
    That is why the page flashes after the click but the URL does not change.
    So, if you want the English version, send a POST request with the parameter __EVENTTARGET="ctl00$ctl00$lBtnUSA".
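
    A rough sketch of building that POST body in Node.js. `buildPostBackBody` is a hypothetical helper, and the `__VIEWSTATE` value is a placeholder; in practice the hidden field values would be taken from the page you fetched first.

    ```javascript
    // Build an application/x-www-form-urlencoded body that mimics __doPostBack.
    function buildPostBackBody(hiddenFields, eventTarget, eventArgument = "") {
      // Start from the page's hidden fields, then override the two postback fields.
      const params = new URLSearchParams(hiddenFields);
      params.set("__EVENTTARGET", eventTarget);
      params.set("__EVENTARGUMENT", eventArgument);
      return params.toString();
    }

    // Placeholder hidden field; the event target is the one named in the answer above.
    const postBody = buildPostBackBody(
      { __VIEWSTATE: "abc123==" },
      "ctl00$ctl00$lBtnUSA"
    );
    console.log(postBody);
    ```

    The string would be sent as the request body with the header Content-Type: application/x-www-form-urlencoded.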

    To get the URLs in the page, parse the DOM and read the links:

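    // Note: jsdom.env is the legacy jsdom API (jsdom 9.x and earlier).
    var jsdom = require("jsdom");

    jsdom.env({
      url: "http://www.everlight.com/newsdetail.aspx?pcseq=4&cseq=7&seq=291",
      // Inject jQuery into the page so links can be selected with $("a")
      scripts: ["http://code.jquery.com/jquery.js"],
      done: function (err, window) {
        if (err) {
          console.error(err);
          return;
        }
        var $ = window.$;
        console.log("Links");
        // Print the text and href of every anchor on the page
        $("a").each(function () {
          var tmp = $(this).text() + "---" + $(this).attr("href");
          console.log(tmp);
        });
      }
    });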
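
    The href values printed this way are often relative, so a crawler usually has to turn them into absolute URLs before following them. A small sketch using Node's built-in URL class; `toAbsoluteUrls` is a hypothetical helper and the sample hrefs are made up for illustration.

    ```javascript
    // Resolve hrefs against the page URL, skipping anchors and javascript: links,
    // and drop duplicates.
    function toAbsoluteUrls(hrefs, baseUrl) {
      const seen = new Set();
      for (const href of hrefs) {
        if (!href || href.startsWith("#") || href.startsWith("javascript:")) continue;
        try {
          seen.add(new URL(href, baseUrl).href);
        } catch (e) {
          // Ignore hrefs that cannot be parsed as URLs.
        }
      }
      return [...seen];
    }

    // Made-up hrefs, resolved against the page URL used in the snippet above.
    const absoluteLinks = toAbsoluteUrls(
      ["news.aspx?page=2", "/about.aspx", "#top", "javascript:void(0)", "news.aspx?page=2"],
      "http://www.everlight.com/newsdetail.aspx?pcseq=4&cseq=7&seq=291"
    );
    console.log(absoluteLinks);
    ```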
    

  • 某草草, 2017-05-16 13:44:31

    Look at the headers of the request. One of them can be used to set the language.
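
    The reply does not name the header. Assuming it means the standard Accept-Language header, a sketch of request options that ask for English could look like this. Whether the site actually honors the header is not confirmed here, and `buildRequestOptions` is a hypothetical helper.

    ```javascript
    // Build options for Node's http.request with an explicit Accept-Language header.
    function buildRequestOptions(pageUrl, language) {
      const { hostname, pathname, search } = new URL(pageUrl);
      return {
        hostname,
        path: pathname + search,
        method: "GET",
        headers: { "Accept-Language": language },
      };
    }

    const requestOptions = buildRequestOptions(
      "http://www.everlight.com/newsdetail.aspx?pcseq=4&cseq=7&seq=291",
      "en-US,en;q=0.9"
    );
    console.log(requestOptions);
    ```

    The resulting object can be passed to http.request(requestOptions, callback) from Node's http module.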
