Home >Web Front-end >HTML Tutorial >How Python crawler handles the delayed loading part (delayload_url) in html_html/css_WEB-ITnose
Download the source code of the link "http://s.1688.com/selloffer/industry_offer_search.htm?mixWholesale=true&industryFlag=food&categoryId=1032913&from=industrySearch&n=y&filt=y#_fb_top", the result only contains part of the page; There are a total of 60 products on this page, but only 20 can be parsed from the source code, and the page turning link cannot be found;
Audit the element to find the data source link and directly use that link to obtain the data
Um. . . I don’t know if it’s too late to answer now! This can capture the delayed loading URL address through Firefox, and then you can find the pattern. I happened to be crawling 1688 data and encountered the problem of delayed loading. Then I captured the URL through Firefox and found that I only need to take out the URL in the div sw-delayload-url and add &callback=any character at the end. string, and then change &startIndex= this every time (startIndex=20, startIndex=40), this will return a json data
I tried the url you posted, but I don’t know why no data is returned. , maybe the product has been removed from the shelves. . . You can try what I said
. If you have solved it and have a better method, I hope you can share it with me. Thank you