
python - When scraping a website, why is the real page data only returned after the browser refreshes once?

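I am sending all the required request parameters, the cookies are included, and the headers have been modified, but what comes back is an HTML page that asks me to refresh.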

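The original page is a dynamically loaded waterfall (infinite-scroll) layout: new content keeps appearing as you scroll down. I am currently using the Scrapy framework and would rather not bring in selenium + phantomjs for now, since that is too heavy. Any help is appreciated.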

迷茫, 2820 days ago, 572 views

Replies (2)

  • 黄舟

    黄舟 2017-04-18 10:32:53

    For dynamically loaded data, request it from the AJAX API that the page itself calls, not from the page HTML. As the saying goes, to do a good job you must first sharpen your tools: open the browser developer tools (F12) and find that request in the Network tab.
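
As a rough illustration of that advice, here is a minimal sketch of paging through such an API directly. The endpoint `https://example.com/api/feed`, the `page` and `size` parameters, and the `items` key are all hypothetical; the real ones have to be read from the request the page sends while you scroll. `fetch` is assumed to be any callable that takes a URL and returns the response body as text.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; copy the real one from the Network tab (F12).
API_URL = "https://example.com/api/feed"


def build_page_url(page, size=20, base=API_URL):
    """Return the URL for one page of the feed (parameter names are assumed)."""
    return f"{base}?{urlencode({'page': page, 'size': size})}"


def extract_items(body):
    """Parse a JSON response body and return its list of items.

    The "items" key is an assumption about the response shape.
    """
    return json.loads(body).get("items", [])


def iter_feed(fetch, max_pages=50):
    """Yield items page by page until the API returns an empty page.

    `fetch` takes a URL and returns the response text.
    """
    for page in range(1, max_pages + 1):
        items = extract_items(fetch(build_page_url(page)))
        if not items:
            break
        yield from items
```

In a Scrapy spider the same URLs could be yielded as `scrapy.Request` objects and parsed with `json.loads(response.text)` in the callback, which avoids a headless browser entirely.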

  • 大家讲道理

    大家讲道理 2017-04-18 10:32:53

    This is much easier to solve than scraping through a proxy IP, or a case where cookies are sent but an error still comes back. You can simply check the content of the page you received and, if it is the refresh prompt, simulate a refresh by sending the request again. The important thing is to keep the same session.

    If that still doesn't work, add a Referer header and try again.
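
The check-and-refresh idea above might look roughly like the sketch below. It assumes `session` is a `requests.Session` (or any object with a compatible `get(url, headers=...)` method returning something with a `.text` attribute), so that cookies set by the first response are sent again on the retry. The marker string is hypothetical; replace it with text that appears only in the refresh-prompt page.

```python
import time

# Hypothetical marker: text that appears only in the "please refresh" page.
REFRESH_MARKER = "please refresh"


def looks_like_refresh_page(html, marker=REFRESH_MARKER):
    """Return True if the response is the refresh prompt, not the real page."""
    return marker in html


def fetch_with_refresh(session, url, referer=None, max_tries=3, delay=1.0):
    """Fetch `url` through one session, re-requesting while the prompt comes back.

    Reusing the same session keeps its cookies between attempts, which is
    what makes the second request behave like a browser refresh.
    """
    headers = {"Referer": referer} if referer else {}
    for _ in range(max_tries):
        html = session.get(url, headers=headers).text
        if not looks_like_refresh_page(html):
            return html
        time.sleep(delay)  # brief pause before the simulated refresh
    raise RuntimeError(f"still got the refresh prompt after {max_tries} tries")
```

With the `requests` library this could be called as `fetch_with_refresh(requests.Session(), url, referer=...)`. In Scrapy the equivalent is to leave the cookies middleware enabled and re-yield the request with `dont_filter=True` when the callback sees the prompt.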
