javascript - Why can some websites display their content in a browser while python.requests cannot scrape any data from them?

Question

I want to collect and organize some data from http://app1.sfda.gov.cn/, and I planned to write a small crawler with python.requests to fetch it. However, python.requests keeps raising ('Connection aborted.', ConnectionResetError(54, 'Connection reset by p...

ringa_lee · Answer

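import requests


def testLoadRequest():
    # Browser-like headers, using an IE9 User-Agent string.
    officialHeader = {
        'User-Agent': 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)',
        'Referer': 'http://app1.sfda.gov.cn/datasearch/face3/dir.html',
        'Upgrade-Insecure-Requests': '1',
    }
    officialUrl = 'http://app1.sfda.gov.cn/datasearch/face3/dir.html'
    try:
        # The timeout is an added safeguard so a stalled connection cannot hang forever.
        officialRequest = requests.get(officialUrl, headers=officialHeader, timeout=10)
        print(officialRequest.content)
    except requests.exceptions.RequestException as e:
        # Connection resets and timeouts raised by requests end up here.
        print(e)


testLoadRequest()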

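The code above works for me.
Change the URL.
Did you write your User-Agent across two lines? Also, I switched to an IE9 User-Agent.
What is r in print(r.content)? It is written as r and still raises no error; I am not sure whether that is because of the try block.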

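I kept clicking Run over and over, and the console told me to contact the site administrator. Now I cannot open the site at all, and I do not understand why. Did my three requests per second bring it down?
Come to think of it, a site like this probably does not need headers at all.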
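Since rapid repeated requests seem to have gotten me blocked, spacing them out is probably safer. Below is a minimal sketch of that idea, not something I tested against this site: fetch_politely, fetch_with_requests, and the one-second interval are names and values I made up for illustration, and I do not know what rate the site actually tolerates.

```python
import time


def fetch_politely(urls, fetch, min_interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Call fetch(url) for each URL, leaving at least min_interval
    seconds between the start of consecutive calls."""
    results = []
    last_call = None
    for url in urls:
        if last_call is not None:
            # Wait out whatever remains of the minimum interval.
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.append(fetch(url))
    return results


def fetch_with_requests(url):
    """Hypothetical fetcher built on requests; imported lazily so the
    throttling helper above has no hard dependency on it."""
    import requests
    headers = {
        'User-Agent': 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; '
                      'Win64; x64; Trident/5.0)',
    }
    response = requests.get(url, headers=headers, timeout=10)
    return response.content
```

Usage would look like fetch_politely([officialUrl], fetch_with_requests, min_interval=1.0). Passing the fetch function in as an argument keeps the pacing logic separate from the HTTP call, so the delay can be checked without touching the network.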
