python - scrapy.Request returns 400 for a URL, but the same URL requested with the requests module alone works fine

1. Problem description

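I request a URL with scrapy.Request() and get a 400 error back. I checked that my IP has not been banned: pasting the request URL directly into a browser displays the result normally, and a POST to the same URL with Python's requests module alone also returns a normal response. I don't understand it. Is something in my scrapy.Request call malformed?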

2. Code

This is the Scrapy spider file. The project was generated with the scrapy command-line tool, with the usual default settings and nothing else changed.

# -*- coding: utf-8 -*-
import scrapy


class PileSpider(scrapy.Spider):
    name = "pile"
    allowed_domains = ["www.evehicle.cn"]

    headers = {
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.8',
        'Connection': 'keep-alive',
        'Content-Length': '11',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'Host': 'www.evehicle.cn',
        'Origin': 'http://www.evehicle.cn',
        'Referer': 'http://www.evehicle.cn/wp-content/themes/newsite/html/emap.html',
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
        'X-Requested-With': 'XMLHttpRequest',

    }
    url_all = 'http://www.evehicle.cn/wp-content/themes/newsite/tool/api.php?api=web/sites'

    def start_requests(self):
        print('1')
        yield scrapy.Request(url=self.url_all, callback=self.parse_all, headers=self.headers, method='POST')

    def parse_all(self, response):
        print(response)

3. Error output

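The Scrapy log reports a 400 response for the request, which means the request itself was malformed. Yet I filled in the scrapy.Request() headers exactly as the request headers appear in Chrome's developer tools, so I can't tell what is wrong.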

By contrast, when I request this URL with the requests module alone, using the same headers, it returns the correct data with no HTTP error.

requests.post(url=all_url, headers=headers, timeout=5)
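
One possible cause, not confirmed anywhere in this thread: the headers copied from Chrome declare `Content-Length: 11`, but the `scrapy.Request` above sends no body, so the declared length may not match what actually goes over the wire. Headers such as `Content-Length`, `Host` and `Connection` are normally computed by the HTTP client itself. The sketch below is plain Python and assumes that dropping those entries is the right thing to try; `AUTO_HEADERS` and `strip_auto_headers` are hypothetical names introduced only for illustration.

```python
# Headers that HTTP clients normally compute themselves from the real request.
# Which headers belong in this set is an assumption made for this sketch.
AUTO_HEADERS = {"content-length", "host", "connection"}


def strip_auto_headers(headers):
    """Return a copy of `headers` without the client-computed entries."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in AUTO_HEADERS
    }


# A subset of the headers from the spider above.
browser_headers = {
    "Accept": "application/json, text/javascript, */*; q=0.01",
    "Content-Length": "11",
    "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
    "Host": "www.evehicle.cn",
    "X-Requested-With": "XMLHttpRequest",
}

cleaned = strip_auto_headers(browser_headers)
print(sorted(cleaned))  # ['Accept', 'Content-Type', 'X-Requested-With']
```

The cleaned dict could then be passed as `headers=` to `scrapy.Request`. If the endpoint really expects the 11-byte form payload that Chrome sent, that payload would also have to be supplied, for example through the `body=` argument or `scrapy.FormRequest(formdata=...)`; its contents are not shown in the question.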

Asked by 迷茫, 2876 days ago, 1313 views
