
python - Question about Scrapy pipelines and items

Is it possible to do the following:

  1. data from aItem is processed by aPipeline

  2. data from bItem is processed by bPipeline

PHP中文网 · 2741 days ago

3 replies

  • 天蓬老师 · 2017-04-18 09:51:55

    Is this what you're after?
    For example, suppose your items.py defines several different items.

    Then, in the process_item function in pipelines.py, you can check which item class each item belongs to.

    This way, the different kinds of data can be processed separately.
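    The code snippets this answer originally showed appear to have been lost; below is a minimal runnable sketch of the idea, with the names aItem and bItem taken from the question and the fields and handling invented purely for illustration:

    ```python
    # Stand-ins for the items; in a real project these would be
    # scrapy.Item subclasses with scrapy.Field() attributes.
    class aItem(dict):
        pass

    class bItem(dict):
        pass

    class MyPipeline:
        """One pipeline that dispatches on the item's class."""
        def process_item(self, item, spider):
            if isinstance(item, aItem):
                # aItem-specific handling goes here
                item['handled_by'] = 'a-logic'
            elif isinstance(item, bItem):
                # bItem-specific handling goes here
                item['handled_by'] = 'b-logic'
            return item
    ```

    Scrapy calls process_item for every item a spider yields, so the isinstance check is what routes each item type to its own branch.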

  • 天蓬老师 · 2017-04-18 09:51:55

    You can check in the pipeline which crawler produced the item:

    def process_item(self, item, spider):
        if spider.name == 'news':
            # logic for saving into the News table goes here
            news = News()
            ...  # (some code omitted)
            self.session.add(news)
            self.session.commit()
        elif spider.name == 'bsnews':
            # logic for saving into the BsNews table goes here
            bsnews = BsNews()
            ...  # (some code omitted)
            self.session.add(bsnews)
            self.session.commit()
        return item
    

    For this kind of problem, where multiple crawlers live in one project and different crawlers need different logic in the pipeline, this is the approach the author of Scrapy explained.
    Go and have a look.

  • PHP中文网 · 2017-04-18 09:51:55

    Yes, process_item in a pipeline takes a spider parameter, so each pipeline can filter for the spider it is meant to handle.
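    A minimal sketch of that filter (the pipeline class and spider names here are illustrative assumptions, and the "storage" is a stand-in for real logic):

    ```python
    class NewsPipeline:
        def process_item(self, item, spider):
            # Only handle items produced by the 'news' spider;
            # items from any other spider pass through untouched.
            if spider.name != 'news':
                return item
            item['stored'] = True  # stand-in for real storage logic
            return item
    ```

    Because every enabled pipeline sees every item, this early-return pattern is how one pipeline opts out of items that belong to a different spider.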
