I'm scraping data from the web, and I have a project file imported into my main spider file. When I grab the data, store it in a container, and save it as a CSV, the link column always ends up as the first column in the CSV. How can I set the position of a specific column?
pName = response.css('#search .a-size-medium').css('::text').extract()
pPrice = response.css('#search .a-price-whole').css('::text').extract()
imgs = response.css('.sbv-product-img , .s-image-fixed-height .s-image').css('::attr(src)').extract()
for prod in zip(pName, pPrice, imgs):
    items['prodName'] = prod[0]
    items['price'] = prod[1]
    items['imgLink'] = prod[2]
    yield items
P粉391677921 2024-04-05 10:51:21
Use the FEED_EXPORT_FIELDS setting, either in the settings.py file or in the spider's custom_settings attribute. The columns are written in the order in which you list them in that setting's value.
For example:
class MySpider(scrapy.Spider):
    custom_settings = {
        "FEED_EXPORT_FIELDS": ["prodName", "price", "imgLink"]
    }
Or in settings.py:
FEED_EXPORT_FIELDS = ["prodName", "price", "imgLink"]
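To see why an ordered field list fixes the column positions, here is a minimal sketch that uses only the standard library's csv.DictWriter, not Scrapy's own exporter. The sample rows, the EXPORT_FIELDS name, and the rows_to_csv helper are made up for illustration; the idea is only that the list passed as the field names decides the column order, whatever order the keys were set in.

```python
import csv
import io

# Hypothetical scraped rows; the keys arrive with imgLink first,
# which is not the order we want in the CSV.
rows = [
    {"imgLink": "https://example.com/a.jpg", "prodName": "Widget", "price": "19"},
    {"imgLink": "https://example.com/b.jpg", "prodName": "Gadget", "price": "25"},
]

# Plays the same role as FEED_EXPORT_FIELDS: list order is column order.
EXPORT_FIELDS = ["prodName", "price", "imgLink"]


def rows_to_csv(rows, fields):
    """Write the rows to CSV text with columns in the order given by fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows, EXPORT_FIELDS)
header = csv_text.splitlines()[0]
print(header)
```

Reordering EXPORT_FIELDS and running it again moves the columns accordingly, which is the same effect you get from changing the list in FEED_EXPORT_FIELDS.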
Scrapy documentation: link 1 and link 2