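I have a problem with quoting in my Scrapy output. I am trying to scrape data that contains commas, and this results in double quotes around some of the columns, like this: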
TEST,TEST,TEST,ON,TEST,TEST,"$2,449,000, 4,735 Sq Ft, 6 Bed, 5.1 Bath, Listed 03/01/2016"
TEST,TEST,TEST,ON,TEST,TEST,"$2,895,000, 4,975 Sq Ft, 5 Bed, 4.1 Bath, Listed 01/03/2016"
Only the columns that contain commas get double quotes. How can I double-quote all of the data columns?
I want Scrapy to output:
"TEST","TEST","TEST","ON","TEST","TEST","$2,449,000, 4,735 Sq Ft, 6 Bed, 5.1 Bath, Listed 03/01/2016"
"TEST","TEST","TEST","ON","TEST","TEST","$2,895,000, 4,975 Sq Ft, 5 Bed, 4.1 Bath, Listed 01/03/2016"
Is there a setting I can change to do this?
Posted on 2017-03-10 13:37:28
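By default, for CSV output, Scrapy uses Python's csv.writer() with its default settings.
For field quoting, the default is csv.QUOTE_MINIMAL, which:
Instructs writer objects to only quote those fields which contain special characters such as the delimiter, the quotechar, or any of the characters in the lineterminator.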
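To see the difference between the two quoting modes outside Scrapy, here is a minimal sketch that uses only Python's standard csv module. The row values and the render helper are made up for illustration; the last field contains a double quote to show that it is doubled in both modes.

```python
import csv
import io

# Illustrative row: two plain fields, one with commas, one with a double quote.
row = ["TEST", "ON", "$2,449,000, 4,735 Sq Ft", 'say "hi"']


def render(quoting):
    """Write the row with the given quoting mode and return the CSV text."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting).writerow(row)
    return buf.getvalue()


minimal = render(csv.QUOTE_MINIMAL)  # the csv module's default
quote_all = render(csv.QUOTE_ALL)

print(minimal, end="")    # TEST,ON,"$2,449,000, 4,735 Sq Ft","say ""hi"""
print(quote_all, end="")  # "TEST","ON","$2,449,000, 4,735 Sq Ft","say ""hi"""
```

With QUOTE_MINIMAL only the fields that need quoting are quoted; with QUOTE_ALL every field is.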
However, you can build your own CSV item exporter and define a new dialect based on the default 'excel' dialect.
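Dialects belong to Python's csv module rather than to Scrapy, so the idea can be tried on its own first. The following is a sketch under that assumption: it subclasses csv.excel, overrides only the quoting attribute, and writes two illustrative rows to an in-memory buffer. The variable names are hypothetical.

```python
import csv
import io


class QuoteAllDialect(csv.excel):
    """The stock 'excel' dialect, except that every field is quoted."""
    quoting = csv.QUOTE_ALL


buf = io.StringIO()
writer = csv.writer(buf, dialect=QuoteAllDialect)
writer.writerow(["name", "title"])
writer.writerow(["Some name", "Some title, baby!"])

dialect_output = buf.getvalue()
print(dialect_output, end="")
```

Everything else (delimiter, quote character, line terminator) is inherited unchanged from the 'excel' dialect.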
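For example, in an exporters.py module, define the following:
import csv
from scrapy.exporters import CsvItemExporter

class QuoteAllDialect(csv.excel):
    # Same as the default 'excel' dialect, but quote every field.
    quoting = csv.QUOTE_ALL

class QuoteAllCsvItemExporter(CsvItemExporter):
    def __init__(self, *args, **kwargs):
        # Force the quote-all dialect on the underlying csv.writer.
        kwargs.update({'dialect': QuoteAllDialect})
        super(QuoteAllCsvItemExporter, self).__init__(*args, **kwargs)
Then, reference this item exporter for CSV output in your settings, like this: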
FEED_EXPORTERS = {
    'csv': 'myproject.exporters.QuoteAllCsvItemExporter',
}
A simple spider like this:
import scrapy

class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["example.com"]
    start_urls = ['http://example.com/']

    def parse(self, response):
        yield {
            "name": "Some name",
            "title": "Some title, baby!",
            "description": "Some description, with commas, quotes (\") and all"
        }
will output the following:
"description","name","title"
"Some description, with commas, quotes ("") and all","Some name","Some title, baby!"
https://stackoverflow.com/questions/42658875
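The doubled quotes ("") in the exported description are standard CSV escaping for a literal double quote inside a quoted field, not a sign of corrupted data. As a quick check, this sketch feeds that output (hard-coded here as a string, assuming the \r\n line endings of the 'excel' dialect) back through csv.DictReader.

```python
import csv
import io

# The exported CSV from the answer, reproduced as a literal string.
exported = (
    '"description","name","title"\r\n'
    '"Some description, with commas, quotes ("") and all",'
    '"Some name","Some title, baby!"\r\n'
)

# newline="" leaves line endings untouched so the csv module can handle them.
rows = list(csv.DictReader(io.StringIO(exported, newline="")))
record = rows[0]

print(record["description"])  # Some description, with commas, quotes (") and all
print(record["title"])        # Some title, baby!
```

Any CSV reader that follows the same convention should recover the original values, including the embedded commas and the single double quote.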