Ubuntu 12.04 LTS
Ruby ruby 1.9.3dev (33323-09-23修订版)
i686-linux
Rails 3.2.9
以下是我收到的CSV文件的内容:
"date/time","settlement id","type","order id","sku","description","quantity","marketplace","fulfillment","order city","order state","order postal","product sales","shipping credits","gift wrap credits","promotional rebates","sales tax collected","selling fees","fba fees","other transaction fees","other","total"
"Mar 1, 2013 12:03:54 AM PST","5481545091","Order","108-0938567-7009852","ALS2GL36LED","Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor","1","amazon.com","Amazon","Pasadena","CA","91104-1056","43.00","3.25","0","-3.25","0","-6.45","-3.75","0","0","32.80"
然而,当我试图解析CSV文件时,我得到了错误:
1.9.3dev :016 > options = { col_sep: ",", quote_char:'"' }
=> {:col_sep=>",", :quote_char=>"\""}
1.9.3dev :022 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
from (irb):22
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `'
Then I tried simplifying the data i.e.
"name","age","email"
"jignesh","30","jignesh@example.com"
however still I am getting the same error:
1.9.3dev :023 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
from (irb):23
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `'
Again I tried simplifying the data like this:
name,age,email
jignesh,30,jignesh@example.com
and it works.See the output below:
1.9.3dev :024 > CSV.foreach("/tmp/my_data.csv") { |row| puts row }
name
age
email
jignesh
30
jignesh@example.com
=> nil
But I will be receiving the CSV files having quoted data so removing quotes solution is not actually I am looking for.I am unable to figure out what is causing the error: CSV::MalformedCSVError: Illegal quoting in line 1. in my earlier examples.
I have verified that in the CSV there are no leading/trailing spaces by enabling "Show whitespace characters" and "Show Line Endings" in my text editor.Also I have verified the encoding using following.
1.9.3dev :026 > File.open("/tmp/my_data.csv").read.encoding
=> #
Note: I tried using CSV.read too but same error with that method.
Can anybody please help me getting out of the problem and make me understand where it is going wrong?
=====================
I just found following post at: http://www.ruby-forum.com/topic/448070 and tried following:
file_data = file.read
file_data.gsub!('"', "'")
arr_of_arrs = CSV.parse(file_data)
arr_of_arrs.each do |arr|
Rails.logger.debug "=======#{arr}"
end
and got the following output:
=======["\xEF\xBB\xBF'date/time'", "'settlement id'", "'type'", "'order id'", "'sku'", "'description'", "'quantity'", "'marketplace'", "'fulfillment'", "'order city'", "'order state'", "'order postal'", "'product sales'", "'shipping credits'", "'gift wrap credits'", "'promotional rebates'", "'sales tax collected'", "'selling fees'", "'fba fees'", "'other transaction fees'", "'other'", "'total'"]
=======["'Mar 1", " 2013 12:03:54 AM PST'", "'5481545091'", "'Order'", "'108-0938567-7009852'", "'ALS2GL36LED'", "'Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor'", "'1'", "'amazon.com'", "'Amazon'", "'Pasadena'", "'CA'", "'91104-1056'", "'43.00'", "'3.25'", "'0'", "'-3.25'", "'0'", "'-6.45'", "'-3.75'", "'0'", "'0'", "'32.80'"]
which messed up reading the data properly as the default col_sep used is a comma character.
However I tried using quote_char option like this:
arr_of_arrs = CSV.parse(file_data, :quote_char => "'")
but it ended up the following error:
CSV::MalformedCSVError (Illegal quoting in line 1.):
Thanks,
Jignesh
发布于 2013-09-27 12:09:36
quote_chars = %w(" | ~ ^ & *)
begin
@report = CSV.read(csv_file, headers: :first_row, quote_char: quote_chars.shift)
rescue CSV::MalformedCSVError
quote_chars.empty? ? raise : retry
end
它不是完美的,但在大多数情况下都是有效的。
注:
所采用的参数与
,因此可以使用内存中的文件或数据。
发布于 2015-04-03 04:37:50
Anand,谢谢你的编码建议。这为我解决了非法引用的问题。
注意:如果您希望迭代器跳过标题行,请添加
,如下所示:
CSV.foreach("test.csv", encoding: "bom|utf-8", headers: :first_row)
发布于 2013-10-19 02:30:08
我刚刚遇到了一个类似的问题,我发现CSV不喜欢在col-sep和引号字符之间有空格。一旦我移除了它们,一切都变得很好。所以我有:
12, "N", 12, "Pacific/Majuro"
但是一旦我使用
.gsub(/,\s+\"/,',\"')
导致
12,"N", 12,"Pacific/Majuro"
一切都很顺利。
https://stackoverflow.com/questions/16772830
复制相似问题