Processing CSV files with csv.DictReader is great, but my CSV files contain comment lines (marked by a hash at the start of the line), for example:
# step size=1.61853
val0,val1,val2,hybridisation,temp,smattr
0.206895,0.797923,0.202077,0.631199,0.368801,0.311052,0.688948,0.597237,0.402763
-169.32,1,1.61853,2.04069e-92,1,0.000906546,0.999093,0.241356,0.758644,0.202382
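# adaptation finished
The csv module doesn't include any way to skip such lines.
I could easily do something hacky, but I imagine there's a nice way to wrap csv.DictReader around some other iterator object that preprocesses the input to discard those lines.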
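Posted on 2018-05-30 04:13:21
Good question. Python's csv library lacks basic support for comments, which are not uncommon at the top of CSV files. Dan Stowell's solution works for the OP's specific case, but it is limited in that the # must be the first character of the line. A more general solution is: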
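import csv

def decomment(csvfile):
    """Yield each line with any '#' comment removed, skipping empty lines."""
    for row in csvfile:
        # Keep only the text before the first '#', then trim whitespace.
        raw = row.split('#')[0].strip()
        if raw:
            yield raw

with open('dummy.csv') as csvfile:
    reader = csv.reader(decomment(csvfile))
    for row in reader:
        print(row)
Take the following dummy.csv file as an example: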
# comment
# comment
a,b,c # comment
1,2,3
10,20,30
# comment
This returns:
['a', 'b', 'c']
['1', '2', '3']
['10', '20', '30']
Of course, this works with csv.DictReader() as well.
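A rough sketch of the csv.DictReader case, reusing the same decomment generator. The in-memory SAMPLE text and the read_records helper are made up for illustration and stand in for dummy.csv; they are not part of the csv module.

```python
import csv
import io


def decomment(lines):
    """Yield each line with any '#' comment removed, skipping empty lines."""
    for line in lines:
        # Keep only the text before the first '#', then trim whitespace.
        raw = line.split('#')[0].strip()
        if raw:
            yield raw


# Hypothetical stand-in for dummy.csv, held in memory for the example.
SAMPLE = """\
# comment
 # comment
a,b,c # comment
1,2,3
10,20,30
# comment
"""


def read_records(lines):
    """Return the non-comment rows as dicts keyed by the header row."""
    # The first line that survives decomment() becomes the header.
    return list(csv.DictReader(decomment(lines)))


records = read_records(io.StringIO(SAMPLE))
for record in records:
    print(record)
```

One caveat: the split happens on the raw line before the csv module parses it, so a # inside a quoted field is cut off as well. If your data can contain # in values, filtering only the lines that start with # is the safer choice.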
https://stackoverflow.com/questions/14158868