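I am trying to read a CSV file and create a new file that contains only the data I am interested in. In some rows, certain values (in the age and gender columns) are marked as -1, so those rows are not wanted in the new CSV file. Should I rewrite this with the Pandas library? I am also trying to drop the old id (because some rows will be skipped) and number the remaining rows with a new id.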
import csv

data = []


def transform_row(row):
    # id = new count
    age = row[2]
    gender = row[3]
    url = row[4]
    return [
        # new count
        age,
        gender,
        url
    ]


# read csv file line by line
with open('data_sample.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    """ bad try at ignoring the line with value -1
    for value in reader:
        if value == '-1':
            pass
        else:
            continue
    """
    # loop through each line in csv and transform
    for line in reader:
        data.append(transform_row(line))

# write a new csv file
with open('data_test.csv', 'w', newline='') as f:
    # define new csv writer
    writer = csv.writer(f, delimiter=',')
    # write a header row to our output.csv file
    writer.writerow([
        # 'id', - new line count as id
        'age',
        'gender',
        'url'
    ])
    # write our data to the file
    writer.writerows(data)
Any other suggestions are also welcome.
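For reference, the filtering and renumbering described above could be sketched with the csv module alone; Pandas is not required for it. This is a minimal sketch, not a definitive implementation. The helper name `filter_and_renumber`, the in-memory sample data, the presence of a header row, and the name of the second column are all assumptions made for illustration; only the column positions 2, 3 and 4 (age, gender, url) come from the code in the question.

```python
import csv
import io


def filter_and_renumber(rows, skip_value="-1"):
    """Yield [new_id, age, gender, url] for rows worth keeping.

    Hypothetical helper: a row is dropped when its age or gender
    field equals skip_value; kept rows are numbered from 1.
    """
    new_id = 0
    for row in rows:
        # Column positions taken from the question's transform_row.
        age, gender, url = row[2].strip(), row[3].strip(), row[4].strip()
        if skip_value in (age, gender):
            continue  # skip rows marked with -1
        new_id += 1
        yield [new_id, age, gender, url]


# Made-up sample standing in for data_sample.csv (assumed layout,
# including an assumed header row and an assumed "other" column).
sample = io.StringIO(
    "id,other,age,gender,url\n"
    "1,a,25,f,http://example.com/1\n"
    "2,b,-1,m,http://example.com/2\n"
    "3,c,31,-1,http://example.com/3\n"
    "4,d,40,m,http://example.com/4\n"
)

reader = csv.reader(sample)
next(reader)  # skip the assumed header row
result = list(filter_and_renumber(reader))

# Write the kept rows, with the new ids, to an in-memory CSV.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["id", "age", "gender", "url"])
writer.writerows(result)
print(out.getvalue())
```

To run this against real files, the two `io.StringIO` objects would be replaced by `open('data_sample.csv', newline='')` and `open('data_test.csv', 'w', newline='')`. If the input file has no header row, the `next(reader)` call should be dropped.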
https://stackoverflow.com/questions/54770325