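Long story short: my output CSV file contains data in the wrong columns!
I have hundreds to thousands of files to process every day. For testing purposes, I am using just one file.
Each file is a dimensional inspection of a part, and each part has multiple inspected features.
I create a blank output CSV file with headers.
I loop through each file:
- I loop through each feature in that file
- I append each feature's data to a dataframe
- I append that dataframe to the CSV file
Then I move on to the next file.
The data is in the correct columns at every step of the process except in the resulting CSV file. The headers in that file appear to be correct, but the data is not in the right order, and I am not sure how to correct this.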
#=========================================================================================
#Start of program Setup
#=========================================================================================
consolidatedHeaders = ['Part_Number', 'Inspection_Routine', 'Inspection_Machine', 'Serial_Number', 'Timestamp', 'Feature', \
'LTL', 'Nominal', 'UTL', 'Observation', 'Tool', 'Machine', 'Rework', 'manual_intervention']
#=========================================================================================
#Creating the Consolidated file
#For my testing purposes, this file will not exist and is created with every execution of code
#=========================================================================================
consolidatedFileName = os.path.join(consolidatedPath, YYYY + MM + '.csv')
if not os.path.exists(consolidatedFileName): #Need to create a consolidated File
    with open(consolidatedFileName, 'w') as c:
        #Write the header row without padding spaces so the column names stay clean
        c.write(','.join(consolidatedHeaders))
        c.write('\n')
        #No explicit close needed, the with block closes the file
#=========================================================================================
#Bits of irrelevant code here finding files that are candidates for import and consolidation
#=========================================================================================
#=========================================================================================
#We've now found a file, lets go through it and add each feature and accompanying data
#to a dataframe
#=========================================================================================
df = pd.DataFrame()#consolidatedHeaders, header=0) # This will hold the ingested data, adding headers here caused problems downstream
for each do something: #not actually in code, just showing that this is a loop
    #=========================================================================================
    #Loop through every feature within the file and append their data to df
    #=========================================================================================
    observationInfo = {'Part_Number' : partNumber,
                       'Inspection_Routine' : inspectionRoutine,
                       'Inspection_Machine' : inspectionMachine,
                       'Serial_Number' : serialNumber,
                       'Timestamp' : creationDate,
                       'Feature' : nrpSplit[0][1:],
                       'Nominal' : nrpSplit[1].strip(),
                       'UTL' : nrpSplit[2].strip(),
                       'LTL' : nrpSplit[3].strip(),
                       'Observation' : nrpLines[whileIndex + 1].strip(),
                       'Tool' : plhFeatureDF.get_value(0, 15, takeable=True),
                       'Machine' : plhFeatureDF.get_value(0, 16, takeable=True),
                       'Rework' : plhFeatureDF.get_value(0, 17, takeable=True),
                       'manual_intervention' : plhFeatureDF.get_value(0, 18, takeable=True)
                       }
    df = df.append(observationInfo, ignore_index=True)
#=========================================================================================
#df is now complete with each feature from the prior loop
#The following 'print(df.head())' shows the data in the correct columns exactly as it
#should be in the output file
#=========================================================================================
print(df.head()) #this shows the data exactly the way it should be
#=========================================================================================
#add df with ingested data to our consolidated .csv file
#=========================================================================================
with open(consolidatedFileName, 'a+') as consolidate:
    df.to_csv(consolidate, header=False, index=False)
#=========================================================================================
#Problem!! When opening the resulting .csv file, the data is all in the wrong columns. The
#columns are not in the order of "consolidatedHeaders" that I used to create the file or
#in the order of df which is appended to the file.
#=========================================================================================
Posted on 2020-04-23 02:28:09
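I have solved this problem. I needed to change
df = pd.DataFrame()#consolidatedHeaders, header=0)
to
df = pd.DataFrame(columns=consolidatedHeaders)
This fixes the column order of the dataframe even while it is still empty, so when I append rows to it, the data is appended in the correct order.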
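The effect of declaring the columns up front can be sketched in isolation. This is a minimal illustration, not the original program: the shortened `HEADERS` list, the `rows_to_csv_text` helper, and the sample values are all made up for the example. It also builds the dataframe from a list of dicts in one call instead of calling `df.append` inside the loop, because `DataFrame.append` was removed in pandas 2.0.

```python
import io

import pandas as pd

# Shortened, hypothetical header list standing in for consolidatedHeaders
HEADERS = ['Part_Number', 'Feature', 'LTL', 'Nominal', 'UTL', 'Observation']


def rows_to_csv_text(rows, headers=HEADERS):
    """Return CSV text (no header row) with the columns in `headers` order."""
    # columns= pins the column order regardless of the key order in each dict
    df = pd.DataFrame(rows, columns=headers)
    buffer = io.StringIO()
    df.to_csv(buffer, header=False, index=False)
    return buffer.getvalue()


# Keys are deliberately in a different order than HEADERS (made-up sample values)
sample_rows = [
    {'Part_Number': 'P-1', 'Feature': 'F1', 'Nominal': '10.0',
     'UTL': '10.1', 'LTL': '9.9', 'Observation': '10.02'},
]

csv_text = rows_to_csv_text(sample_rows)
print(csv_text)
```

Collecting the per-feature dicts in a plain list and building the dataframe once per file also avoids copying the dataframe on every appended row.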
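For the record, on that first line (the one that did not work), I originally tried:
df = pd.DataFrame(consolidatedHeaders, header=0)
That did not work either: the DataFrame constructor has no header argument, and its first positional argument is the data, not the column names.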
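As a complementary safeguard, the column order can also be enforced at write time through the `columns` parameter of `to_csv`, so the file stays aligned with its header row even if the dataframe's own column order drifts. The sketch below is an assumption-laden illustration rather than the original code: `append_to_consolidated`, the shortened `HEADERS` list, the temporary demo path, and the sample values are hypothetical.

```python
import csv
import os
import tempfile

import pandas as pd

# Shortened, hypothetical header list standing in for consolidatedHeaders
HEADERS = ['Part_Number', 'Feature', 'LTL', 'Nominal', 'UTL']


def append_to_consolidated(df, path, headers=HEADERS):
    """Append df to the CSV at `path`, writing the header row only once."""
    write_header = not os.path.exists(path)
    # columns= fixes the output order regardless of the dataframe's own order
    df.to_csv(path, mode='a', columns=headers, header=write_header, index=False)


# Columns deliberately out of order to mimic the problem in the question
scrambled = pd.DataFrame([{'UTL': '10.1', 'Feature': 'F1', 'Nominal': '10.0',
                           'LTL': '9.9', 'Part_Number': 'P-1'}])

demo_path = os.path.join(tempfile.mkdtemp(), 'consolidated_demo.csv')
append_to_consolidated(scrambled, demo_path)  # creates the file with headers
append_to_consolidated(scrambled, demo_path)  # appends without repeating them

# Read the file back to check that every value sits under the right header
with open(demo_path, newline='') as f:
    records = list(csv.DictReader(f))
print(records)
```

Letting `to_csv` write the header on the first call also removes the need to create the file by hand beforehand.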
https://stackoverflow.com/questions/61192360