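I have written code that successfully reads and merges multiple CSV files from a folder and plots the data. All files have the same columns and headers but may have different numbers of rows. Here is my code: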
import glob
import os

import matplotlib.pyplot as plt
import pandas as pd


def get_merged_csv(flist, **kwargs):
    # Read every CSV in flist and stack the rows into one DataFrame
    return pd.concat([pd.read_csv(f, **kwargs) for f in flist], ignore_index=True)


# Raw string so the backslashes in the Windows path are not treated as escapes
path = r'C:\Users\C253271\Desktop\FTIR Data\Data Files'  # define path
allfiles = glob.glob(os.path.join(path, "*.csv"))
column_names = ['Relative Time', 'Peakat2188', 'water']
data = get_merged_csv(allfiles, index_col=None)
data.columns = column_names

# Convert the HH:MM:SS strings to minutes for the x-axis
time_in_minutes = pd.to_timedelta(data['Relative Time']).dt.total_seconds() / 60
x = time_in_minutes
y1 = data['Peakat2188']
y2 = data['water']

fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(x, y1, label='Peak at 2188', color='b')
ax1.plot(x, y2, label='water', color='r')
ax1.set_ylabel('Volume Fraction', fontsize=10)
ax1.set_xlabel('Absolute time (mins)', fontsize=10)
plt.title('SVC-Evaporator Monitoring', fontsize=20)
ax1.legend(bbox_to_anchor=(0.8, 1.02), loc=3, borderaxespad=0.)
Here is the data from my three files.
FTIR Data1.csv
Relative Time,Peak at 2188 ,water
00:00:51,0.572157,0.179023
00:02:51,0.520037,0.171217
00:04:51,0.551843,0.221285
00:06:50,0.566279,0.209182
FTIR Data2.csv
Relative Time,Peak at 2188 ,water
00:00:45,0.522157,0.169023
00:02:31,0.470037,0.161217
00:04:36,0.501843,0.211285
00:06:20,0.516279,0.199182
00:08:45,-0.027304,0.0061351
FTIR Data3.csv
Relative Time,Peak at 2188,water
00:00:51,0.622157,0.199023
00:02:51,0.570037,0.191217
00:04:51,0.601843,0.241285
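I want to plot the data from all the CSV files on a single plot with absolute time on the x-axis, which I am able to do. When I merge my data it looks like the table shown here, but starting from the second file I want to add each new time to the previous end time. For example, the time in row 3 is the last time of the first file, and I want to add it to row 4, which is the first time of the second file. So the time should now start at (00:06:50 + 00:00:45 = 00:07:35), then this time is added to row 5 of the same file (00:07:35 + 00:02:31 = 00:10:06), and so on. The goal is to plot the data from the three files as one continuous series. I hope this is not a big problem; if anyone can quickly add something to my code to help, I would appreciate it. Thanks a million.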
Merged data from 3 files
    Relative Time  Peakat2188     water
 0       00:00:51    0.572157  0.179023
 1       00:02:51    0.520037  0.171217
 2       00:04:51    0.551843  0.221285
 3       00:06:50    0.566279  0.209182
 4       00:00:45    0.522157  0.169023
 5       00:02:31    0.470037  0.161217
 6       00:04:36    0.501843  0.211285
 7       00:06:20    0.516279  0.199182
 8       00:08:45   -0.027304  0.006135
 9       00:00:51    0.622157  0.199023
10       00:02:51    0.570037  0.191217
11       00:04:51    0.601843  0.241285
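The arithmetic in the example can be checked with pandas `Timedelta` values. The sketch below is only an illustration with made-up variable names; it shows that the two steps of the example correspond to two slightly different readings: shifting every row of the second file by the end time of the first file, or adding each time to the previous shifted row. Which reading is intended is an assumption the reader has to settle.

```python
import pandas as pd

# End time of the first file and the first two times of the second file
end_of_file1 = pd.to_timedelta("00:06:50")
file2_times = pd.to_timedelta(["00:00:45", "00:02:31"])

# Reading 1: shift every row of file 2 by the end time of file 1
shifted = file2_times + end_of_file1
print(shifted[0])  # 0 days 00:07:35
print(shifted[1])  # 0 days 00:09:21

# Reading 2: add each time to the previous shifted row (a running sum)
running = end_of_file1 + file2_times[0] + file2_times[1]
print(running)  # 0 days 00:10:06
```

Both readings agree on the first row of the second file (00:07:35) and differ from the second row onward.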
Posted on 2017-07-14 10:36:39
Is something like this what you are looking for?
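from pathlib import Path

import pandas as pd


def read_csv_files(csv_files):
    """Yield one DataFrame per CSV file, with uniform column names."""
    for file in csv_files:
        df = pd.read_csv(file, index_col=None)
        df.columns = ['Relative Time', 'Peakat2188', 'water']
        # Parse the HH:MM:SS strings so they can be added to a Timedelta
        df['Relative Time'] = pd.to_timedelta(df['Relative Time'])
        yield df


def correct_dataframes(dfs):
    """Shift each file's times by the end time of the previous file."""
    last_time = pd.Timedelta(0)
    for df in dfs:
        df['Relative Time'] += last_time
        last_time = df['Relative Time'].iloc[-1]
        yield df, last_time


data_dir = Path(<data_dir>)  # placeholder: substitute the folder that holds the CSV files
pattern = '*.csv'
files = sorted(data_dir.glob(pattern))  # sorted so the files are chained in a predictable order
dfs = read_csv_files(files)
df_list, end_times = zip(*correct_dataframes(dfs))
df = pd.concat(df_list, ignore_index=True)
df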
    Relative Time  Peakat2188     water
 0       00:00:51    0.572157  0.179023
 1       00:02:51    0.520037  0.171217
 2       00:04:51    0.551843  0.221285
 3       00:06:50    0.566279  0.209182
 4       00:07:35    0.522157  0.169023
 5       00:09:21    0.470037  0.161217
 6       00:11:26    0.501843  0.211285
 7       00:13:10    0.516279  0.199182
 8       00:15:35   -0.027304  0.006135
 9       00:16:26    0.622157  0.199023
10       00:18:26    0.570037  0.191217
11       00:20:26    0.601843  0.241285
end_times
(Timedelta('0 days 00:06:50'),
Timedelta('0 days 00:15:35'),
Timedelta('0 days 00:20:26'))
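To try the approach without touching the file system, here is a minimal self-contained sketch that applies the same per-file offset to inline copies of the three sample files from the question and then converts the result to minutes, as the question's plotting code does. The `FILE1`..`FILE3` strings, the `load` helper and the use of `io.StringIO` are stand-ins for illustration, not part of the original answer.

```python
import io

import pandas as pd

COLUMNS = ["Relative Time", "Peakat2188", "water"]

# Inline copies of the sample data (stand-ins for the real CSV files)
FILE1 = """Relative Time,Peak at 2188 ,water
00:00:51,0.572157,0.179023
00:02:51,0.520037,0.171217
00:04:51,0.551843,0.221285
00:06:50,0.566279,0.209182
"""
FILE2 = """Relative Time,Peak at 2188 ,water
00:00:45,0.522157,0.169023
00:02:31,0.470037,0.161217
00:04:36,0.501843,0.211285
00:06:20,0.516279,0.199182
00:08:45,-0.027304,0.0061351
"""
FILE3 = """Relative Time,Peak at 2188,water
00:00:51,0.622157,0.199023
00:02:51,0.570037,0.191217
00:04:51,0.601843,0.241285
"""


def load(text):
    # header=0 together with names= replaces the header row with COLUMNS
    df = pd.read_csv(io.StringIO(text), header=0, names=COLUMNS)
    df["Relative Time"] = pd.to_timedelta(df["Relative Time"])
    return df


frames = [load(text) for text in (FILE1, FILE2, FILE3)]

# Shift each file by the (already shifted) end time of the file before it
offset = pd.Timedelta(0)
shifted_frames = []
for frame in frames:
    frame = frame.assign(**{"Relative Time": frame["Relative Time"] + offset})
    offset = frame["Relative Time"].iloc[-1]
    shifted_frames.append(frame)

merged = pd.concat(shifted_frames, ignore_index=True)

# Continuous time axis in minutes, ready to pass to ax1.plot(...)
minutes = merged["Relative Time"].dt.total_seconds() / 60

print(merged["Relative Time"].iloc[-1])  # 0 days 00:20:26
```

The `minutes` series can be used as `x` in the question's plotting code, with `merged['Peakat2188']` and `merged['water']` as the y-values.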
https://stackoverflow.com/questions/45079760