我想从一个目录中读取几个csv文件到pandas中,并将它们连接到一个大的DataFrame中。不过,我还没能弄明白这一点。这是我到目前为止所知道的:
import glob
import pandas as pd
# get data file names
path =r'C:\DRO\DCL_rawdata_files'
filenames = glob.glob(path + "/*.csv")
dfs = []
for filename in filenames:
dfs.append(pd.read_csv(filename))
# Concatenate all data into one DataFrame
big_frame = pd.concat(dfs, ignore_index=True)
我想我在for循环中需要一些帮助?
发布于 2014-01-20 19:29:19
如果您的所有csv
文件,然后您可以尝试下面的代码。我已经添加了header=0
所以在读完之后csv
可以将第一行指定为列名。
import pandas as pd
import glob
path = r'C:\DRO\DCL_rawdata_files' # use your path
all_files = glob.glob(path + "/*.csv")
li = []
for filename in all_files:
df = pd.read_csv(filename, index_col=None, header=0)
li.append(df)
frame = pd.concat(li, axis=0, ignore_index=True)
发布于 2016-04-05 10:47:11
darindaCoder‘s答案的替代方法:
path = r'C:\DRO\DCL_rawdata_files' # use your path
all_files = glob.glob(os.path.join(path, "*.csv")) # advisable to use os.path.join as this makes concatenation OS independent
df_from_each_file = (pd.read_csv(f) for f in all_files)
concatenated_df = pd.concat(df_from_each_file, ignore_index=True)
# doesn't create a list, nor does it append to one
发布于 2017-02-22 00:25:57
import glob
import os
import pandas as pd
df = pd.concat(map(pd.read_csv, glob.glob(os.path.join('', "my_files*.csv"))))
https://stackoverflow.com/questions/20906474
复制相似问题