I'm writing about 6 million rows of data to a file, and it takes a long time. How can I speed this up?
Here are the two approaches I have tried:
import numpy as np
import time

test_data = np.random.rand(6000000, 12)

# Method 1: np.savetxt
T1 = time.time()
np.savetxt('test', test_data, fmt='%.4f', delimiter=' ')
T2 = time.time()
print("Time:", T2 - T1, "Sec")

# Method 2: nested loops with one write() call per value
file3 = open('test2', 'w')
for i in range(6000000):
    for j in range(12):
        file3.write('%6.4f\t' % (test_data[i][j]))
    file3.write('\n')
file3.close()
T3 = time.time()
print("Time:", T3 - T2, "Sec")
Time: 56.6293179989 Sec
Time: 115.468323946 Sec
I have at least 100 files like this to process, so the total time adds up; please help. Also, I'm not writing .npy or a compressed format because I need to read the files in MATLAB for further processing.
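As an aside, much of the nested-loop cost comes from millions of tiny write() calls. A rough sketch of a faster pure-text variant that formats each row once and writes a single buffer (the filename 'test_fast.txt' and the smaller array size are illustrative assumptions):

```python
import numpy as np

# Smaller array for illustration; the idea scales to 6,000,000 rows.
test_data = np.random.rand(1000, 12)

# Build each row string once, then write the whole buffer in one call,
# instead of one write() per value.
lines = ('\t'.join('%6.4f' % v for v in row) for row in test_data)
with open('test_fast.txt', 'w') as f:
    f.write('\n'.join(lines))
    f.write('\n')
```

The output format matches the nested-loop version (tab-separated, 4 decimal places), so MATLAB can still read it with dlmread or readmatrix.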
Posted on 2018-07-18 08:25:22
Have you considered h5py?
Here is a rough single-run timing comparison:
>>> import numpy as np
>>> import time
>>> import h5py
>>> test_data = np.random.rand(6000000,12)
>>> file = h5py.File('arrays.h5', 'w')
>>> %time file.create_dataset('test_data', data=test_data, dtype=test_data.dtype)
CPU times: user 1.28 ms, sys: 224 ms, total: 225 ms
Wall time: 280 ms
<HDF5 dataset "test_data": shape (6000000, 12), type "<f8">
>>> %time np.savetxt('test',test_data, fmt='%.4f', delimiter=' ' )
CPU times: user 24.4 s, sys: 617 ms, total: 25 s
Wall time: 26.3 s
>>> file.close()
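For completeness, a sketch of round-tripping the dataset with h5py (MATLAB R2011a and later can also read HDF5 files directly via h5read); the small array size here is only for illustration:

```python
import numpy as np
import h5py

# Write a small dataset, then read it back to verify the round trip.
data = np.random.rand(100, 12)
with h5py.File('arrays.h5', 'w') as f:
    f.create_dataset('test_data', data=data)

with h5py.File('arrays.h5', 'r') as f:
    loaded = f['test_data'][:]   # read the full dataset into a NumPy array

assert np.array_equal(loaded, data)
```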
Posted on 2018-07-18 08:12:45
What about pickle? I found it to be much faster.
import numpy as np
import time
import pickle

test_data = np.random.rand(1000000, 12)

# Method 1: np.savetxt
T1 = time.time()
np.savetxt('testfile', test_data, fmt='%.4f', delimiter=' ')
T2 = time.time()
print("Time:", T2 - T1, "Sec")

# Method 2: nested loops
file3 = open('testfile', 'w')
for i in range(test_data.shape[0]):
    for j in range(test_data.shape[1]):
        file3.write('%6.4f\t' % (test_data[i][j]))
    file3.write('\n')
file3.close()
T3 = time.time()
print("Time:", T3 - T2, "Sec")

# Method 3: pickle
file3 = open('testfile', 'wb')
pickle.dump(test_data, file3)
file3.close()
T4 = time.time()
print("Time:", T4 - T3, "Sec")
# load data
file4 = open('testfile', 'rb')
obj = pickle.load(file4)
file4.close()
print(obj)
The output is:
Time: 9.1367928981781 Sec
Time: 16.366491079330444 Sec
Time: 0.41736602783203125 Sec
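Note that MATLAB cannot read pickle files, so if MATLAB compatibility is the hard constraint, a raw binary dump via ndarray.tofile is another fast option; MATLAB can read it back with fread. A minimal sketch (filenames and the smaller array size are assumptions for illustration):

```python
import numpy as np

data = np.random.rand(1000, 12)

# Raw little-endian float64 bytes, row-major; no text formatting at all.
data.tofile('test.bin')

# In MATLAB (data was written row-major, so read transposed):
#   fid = fopen('test.bin', 'r');
#   A = fread(fid, [12, 1000], 'double')';
#   fclose(fid);

# Round-trip check in Python:
loaded = np.fromfile('test.bin', dtype=np.float64).reshape(1000, 12)
assert np.array_equal(loaded, data)
```

Unlike the text formats, this keeps full float64 precision rather than truncating to four decimal places.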
https://stackoverflow.com/questions/51391713