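I tried using numpy.fromfile() because it is fast, but it only reads the first line. How can I read the whole table? I don't want to use pandas or numpy.loadtxt().
np.fromfile('abc.txt', count=-1, sep=",")
Posted on 2016-09-28 17:22:31
I can read a space-delimited, multi-line file: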
In [312]: cat mytest.txt
0.26 0.63 0.97 1.01 0.42
1.66 1.54 1.07 2.13 1.44
2.57 2.73 2.45 2.47 2.29
3.75 3.91 3.37 3.32 4.32
4.27 4.33 4.05 4.21 4.48
0.37 0.58 0.07 0.59 0.48
2.17 1.99 1.61 1.30 2.09
2.82 2.08 2.39 2.48 2.51
3.12 3.36 2.76 3.62 3.25
4.24 4.97 4.51 4.25 4.65
0.42 0.03 0.29 0.10 0.46
1.11 2.05 1.40 1.86 1.36
2.07 2.16 2.81 2.47 2.37
3.65 3.25 3.60 3.23 3.80
4.23 3.75 4.67 4.34 4.78
In [313]: np.fromfile('mytest.txt',count=-1,dtype=float,sep=' ')
Out[313]:
array([ 0.26, 0.63, 0.97, 1.01, 0.42, 1.66, 1.54, 1.07, 2.13,
1.44, 2.57, 2.73, 2.45, 2.47, 2.29, 3.75, 3.91, 3.37,
3.32, 4.32, 4.27, 4.33, 4.05, 4.21, 4.48, 0.37, 0.58,
0.07, 0.59, 0.48, 2.17, 1.99, 1.61, 1.3 , 2.09, 2.82,
2.08, 2.39, 2.48, 2.51, 3.12, 3.36, 2.76, 3.62, 3.25,
4.24, 4.97, 4.51, 4.25, 4.65, 0.42, 0.03, 0.29, 0.1 ,
0.46, 1.11, 2.05, 1.4 , 1.86, 1.36, 2.07, 2.16, 2.81,
2.47, 2.37, 3.65, 3.25, 3.6 , 3.23, 3.8 , 4.23, 3.75,
4.67, 4.34, 4.78])
The newline is treated as just another whitespace character.
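Because the newlines are consumed as whitespace, the result is a flat 1-D array, so the row structure has to be restored by hand. A minimal sketch, assuming the column count (5 here) is known in advance; the file name demo_ws.txt and the three sample rows are only for illustration:

```python
import os
import tempfile

import numpy as np

# Hypothetical sample file holding the first three rows of the table above.
rows = [
    "0.26 0.63 0.97 1.01 0.42",
    "1.66 1.54 1.07 2.13 1.44",
    "2.57 2.73 2.45 2.47 2.29",
]
path = os.path.join(tempfile.mkdtemp(), "demo_ws.txt")
with open(path, "w") as f:
    f.write("\n".join(rows))

# fromfile returns every number in one flat array, row after row.
flat = np.fromfile(path, count=-1, dtype=float, sep=" ")

ncols = 5  # assumption: the number of columns is known beforehand
table = flat.reshape(-1, ncols)  # -1 lets NumPy infer the row count
print(table.shape)
```

The reshape raises a ValueError if the total number of values is not a multiple of ncols, which is a cheap sanity check against ragged rows.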
However, with a comma-delimited file, reading does not continue across line boundaries.
In [315]: cat test.txt
-0.22424938, 0.16117005, -0.39249256
-0.22424938, 0.16050598, -0.39249256
-0.22424938, 0.15984190, -0.39249256
0.09214371, -0.26184322, -0.39249256
0.09214371, -0.26250729, -0.39249256
0.09214371, -0.26317136, -0.39249256
In [316]: np.fromfile('test.txt',count=-1,dtype=float,sep=',')
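Out[316]: array([-0.22424938, 0.16117005, -0.39249256])
loadtxt and genfromtxt are designed for tabular data. Yes, they are slower, because they read the file line by line, but they are much more flexible. pandas has a faster CSV reader.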
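For the comma-delimited case, here is a minimal sketch of the loadtxt route, followed by one possible workaround that avoids loadtxt by splitting the text manually. The file name demo_csv.txt and the two sample rows are assumptions for illustration, and the manual split is my own suggestion, not something fromfile does for you:

```python
import os
import tempfile

import numpy as np

# Hypothetical sample file with two rows in the same layout as test.txt.
path = os.path.join(tempfile.mkdtemp(), "demo_csv.txt")
with open(path, "w") as f:
    f.write("-0.22424938, 0.16117005, -0.39249256\n"
            "0.09214371, -0.26184322, -0.39249256\n")

# loadtxt keeps the rows and columns once it is told the delimiter.
table = np.loadtxt(path, delimiter=",")

# Workaround: turn newlines into commas, then convert every field at once.
with open(path) as f:
    fields = f.read().strip().replace("\n", ",").split(",")
flat = np.array([float(x) for x in fields])  # float() ignores the padding spaces
table2 = flat.reshape(-1, 3)  # assumption: 3 columns, as in test.txt

print(table.shape, table2.shape)
```

Whether the manual split is actually faster than loadtxt depends on the file size and the NumPy version, so it is worth timing both on your own data.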
A speed test on the whitespace-delimited file:
In [319]: timeit np.loadtxt('mytest.txt')
1000 loops, best of 3: 623 µs per loop
In [320]: timeit np.fromfile('mytest.txt',count=-1,dtype=float,sep=' ')
The slowest run took 4.90 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 174 µs per loop
https://stackoverflow.com/questions/39752250