为什么numpy的einsum比numpy的内置函数快?

内容来源于 Stack Overflow,并遵循CC BY-SA 3.0许可协议进行翻译与使用

  • 回答 (2)
  • 关注 (0)
  • 查看 (209)

三个数组dtype=np.double。使用numpy 1.7.1在英特尔CPU上执行计时icc与英特尔的mkl。使用Numpy 1.6.1编译的AMD CPUgcc,mkl也用于验证时间。请注意,时间刻度与系统大小几乎是线性的,而不是由于numpy函数。if这些差异将以微秒而不是毫秒的形式显示:

arr_1D=np.arange(500,dtype=np.double)
large_arr_1D=np.arange(100000,dtype=np.double)
arr_2D=np.arange(500**2,dtype=np.double).reshape(500,500)
arr_3D=np.arange(500**3,dtype=np.double).reshape(500,500,500)

首先让我们看看np.sum函数

np.all(np.sum(arr_3D)==np.einsum('ijk->',arr_3D))
True

%timeit np.sum(arr_3D)
10 loops, best of 3: 142 ms per loop

%timeit np.einsum('ijk->', arr_3D)
10 loops, best of 3: 70.2 ms per loop

powers:

np.allclose(arr_3D*arr_3D*arr_3D,np.einsum('ijk,ijk,ijk->ijk',arr_3D,arr_3D,arr_3D))
True

%timeit arr_3D*arr_3D*arr_3D
1 loops, best of 3: 1.32 s per loop

%timeit np.einsum('ijk,ijk,ijk->ijk', arr_3D, arr_3D, arr_3D)
1 loops, best of 3: 694 ms per loop

Outer product::

np.all(np.outer(arr_1D,arr_1D)==np.einsum('i,k->ik',arr_1D,arr_1D))
True

%timeit np.outer(arr_1D, arr_1D)
1000 loops, best of 3: 411 us per loop

%timeit np.einsum('i,k->ik', arr_1D, arr_1D)
1000 loops, best of 3: 245 us per loop

以上所有的速度都是np.einsum

np.allclose(np.sum(arr_2D*arr_3D),np.einsum('ij,oij->',arr_2D,arr_3D))
True

%timeit np.sum(arr_2D*arr_3D)
1 loops, best of 3: 813 ms per loop

%timeit np.einsum('ij,oij->', arr_2D, arr_3D)
10 loops, best of 3: 85.1 ms per loop

DGEMM的完整性案例:

np.allclose(np.dot(arr_2D,arr_2D),np.einsum('ij,jk',arr_2D,arr_2D))
True

%timeit np.einsum('ij,jk',arr_2D,arr_2D)
10 loops, best of 3: 56.1 ms per loop

%timeit np.dot(arr_2D,arr_2D)
100 loops, best of 3: 5.17 ms per loop
提问于
用户回答回答于

为了执行测试,创建了一个新的python2.7安装-numpy 1.7和1.8是用icc在运行Ubuntu的AMD Opteron核心上使用标准选项。

这是在1.8升级之前和之后运行的测试:

import numpy as np
import timeit

arr_1D=np.arange(5000,dtype=np.double)
arr_2D=np.arange(500**2,dtype=np.double).reshape(500,500)
arr_3D=np.arange(500**3,dtype=np.double).reshape(500,500,500)

print 'Summation test:'
print timeit.timeit('np.sum(arr_3D)',
                      'import numpy as np; from __main__ import arr_1D, arr_2D, arr_3D',
                      number=5)/5
print timeit.timeit('np.einsum("ijk->", arr_3D)',
                      'import numpy as np; from __main__ import arr_1D, arr_2D, arr_3D',
                      number=5)/5
print '----------------------\n'


print 'Power test:'
print timeit.timeit('arr_3D*arr_3D*arr_3D',
                      'import numpy as np; from __main__ import arr_1D, arr_2D, arr_3D',
                      number=5)/5
print timeit.timeit('np.einsum("ijk,ijk,ijk->ijk", arr_3D, arr_3D, arr_3D)',
                      'import numpy as np; from __main__ import arr_1D, arr_2D, arr_3D',
                      number=5)/5
print '----------------------\n'


print 'Outer test:'
print timeit.timeit('np.outer(arr_1D, arr_1D)',
                      'import numpy as np; from __main__ import arr_1D, arr_2D, arr_3D',
                      number=5)/5
print timeit.timeit('np.einsum("i,k->ik", arr_1D, arr_1D)',
                      'import numpy as np; from __main__ import arr_1D, arr_2D, arr_3D',
                      number=5)/5
print '----------------------\n'


print 'Einsum test:'
print timeit.timeit('np.sum(arr_2D*arr_3D)',
                      'import numpy as np; from __main__ import arr_1D, arr_2D, arr_3D',
                      number=5)/5
print timeit.timeit('np.einsum("ij,oij->", arr_2D, arr_3D)',
                      'import numpy as np; from __main__ import arr_1D, arr_2D, arr_3D',
                      number=5)/5
print '----------------------\n'

Numpy 1.7.1:

Summation test:
0.172988510132
0.0934836149216
----------------------

Power test:
1.93524689674
0.839519000053
----------------------

Outer test:
0.130380821228
0.121401786804
----------------------

Einsum test:
0.979052495956
0.126066613197

Numpy 1.8:

Summation test:
0.116551589966
0.0920487880707
----------------------

Power test:
1.23683619499
0.815982818604
----------------------

Outer test:
0.131808176041
0.127472200394
----------------------

Einsum test:
0.781750011444
0.129271841049

用户回答回答于

例如,sum在检查输入的类型和使用适当的累加器时要更加小心。例如,考虑以下几点:

In [1]: x = 255 * np.ones(100, dtype=np.uint8)

In [2]: x
Out[2]:
array([255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
       255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
       255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
       255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
       255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
       255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
       255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
       255, 255, 255, 255, 255, 255, 255, 255, 255], dtype=uint8)

注意,sum是正确的:

In [3]: x.sum()
Out[3]: 25500

einsum会给出错误的结果:

In [4]: np.einsum('i->', x)
Out[4]: 156

但是如果我们用一个不那么有限的dtype,我们仍然会得到你期望的结果:

In [5]: y = 255 * np.ones(100)

In [6]: np.einsum('i->', y)
Out[6]: 25500.0

扫码关注云+社区