我想在共享内存中使用numpy数组,以便与多处理模块一起使用。困难之处在于将其用作numpy数组,而不仅仅是ctypes数组。
from multiprocessing import Process, Array
import scipy
def f(a):
a[0] = -a[0]
if __name__ == '__main__':
# Create the array
N = int(10)
unshared_arr = scipy.rand(N)
arr = Array('d', unshared_arr)
print "Originally, the first two elements of arr = %s"%(arr[:2])
# Create, start, and finish the child processes
p = Process(target=f, args=(arr,))
p.start()
p.join()
# Printing out the changed values
print "Now, the first two elements of arr = %s"%arr[:2]
这会产生如下输出:
Originally, the first two elements of arr = [0.3518653236697369, 0.517794725524976]
Now, the first two elements of arr = [-0.3518653236697369, 0.517794725524976]
可以以ctype方式访问数组,例如arr[i]
是有意义的。但是,它不是numpy数组,并且我不能执行诸如-1*arr
或arr.sum()
之类的操作。我认为一个解决方案是将ctypes数组转换为numpy数组。然而(除了不能让它工作之外),我不相信它会再被分享。
似乎有一个标准的解决方案来解决这个普遍存在的问题。
发布于 2011-10-27 04:36:17
添加到@unutbu(不再提供)和@Henry Gomersall的答案中。您可以在需要时使用shared_arr.get_lock()
同步访问:
shared_arr = mp.Array(ctypes.c_double, N)
# ...
def f(i): # could be anything numpy accepts as an index such another numpy array
with shared_arr.get_lock(): # synchronize access
arr = np.frombuffer(shared_arr.get_obj()) # no data copying
arr[i] = -arr[i]
示例
import ctypes
import logging
import multiprocessing as mp
from contextlib import closing
import numpy as np
info = mp.get_logger().info
def main():
logger = mp.log_to_stderr()
logger.setLevel(logging.INFO)
# create shared array
N, M = 100, 11
shared_arr = mp.Array(ctypes.c_double, N)
arr = tonumpyarray(shared_arr)
# fill with random values
arr[:] = np.random.uniform(size=N)
arr_orig = arr.copy()
# write to arr from different processes
with closing(mp.Pool(initializer=init, initargs=(shared_arr,))) as p:
# many processes access the same slice
stop_f = N // 10
p.map_async(f, [slice(stop_f)]*M)
# many processes access different slices of the same array
assert M % 2 # odd
step = N // 10
p.map_async(g, [slice(i, i + step) for i in range(stop_f, N, step)])
p.join()
assert np.allclose(((-1)**M)*tonumpyarray(shared_arr), arr_orig)
def init(shared_arr_):
global shared_arr
shared_arr = shared_arr_ # must be inherited, not passed as an argument
def tonumpyarray(mp_arr):
return np.frombuffer(mp_arr.get_obj())
def f(i):
"""synchronized."""
with shared_arr.get_lock(): # synchronize access
g(i)
def g(i):
"""no synchronization."""
info("start %s" % (i,))
arr = tonumpyarray(shared_arr)
arr[i] = -1 * arr[i]
info("end %s" % (i,))
if __name__ == '__main__':
mp.freeze_support()
main()
如果您不需要同步访问,或者您创建了自己的锁,那么mp.Array()
是不必要的。在这种情况下,您可以使用mp.sharedctypes.RawArray
。
发布于 2011-10-27 03:26:23
Array
对象有一个与之关联的get_obj()
方法,该方法返回ctypes数组,该数组提供一个buffer接口。我认为下面的方法应该行得通。
from multiprocessing import Process, Array
import scipy
import numpy
def f(a):
a[0] = -a[0]
if __name__ == '__main__':
# Create the array
N = int(10)
unshared_arr = scipy.rand(N)
a = Array('d', unshared_arr)
print "Originally, the first two elements of arr = %s"%(a[:2])
# Create, start, and finish the child process
p = Process(target=f, args=(a,))
p.start()
p.join()
# Print out the changed values
print "Now, the first two elements of arr = %s"%a[:2]
b = numpy.frombuffer(a.get_obj())
b[0] = 10.0
print a[0]
运行时,这会打印出a
的第一个元素,现在是10.0,显示a
和b
只是同一内存中的两个视图。
为了确保它仍然是多处理器安全的,我相信您必须使用存在于Array
对象、a
及其内置锁上的acquire
和release
方法,以确保安全地访问它(尽管我不是多处理器模块的专家)。
发布于 2015-10-22 18:22:08
我已经编写了一个小的python模块,它使用POSIX共享内存在python解释器之间共享numpy数组。也许你会发现它很方便。
https://pypi.python.org/pypi/SharedArray
下面是它的工作原理:
import numpy as np
import SharedArray as sa
# Create an array in shared memory
a = sa.create("test1", 10)
# Attach it as a different array. This can be done from another
# python interpreter as long as it runs on the same computer.
b = sa.attach("test1")
# See how they are actually sharing the same memory block
a[0] = 42
print(b[0])
# Destroying a does not affect b.
del a
print(b[0])
# See how "test1" is still present in shared memory even though we
# destroyed the array a.
sa.list()
# Now destroy the array "test1" from memory.
sa.delete("test1")
# The array b is not affected, but once you destroy it then the
# data are lost.
print(b[0])
https://stackoverflow.com/questions/7894791
复制相似问题