我的系统由12000个原子(3D点)组成,我的目的是计算这些原子的数量。
为此,我有关于dict对象中每个原子的信息(如类型、电荷和各种参数)、每个原子的邻居列表(在一个距离<阈值处给出所有原子的索引),以及所有原子之间的距离矩阵。
要计算的量是一个给定原子和它的所有邻居之间的势(从邻居列表中)。它需要原子之间的距离、信息和参数以及给定原子的邻居列表。
我继续执行以下代码:
def LJ(sii,sjj,Eii,Ejj,rij):
"""
Fonction that compute the LJ potential. one of the quantities
sii : sigma for atom i ( a parameter )
sjj : sigma for atom j ( a parameter )
Eii : epsilon for atom i ( a parameter )
Ejj : epsilon for atom j ( a parameter )
rij : squared distance between atom i and j
"""
sij = (sii+sjj)*0.5
Eij = (Eii*Ejj)**0.5
return ((4*Eij*(sij**12))/(rij**6)) - ((4*Eij*(sij**6))/(rij**3))
def Coul(qi,qj,rij):
"""
Fonction that compute the Coul potential. another quantities
qi : charge of atom i ( a parameter )
qj : charge of atom j ( a parameter )
rij : squared distance between atom i and j
"""
V = 138.935458*((qi*qj)/(rij**0.5)) # because rij is squared i take the square root
return V
def energies(NL,sP,D):
"""
NL : a vector of neighbors with as index the position and as value a list of indexes of neighbors
for example [[1,2,3],[0,2]...] the first atom at position 0 as 3 neighbors 1,2 and 3. The second atom at position 1 as 2 neighbors 0 and 2 etc...
sP : a distionnary giving the parameters for a given atoms, and also neighbors distant of 4 atoms in the mist 'dihedrals'. The key correspond to the index of the distance matric and the neighboring list, but it's a string
sample : {'key': '0_NH3', 'num': 0, 'type': 'NH3', 'resnr': 0, 'residu': 'ASP', 'atom': 'N', 'charge': -0.3, 'sigma': 0.329632525712, 'epsilon': 0.8368, 'voisins': [1, 2, 3, 4, 5, 6, 12], 'dihedrals': [7, 8, 9, 13, 14]}
D : the squared distance matrix between all atoms
"""
coul,ljk,ljk14,coul14 = [],[],[],[] # I want the potential for each atom and I want 4 types of potentials
# coul and ljk are potentials calculated on neighboring atoms in the NL neighbors list
# coul14 and ljk14 are potentials calculated on neighboring atoms in the sP[key]['dihedrals'] list
for k in sP.keys(): # For each atom
# k is a string that represent an int sP['0'] as the informations for D[0][:] or D[:][0] and NL[0]
ljk.append(sum([LJ(sP[k]['sigma'],sP[str(n)]['sigma'],sP[k]['epsilon'],sP[str(n)]['epsilon'],D[int(k)][n]) for n in NL[int(k)]]))
coul.append(sum([Coul(sP[k]['charge'],sP[str(n)]['charge'],D[int(k)][n]) for n in NL[int(k)]]))
ljk14.append(sum([LJ(sP[k]['sigma'],sP[str(n)]['sigma'],sP[k]['epsilon'],sP[str(n)]['epsilon'],D[int(k)][n]) for n in sP[k]['dihedrals']]))
coul14.append(sum([Coul(sP[k]['charge'],sP[str(n)]['charge'],D[int(k)][n]) for n in sP[k]['dihedrals']]))
# because I want the sum of all potentials for each atom I use a sum on a comprehensive list of potential between a given atom and it's neighbors在我的机器上,计算12134个原子需要大约40秒的时间。我真的要尽量减少这个数目。这是我得到的最好的版本,但是我觉得可以用numpy和矢量化来做一些事情,但是我什么也没找到。
谢谢你的帮助
发布于 2022-06-22 17:40:06
有一件事应该马上就会有所帮助,那就是从使用X**0.5切换到math.sqrt(X)。这个链接,Which is faster in Python: x**.5 or math.sqrt(x)?,显示了两者在功能和速度上的差异。
此外,您还可以查看多处理模块,特别是multiprocessing.Pool(),它可以帮助您在多个CPU核上运行代码,并减少总的运行时间。https://docs.python.org/3/library/multiprocessing.html
可能改善运行时的最后一件事是将正在读取的数组/数据集中的值保存到本地变量中。我不确定python中内存访问或访问速度是如何工作的,但这可能会减少从内存读取所需的CPU周期。需要注意的是,只有当您多次使用该值时,这才会有所帮助。另外,读取任何不依赖于k的变量都应该移到for循环之外。例如:
sp_n_sigma = sP[str(n)]['sigma']
for k in sP.keys(): # For each atom
sP_k_sigma = sP[k]['sigma']
ljk.append(sum([LJ(sP_k_sigma,sp_n_sigma, ...
ljk14.append(sum([LJ(sP_k_sigma, sp_n_sigma, ...https://stackoverflow.com/questions/72719290
复制相似问题