我已经创建了一个序列比对工具来比较两条DNA链(X和Y),以找到X和Y中子字符串的最佳比对。算法总结如下(https://en.wikipedia.org/wiki/Smith–Waterman\_algorithm)。我已经能够生成一个列表列表,用零填充它们来表示我的矩阵。我创建了一个评分算法,为每种碱基之间的比对返回一个数字分数(例如,匹配时加4)。然后我创建了一个对齐算法,它应该在我的“矩阵”的每个坐标上都有一个分数。但是,当我打印矩阵时,它只返回全为零的原始结果(而不是实际得分)。
我知道还有其他方法可以实现这个方法(例如使用numpy ),那么您能告诉我为什么这个特定的代码(如下所示)不能工作吗?有没有办法修改它,让它正常工作?
代码:
def zeros(X: int, Y: int):
lenX = len(X) + 1
lenY = len(Y) + 1
matrix = []
for i in range(lenX):
matrix.append([0] * lenY)
def score(X, Y):
if X[n] == Y[m]: return 4
if X[n] == '-' or Y[m] == '-': return -4
else: return -2
def SmithWaterman(X, Y, score):
for n in range(1, len(X) + 1):
for m in range(1, len(Y) + 1):
align = matrix[n-1, m-1] + (score(X[n-1], Y[m-1]))
indelX = matrix[n-1, m] + (score(X[n-1], Y[m]))
indelY = matrix[n, m-1] + (score(X[n], Y[m-1]))
matrix[n, m] = max(align, indelX, indelY, 0)
print(matrix)
zeros("ACGT", "ACGT")输出:
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]发布于 2021-02-26 09:44:55
它只是打印出归零矩阵的原因是SmithWaterman函数不会被调用,因此矩阵永远不会更新。你需要做一些像这样的事情
# ...
SmithWaterman(X, Y, score)
print(matrix)
# ...然而,如果你这样做了,你会发现这段代码实际上在很多其他方面都被破坏了。我已经检查并注释了代码中的一些语法错误和其他问题:
def zeros(X: int, Y: int):
# ^ ^ incorrect type annotations. should be str
lenX = len(X) + 1
lenY = len(Y) + 1
matrix = []
for i in range(lenX):
matrix.append([0] * lenY)
# A more "pythonic" way of expressing the above would be:
# matrix = [[0] * len(Y) + 1 for _ in range(len(x) + 1)]
def score(X, Y):
# ^ ^ shadowing variables from outer scope. this is not a bug per se but it's considered bad practice
if X[n] == Y[m]: return 4
# ^ ^ variables not defined in scope
if X[n] == '-' or Y[m] == '-': return -4
# ^ ^ variables not defined in scope
else: return -2
def SmithWaterman(X, Y, score): # this function is never called
# ^ unnecessary function passed as parameter. function is defined in scope
for n in range(1, len(X) + 1):
for m in range(1, len(Y) + 1):
align = matrix[n-1, m-1] + (score(X[n-1], Y[m-1]))
# ^ invalid list lookup. should be: matrix[n-1][m-1]
indelX = matrix[n-1, m] + (score(X[n-1], Y[m]))
# ^ out of bounds error when m == len(Y)
indelY = matrix[n, m-1] + (score(X[n], Y[m-1]))
# ^ out of bounds error when n == len(X)
matrix[n, m] = max(align, indelX, indelY, 0)
# this should be nested in the inner for-loop. m, n, indelX, and indelY are not defined in scope here
print(matrix)
zeros("ACGT", "ACGT")https://stackoverflow.com/questions/66378093
复制相似问题