def depend(batches[i-1]: Batch, batches[i]: Batch) -> None:
    batches[i-1][0], phony = fork(batches...
The original sample code was:
depend(batches[i-1], batches[i])
To match the figure in the paper, we change it to:
depend(batches[i], batches[i+1])
depend... The code changes accordingly to:
def depend(batches[i]: Batch, batches[i+1]: Batch) -> None:
    batches[i][0], phony = fork(batches...
The key point to note is:
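batches[i] is not fixed here: for example, after batches[0] has been computed by partitions[j], it becomes batches[0][j]. ... Therefore, in the forward computation graph, this assignment makes batches[i, j+1] depend on batches[i, j], so during the backward pass batches[i, j + 1] must be processed before batches[i, j]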