Paper notes - Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiogram...

caoqi95

Published 2019-03-28 12:02:26

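A Stanford team published a paper in Nature Medicine, "Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network". They developed a deep neural network that classifies single-lead ECG signals into 12 classes (10 arrhythmias plus sinus rhythm and noise) and compared its performance with that of cardiologists.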

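The paper on Nature may not be openly accessible yet (although I found that I could view and download it over my lab's campus network), but some information is available from this website.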

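The paper builds a deep neural network and uses single-lead ECG data from 53,549 patients to classify 12 rhythm classes. The study also constructs a large, novel ECG dataset, annotated by experts, that covers a broad range of ECG rhythm classes. The data were collected mainly by continuous monitoring with the Zio monitor, at a sampling frequency of 200 Hz.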

Model architecture

The DNN is designed to classify 10 arrhythmias plus sinus rhythm and noise, 12 output rhythm classes in total. Its structure is as follows.

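The network has 34 layers. It takes the raw ECG data (sampled at 200 Hz, that is, 200 samples per second) as input and uses no other patient- or ECG-related features. It outputs one prediction every 256 samples (every 1.28 seconds); this span is called the output interval. To keep the network tractable to optimize, it uses shortcut connections similar to those of a residual network. The network consists of 16 residual blocks with 2 convolutional layers per block. The convolutional layers have a filter width of 16 and 32*2^k filters, where k is a hyperparameter that starts at 0 and is incremented by 1 every fourth residual block. Every alternate residual block subsamples its input by a factor of two, so the eight subsampling steps give the 2^8 = 256 samples per output.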
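The block layout described above can be written down as a small schedule. This is only a sketch in plain Python, not the authors' code. It assumes a filter width of 16, 32*2^k filters, and subsampling by a factor of two in every other block, which is the reading consistent with one output per 256 samples (2^8 = 256). Which of the alternating blocks subsample is my assumption, and all names are made up for illustration.

```python
SAMPLE_RATE_HZ = 200
NUM_BLOCKS = 16
FILTER_WIDTH = 16
BASE_FILTERS = 32


def block_schedule(num_blocks=NUM_BLOCKS):
    """Return one dict per residual block with its filter count and subsample factor."""
    schedule = []
    for i in range(num_blocks):
        k = i // 4  # k starts at 0 and increases by 1 every fourth block
        # Assumption: the 2nd, 4th, 6th, ... blocks are the ones that subsample.
        subsample = 2 if i % 2 == 1 else 1
        schedule.append({
            "block": i,
            "filters": BASE_FILTERS * 2 ** k,
            "filter_width": FILTER_WIDTH,
            "subsample": subsample,
        })
    return schedule


def total_subsampling(schedule):
    """Multiply the subsample factors of all blocks."""
    factor = 1
    for block in schedule:
        factor *= block["subsample"]
    return factor


schedule = block_schedule()
samples_per_output = total_subsampling(schedule)
seconds_per_output = samples_per_output / SAMPLE_RATE_HZ

print([b["filters"] for b in schedule])
print(samples_per_output, seconds_per_output)
```

Under these assumptions the filter count goes 32, 64, 128, 256 in groups of four blocks, and the eight subsampling blocks together reduce the input by a factor of 256, which matches the 1.28-second output interval at 200 Hz.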

In general, the hyper-parameters of the network architecture and optimization algorithm were chosen via a combination of grid search and manual tuning. For the architecture, we searched primarily over the number of convolutional layers, the size and number of the convolutional filters, as well as the use of residual connections. We found the residual connections useful once the depth of the model exceeded eight layers. We also experimented with recurrent layers including long short-term memory cells [46] and bidirectional recurrence, but found no improvement in accuracy and a substantial increase in runtime; thus, we abandoned this class of models. We manually tuned the learning rate to achieve fastest convergence.

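For the architecture, the authors tuned mainly the number of convolutional layers, the size and number of the convolutional filters, and whether to use residual connections, and they adjusted the learning rate by hand for the fastest convergence. During these experiments they found that residual connections became useful once the model was deeper than 8 layers. They also tried recurrent layers, including LSTM cells and bidirectional recurrence, but accuracy did not improve and runtime increased substantially, so they abandoned recurrent models.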

Algorithm evaluation

Since the DNN outputs one class prediction every output interval, it makes a series of 23 rhythm predictions for every 30-s record. The cardiologists annotated the start and end point of each rhythm class in the record. We used this to construct a cardiologist label at every output interval by rounding the annotation to the nearest interval boundary. Therefore, model accuracy can be assessed at the level of every output interval, which we call ‘sequence-level’, or at the record level, which we call ‘set-level’.

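The DNN outputs one prediction per output interval, so it produces 23 predictions for every 30-second record (30 / 1.28 = 23.4, rounded down). The cardiologists annotated the start and end point of each rhythm class in the record. These annotations were rounded to the nearest interval boundary to construct a cardiologist label for every output interval. Model accuracy can therefore be assessed at the level of each output interval, called the "sequence level", or at the level of the whole record, called the "set level".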
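The labeling step above can be sketched as follows. This is a minimal illustration under my own assumptions: the annotation format `(start_s, end_s, label)`, the function name, and the class names `SINUS` and `AFIB` are hypothetical and are not taken from the paper.

```python
SAMPLE_RATE_HZ = 200
SAMPLES_PER_INTERVAL = 256
RECORD_SECONDS = 30

INTERVAL_SECONDS = SAMPLES_PER_INTERVAL / SAMPLE_RATE_HZ  # 1.28 s
# 6000 samples per record, 256 samples per prediction, rounded down
NUM_INTERVALS = (RECORD_SECONDS * SAMPLE_RATE_HZ) // SAMPLES_PER_INTERVAL


def interval_labels(annotations, num_intervals=NUM_INTERVALS):
    """Convert (start_s, end_s, label) annotations into one label per output interval.

    Start and end times are rounded to the nearest interval boundary.
    Intervals not covered by any annotation stay None.
    """
    labels = [None] * num_intervals
    for start_s, end_s, label in annotations:
        first = round(start_s / INTERVAL_SECONDS)
        last = min(round(end_s / INTERVAL_SECONDS), num_intervals)
        for i in range(first, last):
            labels[i] = label
    return labels


# Hypothetical annotation: sinus rhythm until 12.5 s, then atrial fibrillation.
example = [(0.0, 12.5, "SINUS"), (12.5, 30.0, "AFIB")]
labels = interval_labels(example)

print(NUM_INTERVALS)
print(labels)
```

In this example the 12.5-second transition falls between boundaries 9 and 10 (12.5 / 1.28 is about 9.77), so it is rounded to boundary 10: the first 10 intervals are labeled `SINUS` and the remaining 13 are labeled `AFIB`.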

To compare model predictions at the sequence level, the model predictions at each output interval were compared with the corresponding committee consensus labels for that same output interval. At the set level, the set of unique rhythm classes across a given ECG record that was predicted by the DNN was compared with the set of rhythm classes annotated across the record by the committee consensus. The set-level evaluation, unlike the sequence-level, does not penalize for time misalignment of a rhythm classification within a record.

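To compare predictions at the sequence level, the model prediction at each output interval is compared with the committee consensus label for the same interval. At the set level, the set of unique rhythm classes that the DNN predicts across a given ECG record is compared with the set of rhythm classes the committee annotated across that record. Unlike the sequence level, the set-level evaluation does not penalize time misalignment of a rhythm classification within a record.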
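The difference between the two levels is easy to see on a toy record. The sketch below uses plain per-interval agreement and exact set equality as simplified stand-ins, not the metrics the paper reports. The function names, class names, and the example record are hypothetical.

```python
def sequence_level_accuracy(predicted, reference):
    """Fraction of output intervals where the prediction matches the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def set_level_match(predicted, reference):
    """True when both sequences contain the same set of unique rhythm classes."""
    return set(predicted) == set(reference)


# Hypothetical 23-interval record: the model finds the right rhythms
# but places the transition three intervals late.
reference = ["SINUS"] * 10 + ["AFIB"] * 13
predicted = ["SINUS"] * 13 + ["AFIB"] * 10

print(sequence_level_accuracy(predicted, reference))  # 20 of 23 intervals agree
print(set_level_match(predicted, reference))          # True
```

The late transition costs three intervals at the sequence level, while the set-level comparison still counts the record as correct, because both label sequences contain exactly the classes `SINUS` and `AFIB`.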

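Sequence-level evaluation allows comparison against the gold standard at every output interval and so gives the most comprehensive measure of algorithm performance, which is why it is used for most of the metrics. It is also analogous to clinical use in telemetry or Holter monitor analysis, where identifying the onset and offset of rhythms is critical. Set-level evaluation is a useful abstraction: it approximates how the DNN would be applied to a single ECG record to identify which diagnoses are present in that record.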