CVPR 2020 Medical Imaging Papers (Abstracts + Code + Links)

Author: Minerva

Building on an earlier round-up article, this post compiles the abstracts, code repositories, and download links of the CVPR 2020 papers related to medical imaging. Following that round-up, the CVPR 2020 medical image processing papers can be grouped into the following categories:

  1. Segmentation
  2. Classification
  3. Synthesis & reconstruction
  4. CAD
  5. Motion & tracking
  6. Registration

Segmentation (11)

Synthetic Learning: Learn From Distributed Asynchronized Discriminator GAN Without Sharing Medical Image Data

Qi Chang, Hui Qu, Yikai Zhang, Mert Sabuncu, Chao Chen, Tong Zhang, Dimitris N. Metaxas; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 13856-13866

Abstract

In this paper, we propose a data privacy-preserving and communication efficient distributed GAN learning framework named Distributed Asynchronized Discriminator GAN (AsynDGAN). Our proposed framework aims to train a central generator that learns from distributed discriminators, and uses the generated synthetic images solely to train the segmentation model. We validate the proposed framework on the application of health entities learning problem which is known to be privacy sensitive. Our experiments show that our approach: 1) could learn the real image's distribution from multiple datasets without sharing the patient's raw data. 2) is more efficient and requires lower bandwidth than other distributed deep learning methods. 3) achieves higher performance compared to the model trained by one real dataset, and almost the same performance compared to the model trained by all real datasets. 4) has provable guarantees that the generator could learn the distributed distribution in an all important fashion thus is unbiased.

Code: https://github.com/tommy-qichang/AsynDGAN

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Chang_Synthetic_Learning_Learn_From_Distributed_Asynchronized_Discriminator_GAN_Without_Sharing_CVPR_2020_paper.pdf
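The following is a minimal, self-contained sketch of the training pattern the abstract describes: one central generator receives adversarial feedback from several per-site discriminators, and only synthetic images and gradients cross the site boundary, never raw patient images. This is an illustration under assumptions, not the authors' released code (see the GitHub link above for that); the tiny MLP architectures and the three-site setup are placeholders.

```python
# Sketch of a central generator trained against distributed, per-site discriminators.
import torch
import torch.nn as nn

class TinyG(nn.Module):                      # central generator (placeholder)
    def __init__(self, z_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 32 * 32), nn.Tanh())
    def forward(self, z):
        return self.net(z).view(-1, 1, 32, 32)

class TinyD(nn.Module):                      # per-site discriminator (placeholder)
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128),
                                 nn.LeakyReLU(0.2), nn.Linear(128, 1))
    def forward(self, x):
        return self.net(x)

G = TinyG()
sites = [TinyD() for _ in range(3)]          # three hypothetical hospitals
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_ds = [torch.optim.Adam(d.parameters(), lr=2e-4) for d in sites]
bce = nn.BCEWithLogitsLoss()

def train_step(real_batches):
    # 1) each site updates its own discriminator locally on its own real data
    for d, opt_d, real in zip(sites, opt_ds, real_batches):
        z = torch.randn(real.size(0), 64)
        fake = G(z).detach()
        loss_d = bce(d(real), torch.ones(real.size(0), 1)) + \
                 bce(d(fake), torch.zeros(real.size(0), 1))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # 2) the central generator aggregates adversarial feedback from all sites
    z = torch.randn(8, 64)
    fake = G(z)
    loss_g = sum(bce(d(fake), torch.ones(8, 1)) for d in sites)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# usage: train_step([torch.rand(8, 1, 32, 32) for _ in sites])
```

The synthetic images produced by the trained generator would then be used, as the abstract describes, to train a downstream segmentation model without ever pooling the real data.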

What Can Be Transferred: Unsupervised Domain Adaptation for Endoscopic Lesions Segmentation

Jiahua Dong, Yang Cong, Gan Sun, Bineng Zhong, Xiaowei Xu; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4023-4032

Abstract

Unsupervised domain adaptation has attracted growing research attention on semantic segmentation. However, 1) most existing models cannot be directly applied into lesions transfer of medical images, due to the diverse appearances of same lesion among different datasets; 2) equal attention has been paid into all semantic representations instead of neglecting irrelevant knowledge, which leads to negative transfer of untransferable knowledge. To address these challenges, we develop a new unsupervised semantic transfer model including two complementary modules (i.e., T_D and T_F) for endoscopic lesions segmentation, which can alternatively determine where and how to explore transferable domain-invariant knowledge between labeled source lesions dataset (e.g., gastroscope) and unlabeled target diseases dataset (e.g., enteroscopy). Specifically, T_D focuses on where to translate transferable visual information of medical lesions via residual transferability-aware bottleneck, while neglecting untransferable visual characterizations. Furthermore, T_F highlights how to augment transferable semantic features of various lesions and automatically ignore untransferable representations, which explores domain-invariant knowledge and in return improves the performance of T_D. To the end, theoretical analysis and extensive experiments on medical endoscopic dataset and several non-medical public datasets well demonstrate the superiority of our proposed model.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Dong_What_Can_Be_Transferred_Unsupervised_Domain_Adaptation_for_Endoscopic_Lesions_CVPR_2020_paper.pdf

Organ at Risk Segmentation for Head and Neck Cancer Using Stratified Learning and Neural Architecture Search

Dazhou Guo, Dakai Jin, Zhuotun Zhu, Tsung-Ying Ho, Adam P. Harrison, Chun-Hung Chao, Jing Xiao, Le Lu; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4223-4232

Abstract

OAR segmentation is a critical step in radiotherapy of head and neck (H&N) cancer, where inconsistencies across radiation oncologists and prohibitive labor costs motivate automated approaches. However, leading methods using standard fully convolutional network workflows are challenged when the number of OARs becomes large, e.g. > 40. For such scenarios, insights can be gained from the stratification approaches seen in manual clinical OAR delineation. This is the goal of our work, where we introduce stratified organ at risk segmentation (SOARS), an approach that stratifies OARs into anchor, mid-level, and small & hard (S&H) categories. SOARS stratifies across two dimensions. The first dimension is that distinct processing pipelines are used for each OAR category. In particular, inspired by clinical practices, anchor OARs are used to guide the mid-level and S&H categories. The second dimension is that distinct network architectures are used to manage the significant contrast, size, and anatomy variations between different OARs. We use differentiable neural architecture search (NAS), allowing the network to choose among 2D, 3D or Pseudo-3D convolutions. Extensive 4-fold cross-validation on 142 H&N cancer patients with 42 manually labeled OARs, the most comprehensive OAR dataset to date, demonstrates that both pipeline- and NAS-stratification significantly improve quantitative performance over the state-of-the-art (from 69.52% to 73.68% in absolute Dice scores). Thus, SOARS provides a powerful and principled means to manage the highly complex segmentation space of OARs.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Guo_Organ_at_Risk_Segmentation_for_Head_and_Neck_Cancer_Using_CVPR_2020_paper.pdf
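Since the abstract only names the NAS ingredient at a high level, here is a generic sketch of the differentiable mechanism it refers to: a "mixed" operation whose output is a softmax-weighted combination of 2D, 3D, and pseudo-3D convolutions, with learnable architecture weights. This illustrates the general DARTS-style idea under my own assumptions; channel sizes and the exact candidate set are placeholders, not SOARS' actual search space.

```python
# Mixed convolution for differentiable NAS over {2D, 3D, pseudo-3D} candidates.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedConv3D(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv3d(ch, ch, kernel_size=(1, 3, 3), padding=(0, 1, 1)),    # "2D" conv on slices
            nn.Conv3d(ch, ch, kernel_size=3, padding=1),                    # full 3D conv
            nn.Sequential(nn.Conv3d(ch, ch, (1, 3, 3), padding=(0, 1, 1)),  # pseudo-3D: in-plane
                          nn.Conv3d(ch, ch, (3, 1, 1), padding=(1, 0, 0))), # then through-plane
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))               # architecture weights
    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))
```

In a typical bilevel search, `alpha` would be updated on validation batches while the convolution weights are updated on training batches; after the search, the candidate with the largest weight is kept for deployment.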

Instance Segmentation of Biological Images Using Harmonic Embeddings

Victor Kulikov, Victor Lempitsky; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 3843-3851

Abstract

We present a new instance segmentation approach tailored to biological images, where instances may correspond to individual cells, organisms or plant parts. Unlike instance segmentation for user photographs or road scenes, in biological data object instances may be particularly densely packed, the appearance variation may be particularly low, the processing power may be restricted, while, on the other hand, the variability of sizes of individual instances may be limited. The proposed approach successfully addresses these peculiarities. Our approach describes each object instance using an expectation of a limited number of sine waves with frequencies and phases adjusted to particular object sizes and densities. At train time, a fully-convolutional network is learned to predict the object embeddings at each pixel using a simple pixelwise regression loss, while at test time the instances are recovered using clustering in the embedding space. In the experiments, we show that our approach outperforms previous embedding-based instance segmentation approaches on a number of biological datasets, achieving state-of-the-art on a popular CVPPP benchmark. This excellent performance is combined with computational efficiency that is needed for deployment to domain specialists.

Code: https://github.com/kulikovv/harmonic

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Kulikov_Instance_Segmentation_of_Biological_Images_Using_Harmonic_Embeddings_CVPR_2020_paper.pdf
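A rough sketch of the "embed pixels with a small bank of sine waves, then cluster" idea follows. The frequencies, the regression target, and the clustering choice here are illustrative assumptions, not the paper's exact formulation; see the repository above for the real implementation.

```python
# Build per-pixel sine/cosine embeddings and group foreground pixels by clustering.
import numpy as np
from sklearn.cluster import MeanShift

def harmonic_embedding(h, w, freqs):
    """Per-pixel embedding built from sine/cosine waves at fixed 2D frequencies."""
    ys, xs = np.mgrid[0:h, 0:w] / max(h, w)          # normalized pixel coordinates
    feats = []
    for fy, fx in freqs:
        phase = 2 * np.pi * (fy * ys + fx * xs)
        feats += [np.sin(phase), np.cos(phase)]
    return np.stack(feats, axis=-1)                   # (h, w, 2 * len(freqs))

# At train time a CNN would regress, for every pixel, the embedding averaged over
# its instance (a pixelwise L2 loss); at test time foreground pixels are grouped
# by clustering in embedding space, e.g. with mean shift:
emb = harmonic_embedding(64, 64, freqs=[(1, 0), (0, 1), (2, 3)])
fg_mask = np.zeros((64, 64), bool); fg_mask[10:30, 10:30] = True    # toy foreground
labels = MeanShift(bandwidth=0.5).fit_predict(emb[fg_mask])
```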

Structure Boundary Preserving Segmentation for Medical Image With Ambiguous Boundary

Hong Joo Lee, Jung Uk Kim, Sangmin Lee, Hak Gu Kim, Yong Man Ro; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4817-4826

Abstract

In this paper, we propose a novel image segmentation method to tackle two critical problems of medical image, which are (i) ambiguity of structure boundary in the medical image domain and (ii) uncertainty of the segmented region without specialized domain knowledge. To solve those two problems in automatic medical segmentation, we propose a novel structure boundary preserving segmentation framework. To this end, the boundary key point selection algorithm is proposed. In the proposed algorithm, the key points on the structural boundary of the target object are estimated. Then, a boundary preserving block (BPB) with the boundary key point map is applied for predicting the structure boundary of the target object. Further, for embedding experts' knowledge in the fully automatic segmentation, we propose a novel shape boundary-aware evaluator (SBE) with the ground-truth structure information indicated by experts. The proposed SBE could give feedback to the segmentation network based on the structure boundary key point. The proposed method is general and flexible enough to be built on top of any deep learning-based segmentation network. We demonstrate that the proposed method could surpass the state-of-the-art segmentation network and improve the accuracy of three different segmentation network models on different types of medical image datasets.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Lee_Structure_Boundary_Preserving_Segmentation_for_Medical_Image_With_Ambiguous_Boundary_CVPR_2020_paper.pdf

Iteratively-Refined Interactive 3D Medical Image Segmentation With Multi-Agent Reinforcement Learning

Xuan Liao, Wenhao Li, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Yanfeng Wang, Ya Zhang; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 9394-9402

Abstract

Existing automatic 3D image segmentation methods usually fail to meet the clinic use. Many studies have explored an interactive strategy to improve the image segmentation performance by iteratively incorporating user hints. However, the dynamic process for successive interactions is largely ignored. We here propose to model the dynamic process of iterative interactive image segmentation as a Markov decision process (MDP) and solve it with reinforcement learning (RL). Unfortunately, it is intractable to use single-agent RL for voxel-wise prediction due to the large exploration space. To reduce the exploration space to a tractable size, we treat each voxel as an agent with a shared voxel-level behavior strategy so that it can be solved with multi-agent reinforcement learning. An additional advantage of this multi-agent model is to capture the dependency among voxels for segmentation task. Meanwhile, to enrich the information of previous segmentations, we reserve the prediction uncertainty in the state space of MDP and derive an adjustment action space leading to a more precise and finer segmentation. In addition, to improve the efficiency of exploration, we design a relative cross-entropy gain-based reward to update the policy in a constrained direction. Experimental results on various medical datasets have shown that our method significantly outperforms existing state-of-the-art methods, with the advantage of less interactions and a faster convergence.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Liao_Iteratively-Refined_Interactive_3D_Medical_Image_Segmentation_With_Multi-Agent_Reinforcement_Learning_CVPR_2020_paper.pdf
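The "relative cross-entropy gain-based reward" named in the abstract can be read as rewarding each voxel-agent by how much its refined probability reduces the cross-entropy against the ground truth relative to the previous segmentation. The snippet below is a minimal illustration of that reading, with toy tensors; it is an interpretation, not the authors' exact reward.

```python
# Per-voxel reward: cross-entropy before refinement minus cross-entropy after refinement.
import torch

def relative_ce_gain_reward(prev_prob, new_prob, gt, eps=1e-6):
    """prev_prob, new_prob: predicted foreground probabilities (D, H, W); gt: {0,1} labels."""
    ce_prev = -(gt * torch.log(prev_prob + eps) + (1 - gt) * torch.log(1 - prev_prob + eps))
    ce_new  = -(gt * torch.log(new_prob + eps) + (1 - gt) * torch.log(1 - new_prob + eps))
    return ce_prev - ce_new          # positive wherever the interaction step improved the voxel

# toy usage
prev = torch.full((8, 16, 16), 0.4)
new = torch.full((8, 16, 16), 0.7)
gt = torch.ones(8, 16, 16)
print(relative_ce_gain_reward(prev, new, gt).mean())   # > 0: the refinement helped
```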

Unsupervised Instance Segmentation in Microscopy Images via Panoptic Domain Adaptation and Task Re-Weighting

Dongnan Liu, Donghao Zhang, Yang Song, Fan Zhang, Lauren O'Donnell, Heng Huang, Mei Chen, Weidong Cai; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4243-4252

Abstract

Unsupervised domain adaptation (UDA) for nuclei instance segmentation is important for digital pathology, as it alleviates the burden of labor-intensive annotation and domain shift across datasets. In this work, we propose a Cycle Consistency Panoptic Domain Adaptive Mask R-CNN (CyC-PDAM) architecture for unsupervised nuclei segmentation in histopathology images, by learning from fluorescence microscopy images. More specifically, we first propose a nuclei inpainting mechanism to remove the auxiliary generated objects in the synthesized images. Secondly, a semantic branch with a domain discriminator is designed to achieve panoptic-level domain adaptation. Thirdly, in order to avoid the influence of the source-biased features, we propose a task re-weighting mechanism to dynamically add trade-off weights for the task-specific loss functions. Experimental results on three datasets indicate that our proposed method outperforms state-of-the-art UDA methods significantly, and demonstrates a similar performance as fully supervised methods.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Liu_Unsupervised_Instance_Segmentation_in_Microscopy_Images_via_Panoptic_Domain_Adaptation_CVPR_2020_paper.pdf

Deep Distance Transform for Tubular Structure Segmentation in CT Scans

Yan Wang, Xu Wei, Fengze Liu, Jieneng Chen, Yuyin Zhou, Wei Shen, Elliot K. Fishman, Alan L. Yuille; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 3833-3842

Abstract

Tubular structure segmentation in medical images, e.g., segmenting vessels in CT scans, serves as a vital step in the use of computers to aid in screening early stages of related diseases. But automatic tubular structure segmentation in CT scans is a challenging problem, due to issues such as poor contrast, noise and complicated background. A tubular structure usually has a cylinder-like shape which can be well represented by its skeleton and cross-sectional radii (scales). Inspired by this, we propose a geometry-aware tubular structure segmentation method, Deep Distance Transform (DDT), which combines intuitions from the classical distance transform for skeletonization and modern deep segmentation networks. DDT first learns a multi-task network to predict a segmentation mask for a tubular structure and a distance map. Each value in the map represents the distance from each tubular structure voxel to the tubular structure surface. Then the segmentation mask is refined by leveraging the shape prior reconstructed from the distance map. We apply our DDT on six medical image datasets. Results show that (1) DDT can boost tubular structure segmentation performance significantly (e.g., over 13% DSC improvement for pancreatic duct segmentation), and (2) DDT additionally provides a geometrical measurement for a tubular structure, which is important for clinical diagnosis (e.g., the cross-sectional scale of a pancreatic duct can be an indicator for pancreatic cancer).

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_Deep_Distance_Transform_for_Tubular_Structure_Segmentation_in_CT_Scans_CVPR_2020_paper.pdf
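A hedged sketch of the two-headed setup the abstract describes: one network predicts a segmentation mask and a distance map giving, for each foreground voxel, its distance to the tubular surface. The target construction and the plain L1 regression term below are simplifications of mine, not the paper's exact losses.

```python
# Build (mask, distance-map) targets and combine a segmentation loss with a distance loss.
import torch
import torch.nn.functional as F
from scipy.ndimage import distance_transform_edt

def make_targets(mask_np):
    """mask_np: binary numpy volume -> (mask, distance-to-surface) training targets."""
    dist = distance_transform_edt(mask_np)             # distance of each foreground voxel to background
    return torch.from_numpy(mask_np).float(), torch.from_numpy(dist).float()

def ddt_loss(seg_logits, dist_pred, mask, dist, w=1.0):
    seg_loss  = F.binary_cross_entropy_with_logits(seg_logits, mask)
    dist_loss = F.l1_loss(dist_pred * mask, dist * mask)   # supervise distances on foreground only
    return seg_loss + w * dist_loss
```

At inference, the predicted distance map both supplies the cross-sectional scale measurement mentioned in the abstract and serves as a shape prior for refining the mask.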

LT-Net: Label Transfer by Learning Reversible Voxel-Wise Correspondence for One-Shot Medical Image Segmentation

Shuxin Wang, Shilei Cao, Dong Wei, Renzhen Wang, Kai Ma, Liansheng Wang, Deyu Meng, Yefeng Zheng; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 9162-9171

Abstract

We introduce a one-shot segmentation method to alleviate the burden of manual annotation for medical images. The main idea is to treat one-shot segmentation as a classical atlas-based segmentation problem, where voxel-wise correspondence from the atlas to the unlabelled data is learned. Subsequently, segmentation label of the atlas can be transferred to the unlabelled data with the learned correspondence. However, since ground truth correspondence between images is usually unavailable, the learning system must be well-supervised to avoid mode collapse and convergence failure. To overcome this difficulty, we resort to the forward-backward consistency, which is widely used in correspondence problems, and additionally learn the backward correspondences from the warped atlases back to the original atlas. This cycle-correspondence learning design enables a variety of extra, cycle-consistency-based supervision signals to make the training process stable, while also boosting the performance. We demonstrate the superiority of our method over both deep learning-based one-shot segmentation methods and a classical multi-atlas segmentation method via thorough experiments.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_LT-Net_Label_Transfer_by_Learning_Reversible_Voxel-Wise_Correspondence_for_One-Shot_CVPR_2020_paper.pdf
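The forward-backward (cycle) consistency mentioned in the abstract can be illustrated in 2D: the atlas warped to the target with the forward correspondence and then warped back with the backward correspondence should match the original atlas. The warp below is a plain `grid_sample` wrapper and the loss is a simple L1 term; the actual LT-Net works on 3D volumes and adds further supervision signals.

```python
# Image-level cycle-consistency between forward and backward displacement fields (2D sketch).
import torch
import torch.nn.functional as F

def warp(img, flow):
    """img: (N,C,H,W); flow: (N,2,H,W) displacements in pixels (x, y) -> warped image."""
    n, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().unsqueeze(0)        # (1, 2, H, W)
    coords = base + flow
    grid = torch.stack((2 * coords[:, 0] / (w - 1) - 1,             # normalize to [-1, 1]
                        2 * coords[:, 1] / (h - 1) - 1), dim=-1)     # (N, H, W, 2)
    return F.grid_sample(img, grid, align_corners=True)

def cycle_consistency_loss(atlas, fwd_flow, bwd_flow):
    warped = warp(atlas, fwd_flow)        # atlas -> unlabelled image space
    cycled = warp(warped, bwd_flow)       # ... and back to atlas space
    return F.l1_loss(cycled, atlas)
```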

CPR-GCN: Conditional Partial-Residual Graph Convolutional Network in Automated Anatomical Labeling of Coronary Arteries

Han Yang, Xingjian Zhen, Ying Chi, Lei Zhang, Xian-Sheng Hua; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 3803-3811

Abstract

Automated anatomical labeling plays a vital role in coronary artery disease diagnosing procedure. The main challenge in this problem is the large individual variability inherited in human anatomy. Existing methods usually rely on the position information and the prior knowledge of the topology of the coronary artery tree, which may lead to unsatisfactory performance when the main branches are confusing. Motivated by the wide application of the graph neural network in structured data, in this paper, we propose a conditional partial-residual graph convolutional network (CPR-GCN), which takes both position and CT image into consideration, since CT image contains abundant information such as branch size and spanning direction. Two majority parts, a Partial-Residual GCN and a conditions extractor, are included in CPR-GCN. The conditions extractor is a hybrid model containing the 3D CNN and the LSTM, which can extract 3D spatial image features along the branches. On the technical side, the Partial-Residual GCN takes the position features of the branches, with the 3D spatial image features as conditions, to predict the label for each branch. While on the mathematical side, our approach twists the partial differential equation (PDE) into the graph modeling. A dataset with 511 subjects is collected from the clinic and annotated by two experts with a two-phase annotation process. According to the five-fold cross-validation, our CPR-GCN yields 95.8% meanRecall, 95.4% meanPrecision and 0.955 meanF1, which outperforms state-of-the-art approaches.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_CPR-GCN_Conditional_Partial-Residual_Graph_Convolutional_Network_in_Automated_Anatomical_Labeling_CVPR_2020_paper.pdf

C2FNAS: Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation

Qihang Yu, Dong Yang, Holger Roth, Yutong Bai, Yixiao Zhang, Alan L. Yuille, Daguang Xu; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4126-4135

Abstract

3D convolution neural networks (CNN) have been proved very successful in parsing organs or tumours in 3D medical images, but it remains sophisticated and time-consuming to choose or design proper 3D networks given different task contexts. Recently, Neural Architecture Search (NAS) is proposed to solve this problem by searching for the best network architecture automatically. However, the inconsistency between search stage and deployment stage often exists in NAS algorithms due to memory constraints and large search space, which could become more serious when applying NAS to some memory and time-consuming tasks, such as 3D medical image segmentation. In this paper, we propose a coarse-to-fine neural architecture search (C2FNAS) to automatically search a 3D segmentation network from scratch without inconsistency on network size or input size. Specifically, we divide the search procedure into two stages: 1) the coarse stage, where we search the macro-level topology of the network, i.e. how each convolution module is connected to other modules; 2) the fine stage, where we search at micro-level for operations in each cell based on previously searched macro-level topology. The coarse-to-fine manner divides the search procedure into two consecutive stages and meanwhile resolves the inconsistency. We evaluate our method on 10 public datasets from the Medical Segmentation Decathlon (MSD) challenge, and achieve state-of-the-art performance with the network searched using one dataset, which demonstrates the effectiveness and generalization of our searched models.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Yu_C2FNAS_Coarse-to-Fine_Neural_Architecture_Search_for_3D_Medical_Image_Segmentation_CVPR_2020_paper.pdf

Classification (4)

Multi-scale Domain-adversarial Multiple-instance CNN for Cancer Subtype Classification with Unannotated Histopathological Images

Noriaki Hashimoto, Daisuke Fukushima, Ryoichi Koga, Yusuke Takagi, Kaho Ko, Kei Kohno, Masato Nakaguro, Shigeo Nakamura, Hidekata Hontani, Ichiro Takeuchi; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 3852-3861

Abstract

We propose a new method for cancer subtype classification from histopathological images, which can automatically detect tumor-specific features in a given whole slide image (WSI). The cancer subtype should be classified by referring to a WSI, i.e., a large-sized image (typically 40,000x40,000 pixels) of an entire pathological tissue slide, which consists of cancer and non-cancer portions. One difficulty arises from the high cost associated with annotating tumor regions in WSIs. Furthermore, both global and local image features must be extracted from the WSI by changing the magnifications of the image. In addition, the image features should be stably detected against the differences of staining conditions among the hospitals/specimens. In this paper, we develop a new CNN-based cancer subtype classification method by effectively combining multiple-instance, domain adversarial, and multi-scale learning frameworks in order to overcome these practical difficulties. When the proposed method was applied to malignant lymphoma subtype classifications of 196 cases collected from multiple hospitals, the classification performance was significantly better than the standard CNN or other conventional methods, and the accuracy compared favorably with that of standard pathologists.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Hashimoto_Multi-scale_Domain-adversarial_Multiple-instance_CNN_for_Cancer_Subtype_Classification_with_Unannotated_CVPR_2020_paper.pdf
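Of the three ingredients the abstract combines, the domain-adversarial one is commonly implemented with a gradient reversal layer: the domain (hospital/stain) classifier is trained normally, while the reversed gradient pushes the feature extractor toward domain-invariant features. Below is a generic sketch of that layer, not the paper's code; whether the authors use exactly this mechanism is an assumption.

```python
# Gradient reversal layer for domain-adversarial feature learning.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)                      # identity in the forward pass
    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None      # reversed (and scaled) gradient

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# usage inside a model: domain_logits = domain_classifier(grad_reverse(features))
```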

SOS: Selective Objective Switch for Rapid Immunofluorescence Whole Slide Image Classification

Sam Maksoud, Kun Zhao, Peter Hobson, Anthony Jennings, Brian C. Lovell; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 3862-3871

Abstract

The difficulty of processing gigapixel whole slide images (WSIs) in clinical microscopy has been a long-standing barrier to implementing computer aided diagnostic systems. Since modern computing resources are unable to perform computations at this extremely large scale, current state of the art methods utilize patch-based processing to preserve the resolution of WSIs. However, these methods are often resource intensive and make significant compromises on processing time. In this paper, we demonstrate that conventional patch-based processing is redundant for certain WSI classification tasks where high resolution is only required in a minority of cases. This reflects what is observed in clinical practice; where a pathologist may screen slides using a low power objective and only switch to a high power in cases where they are uncertain about their findings. To eliminate these redundancies, we propose a method for the selective use of high resolution processing based on the confidence of predictions on downscaled WSIs --- we call this the Selective Objective Switch (SOS). Our method is validated on a novel dataset of 684 Liver-Kidney-Stomach immunofluorescence WSIs routinely used in the investigation of autoimmune liver disease. By limiting high resolution processing to cases which cannot be classified confidently at low resolution, we maintain the accuracy of patch-level analysis whilst reducing the inference time by a factor of 7.74.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Maksoud_SOS_Selective_Objective_Switch_for_Rapid_Immunofluorescence_Whole_Slide_Image_CVPR_2020_paper.pdf
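The switching behaviour at inference time can be sketched as: classify the downscaled WSI first, and only fall back to expensive patch-level processing when the low-resolution prediction is uncertain. Model names, the confidence threshold, and the patch-aggregation rule below are placeholders I chose for illustration.

```python
# Confidence-gated switch between a low-resolution model and patch-level processing.
import torch

@torch.no_grad()
def classify_wsi(low_res_model, patch_model, wsi_lowres, wsi_patches, tau=0.9):
    probs = torch.softmax(low_res_model(wsi_lowres), dim=1)     # (1, num_classes)
    conf, pred = probs.max(dim=1)
    if conf.item() >= tau:                                       # confident: stop at low power
        return pred.item()
    # uncertain: switch to the high-resolution objective (patch-level analysis)
    patch_logits = torch.stack([patch_model(p) for p in wsi_patches]).mean(dim=0)
    return patch_logits.argmax(dim=1).item()
```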

ADINet: Attribute Driven Incremental Network for Retinal Image Classification

Qier Meng, Satoh Shin'ichi; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4033-4042

Abstract

Retinal diseases encompass a variety of types, including different diseases and severity levels. Training a model with different types of disease is impractical. Dynamically training a model is necessary when a patient with a new disease appears. Deep learning techniques have stood out in recent years, but they suffer from catastrophic forgetting, i.e., a dramatic decrease in performance when new training classes appear. We found that keeping the feature distribution of an old model helps maintain the performance of incremental learning. In this paper, we design a framework named "Attribute Driven Incremental Network" (ADINet), a new architecture that integrates class label prediction and attribute prediction into an incremental learning framework to boost the classification performance. With image-level classification, we apply knowledge distillation (KD) to retain the knowledge of base classes. With attribute prediction, we calculate the weight of each attribute of an image and use these weights for more precise attribute prediction. We designed attribute distillation (AD) loss to retain the information of base class attributes as new classes appear. This incremental learning can be performed multiple times with a moderate drop in performance. The results of an experiment on our private retinal fundus image dataset demonstrate that our proposed method outperforms existing state-of-the-art methods. For demonstrating the generalization of our proposed method, we test it on the ImageNet-150K-sub dataset and show good performance.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Meng_ADINet_Attribute_Driven_Incremental_Network_for_Retinal_Image_Classification_CVPR_2020_paper.pdf
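An illustrative combination of the two distillation terms the abstract mentions: standard knowledge distillation on base-class logits plus an "attribute distillation" term keeping the new model's attribute predictions close to the old model's. The temperature, loss weights, and the MSE form of the attribute term are assumptions for the sketch, not the paper's values.

```python
# Classification loss + logit distillation over base classes + attribute distillation.
import torch
import torch.nn.functional as F

def adinet_style_loss(new_logits, old_logits, new_attr, old_attr,
                      labels, T=2.0, lam_kd=1.0, lam_ad=1.0):
    cls_loss = F.cross_entropy(new_logits, labels)
    n_old = old_logits.size(1)                        # distill only over the base classes
    kd_loss = F.kl_div(F.log_softmax(new_logits[:, :n_old] / T, dim=1),
                       F.softmax(old_logits / T, dim=1),
                       reduction="batchmean") * (T * T)
    ad_loss = F.mse_loss(new_attr, old_attr)          # attribute distillation term
    return cls_loss + lam_kd * kd_loss + lam_ad * ad_loss
```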

Predicting Lymph Node Metastasis Using Histopathological Images Based on Multiple Instance Learning With Deep Graph Convolution

Yu Zhao, Fan Yang, Yuqi Fang, Hailing Liu, Niyun Zhou, Jun Zhang, Jiarui Sun, Sen Yang, Bjoern Menze, Xinjuan Fan, Jianhua Yao; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4837-4846

Abstract

Multiple instance learning (MIL) is a typical weakly-supervised learning method where the label is associated with a bag of instances instead of a single instance. Despite extensive research over past years, effectively deploying MIL remains an open and challenging problem, especially when the commonly assumed standard multiple instance (SMI) assumption is not satisfied. In this paper, we propose a multiple instance learning method based on deep graph convolutional network and feature selection (FS-GCN-MIL) for histopathological image classification. The proposed method consists of three components, including instance-level feature extraction, instance-level feature selection, and bag-level classification. We develop a self-supervised learning mechanism to train the feature extractor based on a combination model of variational autoencoder and generative adversarial network (VAE-GAN). Additionally, we propose a novel instance-level feature selection method to select the discriminative instance features. Furthermore, we employ a graph convolutional network (GCN) for learning the bag-level representation and then performing the classification. We apply the proposed method in the prediction of lymph node metastasis using histopathological images of colorectal cancer. Experimental results demonstrate that the proposed method achieves superior performance compared to the state-of-the-art methods.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Zhao_Predicting_Lymph_Node_Metastasis_Using_Histopathological_Images_Based_on_Multiple_CVPR_2020_paper.pdf

Synthesis & Reconstruction (4)

A Spatiotemporal Volumetric Interpolation Network for 4D Dynamic Medical Image

Yuyu Guo, Lei Bi, Euijoon Ahn, Dagan Feng, Qian Wang, Jinman Kim; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4726-4735

Abstract

Dynamic medical images are often limited in their application due to the large radiation doses and longer image scanning and reconstruction times. Existing methods attempt to reduce the volume samples in the dynamic sequence by interpolating the volumes between the acquired samples. However, these methods are limited to either 2D images and/or are unable to support large but periodic variations in the functional motion between the image volume samples. In this paper, we present a spatiotemporal volumetric interpolation network (SVIN) designed for 4D dynamic medical images. SVIN introduces dual networks: the first is the spatiotemporal motion network that leverages the 3D convolutional neural network (CNN) for unsupervised parametric volumetric registration to derive spatiotemporal motion field from a pair of image volumes; the second is the sequential volumetric interpolation network, which uses the derived motion field to interpolate image volumes, together with a new regression-based module to characterize the periodic motion cycles in functional organ structures. We also introduce an adaptive multi-scale architecture to capture the volumetric large anatomy motions. Experimental results demonstrated that our SVIN outperformed state-of-the-art temporal medical interpolation methods and natural video interpolation method that has been extended to support volumetric images. Code is available at [1].

Code: https://github.com/guoyu-niubility/SVIN

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Guo_A_Spatiotemporal_Volumetric_Interpolation_Network_for_4D_Dynamic_Medical_Image_CVPR_2020_paper.pdf

Augmenting Colonoscopy Using Extended and Directional CycleGAN for Lossy Image Translation

Shawn Mathew, Saad Nadeem, Sruti Kumari, Arie Kaufman; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4696-4705

Abstract

Colorectal cancer screening modalities, such as optical colonoscopy (OC) and virtual colonoscopy (VC), are critical for diagnosing and ultimately removing polyps (precursors for colon cancer). The non-invasive VC is normally used to inspect a 3D reconstructed colon (from computed tomography scans) for polyps and if found, the OC procedure is performed to physically traverse the colon via endoscope and remove these polyps. In this paper, we present a deep learning framework, Extended and Directional CycleGAN, for lossy unpaired image-to-image translation between OC and VC to augment OC video sequences with scale-consistent depth information from VC and VC with patient-specific textures, color and specular highlights from OC (e.g. for realistic polyp synthesis). Both OC and VC contain structural information, but it is obscured in OC by additional patient-specific texture and specular highlights, hence making the translation from OC to VC lossy. The existing CycleGAN approaches do not handle lossy transformations. To address this shortcoming, we introduce an extended cycle consistency loss, which compares the geometric structures from OC in the VC domain. This loss removes the need for the CycleGAN to embed OC information in the VC domain. To handle a stronger removal of the textures and lighting, a Directional Discriminator is introduced to differentiate the direction of translation (by creating paired information for the discriminator), as opposed to the standard CycleGAN which is direction-agnostic. Combining the extended cycle consistency loss and the Directional Discriminator, we show state-of-the-art results on scale-consistent depth inference for phantom, textured VC and for real polyp and normal colon video sequences. We also present results for realistic pedunculated and flat polyp synthesis from bumps introduced in 3D VC models.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Mathew_Augmenting_Colonoscopy_Using_Extended_and_Directional_CycleGAN_for_Lossy_Image_CVPR_2020_paper.pdf

SAINT: Spatially Aware Interpolation NeTwork for Medical Slice Synthesis

Cheng Peng, Wei-An Lin, Haofu Liao, Rama Chellappa, S. Kevin Zhou; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 7750-7759

Abstract

Deep learning-based single image super-resolution (SISR) methods face various challenges when applied to 3D medical volumetric data (i.e., CT and MR images) due to the high memory cost and anisotropic resolution, which adversely affect their performance. Furthermore, mainstream SISR methods are designed to work over specific upsampling factors, which makes them ineffective in clinical practice. In this paper, we introduce a Spatially Aware Interpolation NeTwork (SAINT) for medical slice synthesis to alleviate the memory constraint that volumetric data poses. Compared to other super-resolution methods, SAINT utilizes voxel spacing information to provide desirable levels of details, and allows for the upsampling factor to be determined on the fly. Our evaluations based on 853 CT scans from four datasets that contain liver, colon, hepatic vessels, and kidneys show that SAINT consistently outperforms other SISR methods in terms of medical slice synthesis quality, while using only a single model to deal with different upsampling factors.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Peng_SAINT_Spatially_Aware_Interpolation_NeTwork_for_Medical_Slice_Synthesis_CVPR_2020_paper.pdf

DuDoRNet: Learning a Dual-Domain Recurrent Network for Fast MRI Reconstruction With Deep T1 Prior

Bo Zhou, S. Kevin Zhou; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4273-4282

Abstract

MRI with multiple protocols is commonly used for diagnosis, but it suffers from a long acquisition time, which yields the image quality vulnerable to say motion artifacts. To accelerate, various methods have been proposed to reconstruct full images from under-sampled k-space data. However, these algorithms are inadequate for two main reasons. Firstly, aliasing artifacts generated in the image domain are structural and non-local, so that sole image domain restoration is insufficient. Secondly, though MRI comprises multiple protocols during one exam, almost all previous studies only employ the reconstruction of an individual protocol using a highly distorted undersampled image as input, leaving the use of fully-sampled short protocol (say T1) as complementary information highly underexplored. In this work, we address the above two limitations by proposing a Dual Domain Recurrent Network (DuDoRNet) with deep T1 prior embedded to simultaneously recover k-space and images for accelerating the acquisition of MRI with a long imaging protocol. Specifically, a Dilated Residual Dense Network (DRDNet) is customized for dual domain restorations from undersampled MRI data. Extensive experiments on different sampling patterns and acceleration rates demonstrate that our method consistently outperforms state-of-the-art methods, and can reconstruct high quality MRI.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Zhou_DuDoRNet_Learning_a_Dual-Domain_Recurrent_Network_for_Fast_MRI_Reconstruction_CVPR_2020_paper.pdf
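A hedged sketch of the dual-domain principle: restore in image space, then enforce data consistency by re-inserting the acquired k-space samples at the sampled locations. The tiny residual CNN below is a placeholder; the actual DuDoRNet additionally runs a recurrent restoration network in k-space and conditions on the fully-sampled T1 prior.

```python
# Image-domain restoration followed by a k-space data-consistency step.
import torch
import torch.nn as nn

def data_consistency(img, kspace_sampled, mask):
    """Replace predicted k-space values with the acquired ones where mask == 1."""
    k_pred = torch.fft.fft2(img)
    k_dc = torch.where(mask.bool(), kspace_sampled, k_pred)
    return torch.fft.ifft2(k_dc).real

class DualDomainBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.img_cnn = nn.Conv2d(1, 1, 3, padding=1)     # stand-in for the image-domain branch
    def forward(self, img, kspace_sampled, mask):
        img = img + self.img_cnn(img)                    # residual image restoration
        return data_consistency(img, kspace_sampled, mask)
```

Stacking several such blocks (recurrently sharing weights) gives the alternating image/k-space refinement that the abstract refers to.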

CAD (2)

Cross-View Correspondence Reasoning Based on Bipartite Graph Convolutional Network for Mammogram Mass Detection

Yuhang Liu, Fandong Zhang, Qianyi Zhang, Siwen Wang, Yizhou Wang, Yizhou Yu; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 3812-3822

Abstract

Mammogram mass detection is of great clinical significance due to its high proportion in breast cancers. The information from cross views (i.e., mediolateral oblique and cranio-caudal) is highly related and complementary, and is helpful to make comprehensive decisions. However, unlike radiologists who are able to recognize masses with reasoning ability in cross-view images, most existing methods lack the ability to reason under the guidance of domain knowledge, thus it limits the performance. In this paper, we introduce bipartite graph convolutional network to endow existing methods with cross-view reasoning ability of radiologists in mammogram mass detection. The bipartite node sets are constructed by cross-view images respectively to represent relatively consistent regions in breasts, while the bipartite edge learns to model both inherent cross-view geometric constraints and appearance similarities between correspondences. Based on the bipartite graph, the information propagates methodically through correspondences and enables spatial visual features equipped with customized cross-view reasoning ability. Experimental results on DDSM dataset demonstrate that the proposed algorithm achieves state-of-the-art performance. Besides, visual analysis shows the model has a clear physical meaning, which is helpful for radiologists in clinical interpretation.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Liu_Cross-View_Correspondence_Reasoning_Based_on_Bipartite_Graph_Convolutional_Network_for_CVPR_2020_paper.pdf

FocalMix: Semi-Supervised Learning for 3D Medical Image Detection

Dong Wang, Yuan Zhang, Kexin Zhang, Liwei Wang; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 3951-3960

Abstract

Applying artificial intelligence techniques in medical imaging is one of the most promising areas in medicine. However, most of the recent success in this area highly relies on large amounts of carefully annotated data, whereas annotating medical images is a costly process. In this paper, we propose a novel method, called FocalMix, which, to the best of our knowledge, is the first to leverage recent advances in semi-supervised learning (SSL) for 3D medical image detection. We conducted extensive experiments on two widely used datasets for lung nodule detection, LUNA16 and NLST. Results show that our proposed SSL methods can achieve a substantial improvement of up to 17.3% over state-of-the-art supervised learning approaches with 400 unlabeled CT scans.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_FocalMix_Semi-Supervised_Learning_for_3D_Medical_Image_Detection_CVPR_2020_paper.pdf
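The name suggests two ingredients often used in SSL for detection: MixUp applied to images (and to their soft anchor targets generated from pseudo-labels) plus a focal loss that accepts soft targets. The sketch below is my interpretation for illustration, not the authors' exact formulation.

```python
# MixUp of inputs/soft targets plus a focal loss that tolerates soft targets.
import torch
import torch.nn.functional as F

def mixup(x1, y1, x2, y2, alpha=0.2):
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

def soft_focal_loss(logits, soft_targets, gamma=2.0):
    p = torch.sigmoid(logits)
    # down-weight anchors whose predictions are already close to their (soft) targets
    w = (soft_targets - p).abs().pow(gamma)
    bce = F.binary_cross_entropy_with_logits(logits, soft_targets, reduction="none")
    return (w * bce).mean()
```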

Motion & Tracking (2)

MPM: Joint Representation of Motion and Position Map for Cell Tracking

Junya Hayashida, Kazuya Nishimura, Ryoma Bise; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 3823-3832

Abstract

Conventional cell tracking methods detect multiple cells in each frame (detection) and then associate the detection results in successive time-frames (association). Most cell tracking methods perform the association task independently from the detection task. However, there is no guarantee of preserving coherence between these tasks, and lack of coherence may adversely affect tracking performance. In this paper, we propose the Motion and Position Map (MPM) that jointly represents both detection and association for not only migration but also cell division. It guarantees coherence such that if a cell is detected, the corresponding motion flow can always be obtained. It is a simple but powerful method for multi-object tracking in dense environments. We compared the proposed method with current tracking methods under various conditions in real biological images and found that it outperformed the state-of-the-art (+5.2% improvement compared to the second-best).

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Hayashida_MPM_Joint_Representation_of_Motion_and_Position_Map_for_Cell_CVPR_2020_paper.pdf

FOAL: Fast Online Adaptive Learning for Cardiac Motion Estimation

Hanchao Yu, Shanhui Sun, Haichao Yu, Xiao Chen, Honghui Shi, Thomas S. Huang, Terrence Chen; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4313-4323

Abstract

Motion estimation of cardiac MRI videos is crucial for the evaluation of human heart anatomy and function. Recent researches show promising results with deep learning-based methods. In clinical deployment, however, they suffer dramatic performance drops due to mismatched distributions between training and testing datasets, commonly encountered in the clinical environment. On the other hand, it is arguably impossible to collect all representative datasets and to train a universal tracker before deployment. In this context, we proposed a novel fast online adaptive learning (FOAL) framework: an online gradient descent based optimizer that is optimized by a meta-learner. The meta-learner enables the online optimizer to perform a fast and robust adaptation. We evaluated our method through extensive experiments on two public clinical datasets. The results showed the superior performance of FOAL in accuracy compared to the offline-trained tracking method. On average, the FOAL took only 0.4 second per video for online optimization.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Yu_FOAL_Fast_Online_Adaptive_Learning_for_Cardiac_Motion_Estimation_CVPR_2020_paper.pdf
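A minimal sketch of the general idea of fast online adaptation at test time: before tracking a new video, take a few gradient steps on a self-supervised loss computed on that video alone. Note that FOAL's meta-learned optimizer is replaced here by plain SGD purely for illustration, and the loss callable is a placeholder.

```python
# Per-video online adaptation of a pretrained tracker via a few gradient steps.
import copy
import torch

def online_adapt(model, frames, self_supervised_loss, steps=3, lr=1e-4):
    adapted = copy.deepcopy(model)                  # keep the offline-trained weights intact
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):
        loss = self_supervised_loss(adapted, frames)   # e.g. similarity of warped frames
        opt.zero_grad(); loss.backward(); opt.step()
    return adapted                                   # use this adapted tracker for the current video
```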

Registration (2)

Fast Symmetric Diffeomorphic Image Registration with Convolutional Neural Networks

Tony C.W. Mok, Albert C.S. Chung; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4644-4653

Abstract

Diffeomorphic deformable image registration is crucial in many medical image studies, as it offers unique, special features including topology preservation and invertibility of the transformation. Recent deep learning-based deformable image registration methods achieve fast image registration by leveraging a convolutional neural network (CNN) to learn the spatial transformation from the synthetic ground truth or the similarity metric. However, these approaches often ignore the topology preservation of the transformation and the smoothness of the transformation which is enforced by a global smoothing energy function alone. Moreover, deep learning-based approaches often estimate the displacement field directly, which cannot guarantee the existence of the inverse transformation. In this paper, we present a novel, efficient unsupervised symmetric image registration method which maximizes the similarity between images within the space of diffeomorphic maps and estimates both forward and inverse transformations simultaneously. We evaluate our method on 3D image registration with a large scale brain image dataset. Our method achieves state-of-the-art registration accuracy and running time while maintaining desirable diffeomorphic properties.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Mok_Fast_Symmetric_Diffeomorphic_Image_Registration_with_Convolutional_Neural_Networks_CVPR_2020_paper.pdf
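A common way such networks obtain (approximately) diffeomorphic, invertible transformations is to predict a stationary velocity field and integrate it by "scaling and squaring". The 2D sketch below illustrates that general mechanism under my own simplifications; the details of the paper's symmetric formulation differ.

```python
# Integrate a stationary velocity field into a displacement field by scaling and squaring.
import torch
import torch.nn.functional as F

def compose(disp_a, disp_b):
    """Return the composition disp_a o disp_b for 2D displacement fields (N,2,H,W), in pixels."""
    n, _, h, w = disp_a.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), 0).float().unsqueeze(0)
    coords = base + disp_b
    grid = torch.stack((2 * coords[:, 0] / (w - 1) - 1,
                        2 * coords[:, 1] / (h - 1) - 1), dim=-1)
    warped_a = F.grid_sample(disp_a, grid, align_corners=True)   # sample disp_a at x + disp_b(x)
    return warped_a + disp_b

def scaling_and_squaring(velocity, steps=6):
    disp = velocity / (2 ** steps)          # start from a small displacement ...
    for _ in range(steps):
        disp = compose(disp, disp)          # ... and square it repeatedly
    return disp                              # approximately diffeomorphic displacement field
```

Negating the velocity field before integration yields the inverse transformation, which is what makes estimating forward and inverse mappings simultaneously natural in this family of methods.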

DeepFLASH: An Efficient Network for Learning-Based Medical Image Registration

Jian Wang, Miaomiao Zhang; The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4444-4452

Abstract

This paper presents DeepFLASH, a novel network with efficient training and inference for learning-based medical image registration. In contrast to existing approaches that learn spatial transformations from training data in the high dimensional imaging space, we develop a new registration network entirely in a low dimensional bandlimited space. This dramatically reduces the computational cost and memory footprint of an expensive training and inference. To achieve this goal, we first introduce complex-valued operations and representations of neural architectures that provide key components for learning-based registration models. We then construct an explicit loss function of transformation fields fully characterized in a bandlimited space with much fewer parameterizations. Experimental results show that our method is significantly faster than the state-of-the-art deep learning based image registration methods, while producing equally accurate alignment. We demonstrate our algorithm in two different applications of image registration: 2D synthetic data and 3D real brain magnetic resonance (MR) images.

Paper: http://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_DeepFLASH_An_Efficient_Network_for_Learning-Based_Medical_Image_Registration_CVPR_2020_paper.pdf

Originally published on 2020-06-19. Shared from the WeChat official account "Python编程和深度学习".
