
Machine Learning arXiv Digest [12.23]

By: arXiv每日学术速递 (WeChat official account)
Published 2021-12-27 17:05:33
Included in the column: arXiv每日学术速递

cs.LG: 77 papers today.

Graph-related (graph learning | graph neural networks | graph optimization, etc.) (6 papers)

【1】 Graph augmented Deep Reinforcement Learning in the GameRLand3D environment Link: https://arxiv.org/abs/2112.11731

Authors: Edward Beeching, Maxim Peter, Philippe Marcotte, Jilles Debangoye, Olivier Simonin, Joshua Romoff, Christian Wolf
Affiliations: Ubisoft La Forge, Montreal; INRIA Chroma team, CITI Laboratory, INSA-Lyon, France; Université de Lyon, INSA-Lyon, LIRIS, CNRS, France
Abstract: We address planning and navigation in challenging 3D video games featuring maps with disconnected regions reachable by agents using special actions. In this setting, classical symbolic planners are not applicable or difficult to adapt. We introduce a hybrid technique combining a low-level policy trained with reinforcement learning and a graph-based high-level classical planner. In addition to providing human-interpretable paths, the approach improves the generalization performance of an end-to-end approach in unseen maps, where it achieves a 20% absolute increase in success rate over a recurrent end-to-end agent on a point-to-point navigation task in yet unseen large-scale maps of size 1km x 1km. In an in-depth experimental study, we quantify the limitations of end-to-end deep RL approaches in vast environments, and we also introduce "GameRLand3D", a new benchmark and soon-to-be-released environment that can generate complex procedural 3D maps for navigation tasks.
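A minimal sketch of the hybrid scheme described above: a graph-based high-level planner produces waypoints, and a low-level policy (here a trivial stand-in) drives between them. The region graph, the reached/env_step helpers, and the policy interface are all illustrative assumptions, not the paper's actual API.

```python
import networkx as nx

# Illustrative traversability graph over map regions (weights = travel cost).
G = nx.Graph()
G.add_weighted_edges_from([("A", "B", 1.0), ("B", "C", 2.5), ("A", "C", 5.0)])

class StubPolicy:                          # stand-in for the trained RL policy
    def act(self, state, subgoal):
        return subgoal                     # toy action: "move toward subgoal"

def reached(state, subgoal):               # toy success check
    return state == subgoal

def env_step(state, action):               # toy environment transition
    return action

def navigate(state, start, goal, policy):
    waypoints = nx.shortest_path(G, start, goal, weight="weight")  # high level
    for subgoal in waypoints[1:]:          # low-level policy handles each leg,
        while not reached(state, subgoal): # including any special actions
            state = env_step(state, policy.act(state, subgoal))
    return state

print(navigate("A", "A", "C", StubPolicy()))   # A -> B -> C
```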

【2】 SkipNode: On Alleviating Over-smoothing for Deep Graph Convolutional Networks Link: https://arxiv.org/abs/2112.11628

Authors: Weigang Lu, Yibing Zhan, Ziyu Guan, Liu Liu, Baosheng Yu, Wei Zhao, Yaming Yang, Dacheng Tao
Affiliations: State Key Laboratory of Integrated Services Networks, School of Computer Science and Technology, Xidian; JD Explore Academy; The University of Sydney
Notes: 12 pages, 5 figures
Abstract: Over-smoothing is a challenging problem, which degrades the performance of deep graph convolutional networks (GCNs). However, existing studies for alleviating the over-smoothing problem lack either generality or effectiveness. In this paper, we analyze the underlying issues behind the over-smoothing problem, i.e., feature-diversity degeneration, gradient vanishing, and model weights over-decaying. Inspired by this, we propose a simple yet effective plug-and-play module, SkipNode, to alleviate over-smoothing. Specifically, for each middle layer of a GCN model, SkipNode randomly (or based on node degree) selects nodes to skip the convolutional operation by directly feeding their input features to the nonlinear function. Analytically, 1) skipping the convolutional operation prevents the features from losing diversity; and 2) the "skipped" nodes enable gradients to be directly passed back, thus mitigating the gradient vanishing and model weights over-decaying issues. To demonstrate the superiority of SkipNode, we conduct extensive experiments on nine popular datasets, including both homophilic and heterophilic graphs, with different graph sizes on two typical tasks: node classification and link prediction. Specifically, 1) SkipNode has strong generalizability of being applied to various GCN-based models on different datasets and tasks; and 2) SkipNode outperforms recent state-of-the-art anti-over-smoothing plug-and-play modules, i.e., DropEdge and DropNode, in different settings. Code will be made publicly available on GitHub.
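The core mechanism lends itself to a compact sketch. Below is a minimal PyTorch rendition of the uniform-sampling variant, assuming a dense normalized adjacency and equal input/output width on the identity path; it illustrates the idea and is not the authors' released code.

```python
import torch
import torch.nn as nn

class SkipNodeGCNLayer(nn.Module):
    """A middle GCN layer with SkipNode-style node sampling (uniform variant)."""
    def __init__(self, dim, p_skip=0.3):
        super().__init__()
        self.lin = nn.Linear(dim, dim)
        self.p_skip = p_skip

    def forward(self, adj_norm, x):
        h = adj_norm @ self.lin(x)            # standard graph convolution
        if self.training:
            skip = torch.rand(x.size(0), device=x.device) < self.p_skip
            # Skipped nodes bypass the convolution: their raw input features
            # go straight to the nonlinearity, preserving feature diversity
            # and giving gradients a direct path back.
            h = torch.where(skip.unsqueeze(1), x, h)
        return torch.relu(h)

layer = SkipNodeGCNLayer(dim=16)
adj = torch.eye(5)                            # toy normalized adjacency
out = layer(adj, torch.randn(5, 16))
```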

【3】 GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design Link: https://arxiv.org/abs/2112.11594

Authors: Haoran You, Tong Geng, Yongan Zhang, Ang Li, Yingyan Lin
Affiliations: Rice University, Houston, TX; Pacific Northwest National Laboratory, Richland, WA
Notes: Published as a conference paper at HPCA 2022
Abstract: Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art graph learning model. However, it can be notoriously challenging to run inference with GCNs over large graph datasets, limiting their application to large real-world graphs and hindering the exploration of deeper and more sophisticated GCN graphs. This is because real-world graphs can be extremely large and sparse. Furthermore, the node degree of GCNs tends to follow the power-law distribution and therefore have highly irregular adjacency matrices, resulting in prohibitive inefficiencies in both data processing and movement and thus substantially limiting the achievable GCN acceleration efficiency. To this end, this paper proposes a GCN algorithm and accelerator Co-Design framework dubbed GCoD which can largely alleviate the aforementioned GCN irregularity and boost GCNs' inference efficiency. Specifically, on the algorithm level, GCoD integrates a split and conquer GCN training strategy that polarizes the graphs to be either denser or sparser in local neighborhoods without compromising the model accuracy, resulting in graph adjacency matrices that (mostly) have merely two levels of workload and enjoy largely enhanced regularity and thus ease of acceleration. On the hardware level, we further develop a dedicated two-pronged accelerator with a separated engine to process each of the aforementioned denser and sparser workloads, further boosting the overall utilization and acceleration efficiency. Extensive experiments and ablation studies validate that our GCoD consistently reduces the number of off-chip accesses, leading to speedups of 15286x, 294x, 7.8x, and 2.5x as compared to CPUs, GPUs, and prior-art GCN accelerators including HyGCN and AWB-GCN, respectively, while maintaining or even improving the task accuracy.

【4】 Deep Reinforcement Learning for Optimal Power Flow with Renewables Using Spatial-Temporal Graph Information Link: https://arxiv.org/abs/2112.11461

Authors: Jinhao Li, Ruichang Zhang, Hao Wang, Zhi Liu, Hongyang Lai, Yanru Zhang
Affiliations: School of Computer Science and Engineering, University of Electronic Science and Technology of China
Notes: 18 pages, 14 figures
Abstract: Renewable energy resources (RERs) have been increasingly integrated into modern power systems, especially in large-scale distribution networks (DNs). In this paper, we propose a deep reinforcement learning (DRL)-based approach to dynamically search for the optimal operation point, i.e., optimal power flow (OPF), in DNs with a high uptake of RERs. Considering uncertainties and voltage fluctuation issues caused by RERs, we formulate OPF into a multi-objective optimization (MOO) problem. To solve the MOO problem, we develop a novel DRL algorithm leveraging the graphical information of the distribution network. Specifically, we employ the state-of-the-art DRL algorithm, i.e., deep deterministic policy gradient (DDPG), to learn an optimal strategy for OPF. Since power flow reallocation in the DN is a consecutive process, where nodes are self-correlated and interrelated in temporal and spatial views, to make full use of DNs' graphical information, we develop a multi-grained attention-based spatial-temporal graph convolution network (MG-ASTGCN) for spatial-temporal graph information extraction, preparing for its sequential DDPG. We validate our proposed DRL-based approach in modified IEEE 33, 69, and 118-bus radial distribution systems (RDSs) and show that our DRL-based approach outperforms other benchmark algorithms. Our experimental results also reveal that MG-ASTGCN can significantly accelerate the DDPG training process and improve DDPG's capability in reallocating power flow for OPF. The proposed DRL-based approach also promotes DNs' stability in the presence of node faults, especially for large-scale DNs.

【5】 Encoding protein dynamic information in graph representation for functional residue identification Link: https://arxiv.org/abs/2112.12033

Authors: Yuan Chiang, Wei-Han Hui, Shu-Wei Chang
Affiliations: Department of Civil Engineering, National Taiwan University, Taipei, Taiwan; Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
Abstract: Recent advances in protein function prediction exploit graph-based deep learning approaches to correlate the structural and topological features of proteins with their molecular functions. However, proteins in vivo are not static but dynamic molecules that alter conformation for functional purposes. Here we apply normal mode analysis to native protein conformations and augment protein graphs by connecting edges between dynamically correlated residue pairs. In the multilabel function classification task, our method demonstrates a remarkable performance gain based on this dynamics-informed representation. The proposed graph neural network, ProDAR, increases the interpretability and generalizability of residue-level annotations and robustly reflects structural nuance in proteins. We elucidate the importance of dynamic information in graph representation by comparing class activation maps for the hMTH1, nitrophorin, and SARS-CoV-2 receptor binding domain. Our model successfully learns the dynamic fingerprints of proteins and provides molecular insights into protein functions, with vast untapped potential for broad biotechnology and pharmaceutical applications.
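The dynamics-informed augmentation can be sketched in a few lines: compute normal modes from an elastic-network-style Hessian, derive residue-residue cross-correlations, and add edges between strongly correlated pairs. The random Hessian and the 0.5 threshold below are stand-ins for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50                                       # residues
A = rng.normal(size=(n, n))
hessian = A @ A.T + n * np.eye(n)            # stand-in for an ENM Hessian

vals, vecs = np.linalg.eigh(hessian)
vals, vecs = vals[1:], vecs[:, 1:]           # drop the lowest (rigid-body-like) mode

# Residue-residue cross-correlation from the modes: sum_k u_k u_k^T / lambda_k.
cov = (vecs / vals) @ vecs.T
corr = cov / np.sqrt(np.outer(np.diag(cov), np.diag(cov)))

# Augment the protein graph with edges between dynamically correlated pairs.
dyn_edges = np.argwhere(np.triu(np.abs(corr), k=1) > 0.5)
print(len(dyn_edges), "dynamics-informed edges added")
```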

【6】 RepBin: Constraint-based Graph Representation Learning for Metagenomic Binning Link: https://arxiv.org/abs/2112.11696

Authors: Hansheng Xue, Vijini Mallawaarachchi, Yujia Zhang, Vaibhav Rajan, Yu Lin
Affiliations: School of Computing, The Australian National University, Canberra, Australia; School of Computing, National University of Singapore, Singapore
Notes: Accepted by AAAI-2022
Abstract: Mixed communities of organisms are found in many environments (from the human gut to marine ecosystems) and can have profound impact on human health and the environment. Metagenomics studies the genomic material of such communities through high-throughput sequencing that yields DNA subsequences for subsequent analysis. A fundamental problem in the standard workflow, called binning, is to discover clusters of genomic subsequences associated with the unknown constituent organisms. Inherent noise in the subsequences, various biological constraints that need to be imposed on them and the skewed cluster size distribution exacerbate the difficulty of this unsupervised learning problem. In this paper, we present a new formulation using a graph where the nodes are subsequences and edges represent homophily information. In addition, we model biological constraints providing heterophilous signal about nodes that cannot be clustered together. We solve the binning problem by developing new algorithms for (i) graph representation learning that preserves both homophily relations and heterophily constraints, and (ii) a constraint-based graph clustering method that addresses the problems of skewed cluster size distribution. Extensive experiments, on real and synthetic datasets, demonstrate that our approach, called RepBin, outperforms a wide variety of competing methods. Our constraint-based graph representation learning and clustering methods, which may be useful in other domains as well, advance the state-of-the-art in both metagenomics binning and graph representation learning.
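The dual homophily/heterophily objective can be pictured with a simple embedding loss: assembly-graph neighbors are pulled together, while cannot-link constrained pairs are pushed at least a margin apart. This is a generic contrastive sketch in the spirit of the formulation, not RepBin's actual objective.

```python
import torch

def constrained_embedding_loss(z, edges, cannot_link, margin=1.0, lam=1.0):
    """Pull homophilous (graph-edge) pairs together; repel cannot-link pairs."""
    u, v = edges
    attract = ((z[u] - z[v]) ** 2).sum(1).mean()
    a, b = cannot_link
    dist = ((z[a] - z[b]) ** 2).sum(1).clamp_min(1e-12).sqrt()
    repel = torch.clamp(margin - dist, min=0).pow(2).mean()
    return attract + lam * repel

z = torch.randn(10, 8, requires_grad=True)          # toy contig embeddings
loss = constrained_embedding_loss(
    z,
    (torch.tensor([0, 1]), torch.tensor([1, 2])),   # homophily edges
    (torch.tensor([0]), torch.tensor([5])))         # heterophily constraint
loss.backward()
```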

Transformer (3 papers)

【1】 Trees in transformers: a theoretical analysis of the Transformer's ability to represent trees Link: https://arxiv.org/abs/2112.11913

Authors: Qi He, João Sedoc, Jordan Rodu
Affiliations: New York University; University of Virginia
Abstract: Transformer networks are the de facto standard architecture in natural language processing. To date, there are no theoretical analyses of the Transformer's ability to capture tree structures. We focus on the ability of Transformer networks to learn tree structures that are important for tree transduction problems. We first analyze the theoretical capability of the standard Transformer architecture to learn tree structures given enumeration of all possible tree backbones, which we define as trees without labels. We then prove that two linear layers with ReLU activation function can recover any tree backbone from any two nonzero, linearly independent starting backbones. This implies that a Transformer can learn tree structures well in theory. We conduct experiments with synthetic data and find that the standard Transformer achieves similar accuracy compared to a Transformer where tree position information is explicitly encoded, albeit with slower convergence. This confirms empirically that Transformers can learn tree structures.

【2】 Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization Link: https://arxiv.org/abs/2112.11670

Authors: Md Tahmid Rahman Laskar, Enamul Hoque, Jimmy Xiangji Huang
Affiliations: Dialpad Canada Inc.; Information Retrieval & Knowledge Management Research Lab, York University; School of Information Technology, York University, Toronto, Ontario, Canada
Notes: The final version will be published in the Computational Linguistics journal
Abstract: The Query Focused Text Summarization (QFTS) task aims at building systems that generate the summary of the text document(s) based on the given query. A key challenge in addressing this task is the lack of large labeled data for training the summarization model. In this paper, we address this challenge by exploring a series of domain adaptation techniques. Given the recent success of pre-trained transformer models in a wide range of natural language processing tasks, we utilize such models to generate abstractive summaries for the QFTS task for both single-document and multi-document scenarios. For domain adaptation, we apply a variety of techniques using pre-trained transformer-based summarization models including transfer learning, weakly supervised learning, and distant supervision. Extensive experiments on six datasets show that our proposed approach is very effective in generating abstractive summaries for the QFTS task while setting a new state-of-the-art result in several datasets across a set of automatic and human evaluation metrics.
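As a concrete picture of the transfer-learning baseline, the sketch below fine-tunes a generic pre-trained seq2seq summarizer on a query-prepended input, which is one common way to condition a summarizer on a query. The checkpoint, the "query: ... document: ..." template, and the toy triple are assumptions for illustration; the paper's actual recipes (including weak and distant supervision) are more involved.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
opt = torch.optim.AdamW(model.parameters(), lr=3e-5)

query = "effects of caffeine on sleep"
doc = "Caffeine is a stimulant. Studies report reduced sleep duration ..."
target = "Caffeine consumption is associated with shorter sleep."

# Condition the summarizer on the query by prepending it to the document.
enc = tok(f"query: {query} document: {doc}", return_tensors="pt",
          truncation=True, max_length=512)
labels = tok(target, return_tensors="pt").input_ids

loss = model(**enc, labels=labels).loss   # standard seq2seq fine-tuning loss
loss.backward()
opt.step()
opt.zero_grad()
```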

【3】 MIA-Former: Efficient and Robust Vision Transformers via Multi-grained Input-Adaptation Link: https://arxiv.org/abs/2112.11542

Authors: Zhongzhi Yu, Yonggan Fu, Sicheng Li, Chaojian Li, Yingyan Lin
Affiliations: Department of Electrical and Computer Engineering, Rice University; Alibaba DAMO Academy
Abstract: ViTs are often too computationally expensive to be fitted onto real-world resource-constrained devices, due to (1) their quadratically increased complexity with the number of input tokens and (2) their overparameterized self-attention heads and model depth. In parallel, different images are of varied complexity and their different regions can contain various levels of visual information, indicating that treating all regions/tokens equally in terms of model complexity is unnecessary while such opportunities for trimming down ViTs' complexity have not been fully explored. To this end, we propose a Multi-grained Input-adaptive Vision Transformer framework dubbed MIA-Former that can input-adaptively adjust the structure of ViTs at three coarse-to-fine-grained granularities (i.e., model depth and the number of model heads/tokens). In particular, our MIA-Former adopts a low-cost network trained with a hybrid supervised and reinforcement training method to skip unnecessary layers, heads, and tokens in an input adaptive manner, reducing the overall computational cost. Furthermore, an interesting side effect of our MIA-Former is that its resulting ViTs are naturally equipped with improved robustness against adversarial attacks over their static counterparts, because MIA-Former's multi-grained dynamic control improves the model diversity similar to the effect of ensemble and thus increases the difficulty of adversarial attacks against all its sub-models. Extensive experiments and ablation studies validate that the proposed MIA-Former framework can effectively allocate computation budgets adaptive to the difficulty of input images meanwhile increase robustness, achieving state-of-the-art (SOTA) accuracy-efficiency trade-offs, e.g., 20% computation savings with the same or even a higher accuracy compared with SOTA dynamic transformer models.

GAN | Adversarial | Attacks | Generation (10 papers)

【1】 Detect & Reject for Transferability of Black-box Adversarial Attacks Against Network Intrusion Detection Systems Link: https://arxiv.org/abs/2112.12095

Authors: Islam Debicha, Thibault Debatty, Jean-Michel Dricot, Wim Mees, Tayeb Kenaza
Affiliations: Royal Military Academy, Brussels, Belgium; Université libre de Bruxelles, Brussels, Belgium; École Militaire Polytechnique, Algiers, Algeria
Abstract: In the last decade, the use of Machine Learning techniques in anomaly-based intrusion detection systems has seen much success. However, recent studies have shown that Machine learning in general and deep learning specifically are vulnerable to adversarial attacks where the attacker attempts to fool models by supplying deceptive input. Research in computer vision, where this vulnerability was first discovered, has shown that adversarial images designed to fool a specific model can deceive other machine learning models. In this paper, we investigate the transferability of adversarial network traffic against multiple machine learning-based intrusion detection systems. Furthermore, we analyze the robustness of the ensemble intrusion detection system, which is known for its better accuracy compared to a single model, against the transferability of adversarial attacks. Finally, we examine Detect & Reject as a defensive mechanism to limit the effect of the transferability property of adversarial network traffic against machine learning-based intrusion detection systems.
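A minimal sketch of the Detect & Reject idea: a separate detector is trained to flag adversarial traffic, and flagged inputs are rejected before they ever reach the IDS classifier. The Gaussian data, the perturbation model, and the random-forest choices below are placeholders, not the paper's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(0, 1, (500, 20))                   # stand-in network-flow features
y = rng.integers(0, 2, 500)                       # benign / malicious labels
X_adv = X[:250] + rng.normal(0, 0.5, (250, 20))   # stand-in adversarial flows

ids = RandomForestClassifier(random_state=0).fit(X, y)    # the IDS itself
detector = RandomForestClassifier(random_state=0).fit(    # adversarial vs. clean
    np.vstack([X, X_adv]),
    np.r_[np.zeros(len(X)), np.ones(len(X_adv))])

def detect_and_reject(x):
    x = x.reshape(1, -1)
    if detector.predict(x)[0] == 1:
        return "rejected"                         # never reaches the IDS
    return int(ids.predict(x)[0])

print(detect_and_reject(X[0]), detect_and_reject(X_adv[0]))
```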

【2】 Generating Synthetic Mixed-type Longitudinal Electronic Health Records for Artificial Intelligent Applications Link: https://arxiv.org/abs/2112.12047

Authors: Jin Li, Benjamin J. Cairns, Jingsong Li, Tingting Zhu
Affiliations: Zhejiang University; University of Oxford
Notes: Main article (22 pages, 7 figures); Appendix (15 pages, 8 figures)
Abstract: The recent availability of electronic health records (EHRs) have provided enormous opportunities to develop artificial intelligence (AI) algorithms. However, patient privacy has become a major concern that limits data sharing across hospital settings and subsequently hinders the advances in AI. Synthetic data, which benefits from the development and proliferation of generative models, has served as a promising substitute for real patient EHR data. However, the current generative models are limited as they only generate a single type of clinical data, i.e., either continuous-valued or discrete-valued. In this paper, we propose a generative adversarial network (GAN) entitled EHR-M-GAN which synthesizes mixed-type timeseries EHR data. EHR-M-GAN is capable of capturing the multidimensional, heterogeneous, and correlated temporal dynamics in patient trajectories. We have validated EHR-M-GAN on three publicly-available intensive care unit databases with records from a total of 141,488 unique patients, and performed privacy risk evaluation of the proposed model. EHR-M-GAN has demonstrated its superiority in performance over state-of-the-art benchmarks for synthesizing clinical timeseries with high fidelity. Notably, prediction models for outcomes of intensive care performed significantly better when training data was augmented with the addition of EHR-M-GAN-generated timeseries. EHR-M-GAN may have use in developing AI algorithms in resource-limited settings, lowering the barrier for data acquisition while preserving patient privacy.

【3】 Catch Me If You GAN: Using Artificial Intelligence for Fake Log Generation Link: https://arxiv.org/abs/2112.12006

Authors: Christian Toemmel
Affiliations: Institute for Wireless Communication and Navigation, University of Kaiserslautern, Kaiserslautern
Abstract: With artificial intelligence (AI) becoming relevant in various parts of everyday life, other technologies are already widely influenced by the new way of handling large amounts of data. Although already widespread, AI has so far had only sporadic influences on the cybersecurity field specifically. Many techniques and technologies used by cybersecurity experts function through manual labor and barely draw on automation, e.g., logs are often reviewed manually by system admins for potentially malicious keywords. This work evaluates the use of a special type of AI called generative adversarial networks (GANs) for log generation. More precisely, three different generative adversarial networks, SeqGAN, MaliGAN, and CoT, are reviewed in this research regarding their performance, focusing on generating new logs as a means of deceiving system admins for red teams. Although static generators for fake logs have been around for a while, their products are usually easy to expose as such. Using AI as an approach to this problem has not been widely researched. Identified challenges consist of formatting, dates and times, and overall consistency. Summing up the results, GANs seem not to be a good fit for generating fake logs. Their capability to detect fake logs, however, might be of use in practical scenarios.

【4】 Evaluating the Robustness of Deep Reinforcement Learning for Autonomous and Adversarial Policies in a Multi-agent Urban Driving Environment Link: https://arxiv.org/abs/2112.11947

Authors: Aizaz Sharif, Dusica Marijan
Affiliations: Simula Research Laboratory, Norway
Abstract: Deep reinforcement learning is actively used for training autonomous driving agents in a vision-based urban simulated environment. Due to the large availability of various reinforcement learning algorithms, we are still unsure of which one works better while training autonomous cars in single-agent as well as multi-agent driving environments. A comparison of deep reinforcement learning in vision-based autonomous driving will open up the possibilities for training better autonomous car policies. Also, autonomous cars trained on deep reinforcement learning-based algorithms are known for being vulnerable to adversarial attacks, and we have less information on which algorithms would act as a good adversarial agent. In this work, we provide a systematic evaluation and comparative analysis of 6 deep reinforcement learning algorithms for autonomous and adversarial driving in a four-way intersection scenario. Specifically, we first train autonomous cars using state-of-the-art deep reinforcement learning algorithms. Second, we test driving capabilities of the trained autonomous policies in single-agent as well as multi-agent scenarios. Lastly, we use the same deep reinforcement learning algorithms to train adversarial driving agents, in order to test the driving performance of autonomous cars and look for possible collision and off-road driving scenarios. We perform experiments by using vision-only high-fidelity urban driving simulated environments.

【5】 Classifier Data Quality: A Geometric Complexity Based Method for Automated Baseline And Insights Generation Link: https://arxiv.org/abs/2112.11832

Authors: George Kour, Marcel Zalmanovici, Orna Raz, Samuel Ackerman, Ateret Anaby-Tavor
Affiliations: IBM Research, Haifa; Sagol Department of Neurobiology, University of Haifa
Notes: Accepted to EDSMLS workshop at AAAI conference
Abstract: Testing Machine Learning (ML) models and AI-Infused Applications (AIIAs), or systems that contain ML models, is highly challenging. In addition to the challenges of testing classical software, it is acceptable and expected that statistical ML models sometimes output incorrect results. A major challenge is to determine when the level of incorrectness, e.g., model accuracy or F1 score for classifiers, is acceptable and when it is not. In addition to business requirements that should provide a threshold, it is a best practice to require any proposed ML solution to out-perform simple baseline models, such as a decision tree. We have developed complexity measures, which quantify how difficult given observations are to assign to their true class label; these measures can then be used to automatically determine a baseline performance threshold. These measures are superior to the best practice baseline in that, for a linear computation cost, they also quantify each observation's classification complexity in an explainable form, regardless of the classifier model used. Our experiments with both numeric synthetic data and real natural language chatbot data demonstrate that the complexity measures effectively highlight data regions and observations that are likely to be misclassified.
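The paper's geometric complexity measures are not reproduced here, but the flavor can be shown with a standard instance-hardness proxy: the fraction of a point's nearest neighbors that carry a different label, aggregated into a baseline accuracy threshold. The two-moons dataset below is a hypothetical stand-in for real data.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import NearestNeighbors

X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
nn = NearestNeighbors(n_neighbors=11).fit(X)
_, idx = nn.kneighbors(X)                      # idx[:, 0] is the point itself

# Per-observation complexity: fraction of neighbors with a different label.
# Hard points sit in regions any classifier is likely to get wrong.
hardness = (y[idx[:, 1:]] != y[:, None]).mean(axis=1)

# Heuristic automatic baseline: the accuracy an ideal local classifier
# could reach given the observed label mixing.
baseline_accuracy = 1 - hardness.mean()
print(round(baseline_accuracy, 3), hardness[:5])
```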

【6】 A Survey of Natural Language Generation Link: https://arxiv.org/abs/2112.11739

Authors: Chenhe Dong, Yinghui Li, Haifan Gong, Miaoxin Chen, Junxin Li, Ying Shen, Min Yang
Affiliations: Sun Yat-Sen University; Tsinghua University
Notes: 36 pages, 4 tables; under review
Abstract: This paper offers a comprehensive review of the research on Natural Language Generation (NLG) over the past two decades, especially in relation to data-to-text generation and text-to-text generation deep learning methods, as well as new applications of NLG technology. This survey aims to (a) give the latest synthesis of deep learning research on the NLG core tasks, as well as the architectures adopted in the field; (b) detail meticulously and comprehensively various NLG tasks and datasets, and draw attention to the challenges in NLG evaluation, focusing on different evaluation methods and their relationships; (c) highlight some future emphasis and relatively recent research issues that arise due to the increasing synergy between NLG and other artificial intelligence areas, such as computer vision, text and computational creativity.

【7】 How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness? Link: https://arxiv.org/abs/2112.11668

Authors: Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
Affiliations: Nanyang Technological University; Sea AI Lab
Notes: Accepted by NeurIPS 2021
Abstract: The fine-tuning of pre-trained language models has a great success in many NLP fields. Yet, it is strikingly vulnerable to adversarial examples, e.g., word substitution attacks using only synonyms can easily fool a BERT-based sentiment analysis model. In this paper, we demonstrate that adversarial training, the prevalent defense technique, does not directly fit a conventional fine-tuning scenario, because it suffers severely from catastrophic forgetting: failing to retain the generic and robust linguistic features that have already been captured by the pre-trained model. In this light, we propose Robust Informative Fine-Tuning (RIFT), a novel adversarial fine-tuning method from an information-theoretical perspective. In particular, RIFT encourages an objective model to retain the features learned from the pre-trained model throughout the entire fine-tuning process, whereas a conventional one only uses the pre-trained weights for initialization. Experimental results show that RIFT consistently outperforms the state-of-the-arts on two popular NLP tasks: sentiment analysis and natural language inference, under different attacks across various pre-trained language models.
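To make the contrast with conventional fine-tuning concrete, the sketch below adds a feature-retention term that anchors the fine-tuned encoder's features to the frozen pre-trained ones during adversarial fine-tuning. Note that this L2 anchor is a deliberate simplification for illustration; RIFT's actual criterion is information-theoretic, and the tiny encoder and noise-perturbed "adversarial" inputs are placeholders.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.body, self.head = nn.Linear(32, 16), nn.Linear(16, 2)
    def features(self, x):
        return torch.relu(self.body(x))
    def forward(self, x):
        return self.head(self.features(x))

pretrained, finetuned = Encoder(), Encoder()
finetuned.load_state_dict(pretrained.state_dict())   # init from pre-training
pretrained.eval()                                    # frozen reference

opt = torch.optim.Adam(finetuned.parameters(), lr=1e-3)
x, y = torch.randn(8, 32), torch.randint(0, 2, (8,))
x_adv = x + 0.1 * torch.randn_like(x)                # stand-in adversarial input

# Task loss on adversarial examples + penalty for drifting away from the
# pre-trained features (the "retain what pre-training learned" idea).
loss = nn.functional.cross_entropy(finetuned(x_adv), y) \
     + 0.5 * nn.functional.mse_loss(finetuned.features(x_adv),
                                    pretrained.features(x_adv).detach())
opt.zero_grad(); loss.backward(); opt.step()
```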

【8】 Multiple Imputation via Generative Adversarial Network for High-dimensional Blockwise Missing Value Problems Link: https://arxiv.org/abs/2112.11507

Authors: Zongyu Dai, Zhiqi Bu, Qi Long
Affiliations: Department of AMCS, University of Pennsylvania, Philadelphia, USA; Division of Biostatistics
Abstract: Missing data are present in most real world problems and need careful handling to preserve the prediction accuracy and statistical consistency in the downstream analysis. As the gold standard of handling missing data, multiple imputation (MI) methods are proposed to account for the imputation uncertainty and provide proper statistical inference. In this work, we propose Multiple Imputation via Generative Adversarial Network (MI-GAN), a deep learning-based (in specific, a GAN-based) multiple imputation method, that can work under the missing at random (MAR) mechanism with theoretical support. MI-GAN leverages recent progress in conditional generative adversarial networks and shows strong performance matching existing state-of-the-art imputation methods on high-dimensional datasets, in terms of imputation error. In particular, MI-GAN significantly outperforms other imputation methods in the sense of statistical inference and computational speed.

【9】 Adversarial Neural Networks for Error Correcting Codes Link: https://arxiv.org/abs/2112.11491

Authors: Hung T. Nguyen, Steven Bottone, Kwang Taik Kim, Mung Chiang, H. Vincent Poor
Affiliations: Princeton University; Purdue University
Notes: 6 pages, accepted to GLOBECOM 2021
Abstract: Error correcting codes are a fundamental component in modern day communication systems, demanding extremely high throughput, ultra-reliability and low latency. Recent approaches using machine learning (ML) models as the decoders offer both improved performance and great adaptability to unknown environments, where traditional decoders struggle. We introduce a general framework to further boost the performance and applicability of ML models. We propose to combine ML decoders with a competing discriminator network that tries to distinguish between codewords and noisy words, and, hence, guides the decoding models to recover transmitted codewords. Our framework is game-theoretic, motivated by generative adversarial networks (GANs), with the decoder and discriminator competing in a zero-sum game. The decoder learns to simultaneously decode and generate codewords while the discriminator learns to tell the differences between decoded outputs and codewords. Thus, the decoder is able to decode noisy received signals into codewords, increasing the probability of successful decoding. We show a strong connection of our framework with the optimal maximum likelihood decoder by proving that this decoder defines a Nash equilibrium point of our game. Hence, training to equilibrium has a good possibility of achieving the optimal maximum likelihood performance. Moreover, our framework does not require training labels, which are typically unavailable during communications, and, thus, seemingly can be trained online and adapt to channel dynamics. To demonstrate the performance of our framework, we combine it with the very recent neural decoders and show improved performance compared to the original models and traditional decoding algorithms on various codes.
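The zero-sum setup can be sketched as a small GAN-style training loop: the decoder maps noisy words back to codewords while a discriminator tries to tell decoder outputs from true codewords. The random "codewords", the AWGN noise level, and the supervised reconstruction term (kept here for simplicity, whereas the paper argues labels are not required) are all illustrative assumptions.

```python
import torch
import torch.nn as nn

n, batch = 16, 64                                  # toy n-bit words
decoder = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, n), nn.Sigmoid())
disc = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
opt_dec = torch.optim.Adam(decoder.parameters(), lr=1e-3)
opt_disc = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(200):
    c = torch.randint(0, 2, (batch, n)).float()    # stand-in codewords
    y = c + 0.3 * torch.randn_like(c)              # AWGN channel output
    # Discriminator: codewords are "real", decoder outputs are "fake".
    d_loss = bce(disc(c), torch.ones(batch, 1)) + \
             bce(disc(decoder(y).detach()), torch.zeros(batch, 1))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()
    # Decoder: reconstruct the codeword while also fooling the discriminator.
    out = decoder(y)
    g_loss = bce(out, c) + 0.1 * bce(disc(out), torch.ones(batch, 1))
    opt_dec.zero_grad(); g_loss.backward(); opt_dec.step()
```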

【10】 BACON: Deep-Learning Powered AI for Poetry Generation with Author Linguistic Style Transfer Link: https://arxiv.org/abs/2112.11483

Authors: Alejandro Rodriguez Pascual
Affiliations: Campolindo High School, Moraga, CA. Note: This paper was presented at the California Science & Engineering Fair, Los Angeles, CA. It is the result of research conducted by the author while in high school, and it is submitted to arXiv.
Notes: 9 pages, 7 figures, independent high school research project
Abstract: This paper describes BACON, a basic prototype of an automatic poetry generator with author linguistic style transfer. It combines concepts and techniques from finite state machinery, probabilistic models, artificial neural networks and deep learning, to write original poetry with rich aesthetic qualities in the style of any given author. Extrinsic evaluation of the output generated by BACON shows that participants were unable to tell the difference between human and AI-generated poems in any statistically significant way.

Semi-/Weakly-/Un-/Supervised | Uncertainty | Active Learning (3 papers)

【1】 Meta-Learning and Self-Supervised Pretraining for Real World Image Translation Link: https://arxiv.org/abs/2112.11929

Authors: Ileana Rugina, Rumen Dangovski, Mark Veillette, Pooya Khorrami, Brian Cheung, Olga Simek, Marin Soljačić
Affiliations: MIT EECS; MIT Lincoln Lab; MIT CSAIL & BCS; MIT Physics
Notes: 10 pages, 8 figures, 2 tables
Abstract: Recent advances in deep learning, in particular enabled by hardware advances and big data, have provided impressive results across a wide range of computational problems such as computer vision, natural language, or reinforcement learning. Many of these improvements, however, are constrained to problems with large-scale curated datasets which require a lot of human labor to gather. Additionally, these models tend to generalize poorly under both slight distributional shifts and low-data regimes. In recent years, emerging fields such as meta-learning or self-supervised learning have been closing the gap between proof-of-concept results and real-life applications of machine learning by extending deep learning to the semi-supervised and few-shot domains. We follow this line of work and explore spatio-temporal structure in a recently introduced image-to-image translation problem in order to: i) formulate a novel multi-task few-shot image generation benchmark and ii) explore data augmentations in contrastive pre-training for image translation downstream tasks. We present several baselines for the few-shot problem and discuss trade-offs between different approaches. Our code is available at https://github.com/irugina/meta-image-translation.

【2】 Practical Active Learning with Model Selection for Small Data Link: https://arxiv.org/abs/2112.11572

Authors: Maryam Pardakhti, Nila Mandal, Anson W. K. Ma, Qian Yang
Affiliations: Computer Science and Engineering; Chemical and Biomolecular Engineering, University of Connecticut, Storrs, USA
Notes: Accepted for publication in the Proceedings of the 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)
Abstract: Active learning is of great interest for many practical applications, especially in industry and the physical sciences, where there is a strong need to minimize the number of costly experiments necessary to train predictive models. However, there remain significant challenges for the adoption of active learning methods in many practical applications. One important challenge is that many methods assume a fixed model, where model hyperparameters are chosen a priori. In practice, it is rarely true that a good model will be known in advance. Existing methods for active learning with model selection typically depend on a medium-sized labeling budget. In this work, we focus on the case of having a very small labeling budget, on the order of a few dozen data points, and develop a simple and fast method for practical active learning with model selection. Our method is based on an underlying pool-based active learner for binary classification using support vector classification with a radial basis function kernel. First we show empirically that our method is able to find hyperparameters that lead to the best performance compared to an oracle model on less separable, difficult-to-classify datasets, and reasonable performance on datasets that are more separable and easier to classify. Then, we demonstrate that it is possible to refine our model selection method using a weighted approach to trade off between achieving optimal performance on datasets that are easy to classify, versus datasets that are difficult to classify, which can be tuned based on prior domain knowledge about the dataset.
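The underlying pool-based learner is easy to picture: fit an RBF-kernel SVC on the current labels, then query the pool point whose predicted class probability is closest to 0.5 (uncertainty sampling). The synthetic dataset, seed-set construction, and budget below are illustrative; the paper's contribution layers model selection on top of this loop.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
labeled = list(np.where(y == 0)[0][:2]) + list(np.where(y == 1)[0][:2])
pool = [i for i in range(len(X)) if i not in set(labeled)]

for _ in range(20):                                # a few-dozen-label budget
    clf = SVC(kernel="rbf", gamma="scale", probability=True)
    clf.fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])[:, 0]
    query = pool[int(np.argmin(np.abs(proba - 0.5)))]   # most uncertain point
    labeled.append(query)                          # the oracle reveals y[query]
    pool.remove(query)

print(clf.score(X[pool], y[pool]))                 # accuracy on the remainder
```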

【3】 Teacher-Student Architecture for Mixed Supervised Lung Tumor Segmentation Link: https://arxiv.org/abs/2112.11541

Authors: Vemund Fredriksen, Svein Ole M. Svele, André Pedersen, Thomas Langø, Gabriel Kiss, Frank Lindseth
Affiliations: Department of Computer Science, Norwegian University of Science and Technology; Department of Health Research, Medical Technology, SINTEF, Trondheim, Norway; Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology
Notes: 17 pages, 3 figures, 5 tables, submitted to journal
Abstract: Purpose: Automating tasks such as lung tumor localization and segmentation in radiological images can free valuable time for radiologists and other clinical personnel. Convolutional neural networks may be suited for such tasks, but require substantial amounts of labeled data to train. Obtaining labeled data is a challenge, especially in the medical domain. Methods: This paper investigates the use of a teacher-student design to utilize datasets with different types of supervision to train an automatic model performing pulmonary tumor segmentation on computed tomography images. The framework consists of two models: the student that performs end-to-end automatic tumor segmentation and the teacher that supplies the student additional pseudo-annotated data during training. Results: Using only a small proportion of semantically labeled data and a large number of bounding box annotated data, we achieved competitive performance using a teacher-student design. Models trained on larger amounts of semantic annotations did not perform better than those trained on teacher-annotated data. Conclusions: Our results demonstrate the potential of utilizing teacher-student designs to reduce the annotation load, as less supervised annotation schemes may be performed, without any real degradation in segmentation accuracy.

Transfer | Zero/Few/One-Shot | Adaptation (3 papers)

【1】 ALP: Data Augmentation using Lexicalized PCFGs for Few-Shot Text Classification Link: https://arxiv.org/abs/2112.11916

Authors: Hazel Kim, Daecheol Woo, Seong Joon Oh, Jeong-Won Cha, Yo-Sub Han
Affiliations: Yonsei University, Seoul, Republic of Korea; NAVER AI Lab; Changwon National University, Changwon, Republic of Korea
Notes: Accepted to AAAI 2022
Abstract: Data augmentation has been an important ingredient for boosting performances of learned models. Prior data augmentation methods for few-shot text classification have led to great performance boosts. However, they have not been designed to capture the intricate compositional structure of natural language. As a result, they fail to generate samples with plausible and diverse sentence structures. Motivated by this, we present the data Augmentation using Lexicalized Probabilistic context-free grammars (ALP) that generates augmented samples with diverse syntactic structures with plausible grammar. The lexicalized PCFG parse trees consider both the constituents and dependencies to produce a syntactic frame that maximizes a variety of word choices in a syntactically preservable manner without specific domain experts. Experiments on few-shot text classification tasks demonstrate that ALP enhances many state-of-the-art classification methods. As a second contribution, we delve into the train-val splitting methodologies when a data augmentation method comes into play. We argue empirically that the traditional splitting of training and validation sets is sub-optimal compared to our novel augmentation-based splitting strategies that further expand the training split with the same number of labeled data. Taken together, our contributions on the data augmentation strategies yield a strong training recipe for few-shot text classification tasks.
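To ground the PCFG machinery, here is a minimal sampler over a toy probabilistic grammar using NLTK: productions are drawn in proportion to their probabilities, yielding sentences with varied syntactic structures. The tiny grammar is hypothetical; ALP additionally lexicalizes the grammar using constituency and dependency information.

```python
import random
from nltk.grammar import PCFG, Nonterminal

# Hypothetical toy grammar; ALP induces and lexicalizes PCFGs from data.
grammar = PCFG.fromstring("""
S -> NP VP [1.0]
NP -> Det N [0.6] | 'they' [0.4]
VP -> V NP [0.7] | V [0.3]
Det -> 'the' [0.5] | 'a' [0.5]
N -> 'model' [0.5] | 'sample' [0.5]
V -> 'generates' [0.6] | 'trains' [0.4]
""")

def sample(symbol):
    """Expand a symbol, sampling productions by their probabilities."""
    if not isinstance(symbol, Nonterminal):
        return [symbol]                        # terminal: emit the word
    prods = grammar.productions(lhs=symbol)
    prod = random.choices(prods, weights=[p.prob() for p in prods])[0]
    return [word for sym in prod.rhs() for word in sample(sym)]

print(" ".join(sample(grammar.start())))       # e.g. "the model generates a sample"
```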

【2】 Few-Shot Object Detection: A Survey Link: https://arxiv.org/abs/2112.11699

Authors: Mona Köhler, Markus Eisenbach, Horst-Michael Gross
Affiliations: Ilmenau University of Technology
Notes: 24 pages, 13 figures, submitted to IEEE Transactions on Neural Networks and Learning Systems
Abstract: Humans are able to learn to recognize new objects even from a few examples. In contrast, training deep-learning-based object detectors requires huge amounts of annotated data. To avoid the need to acquire and annotate these huge amounts of data, few-shot object detection aims to learn from few object instances of new categories in the target domain. In this survey, we provide an overview of the state of the art in few-shot object detection. We categorize approaches according to their training scheme and architectural layout. For each type of approach, we describe the general realization as well as concepts to improve the performance on novel categories. Whenever appropriate, we give short takeaways regarding these concepts in order to highlight the best ideas. Eventually, we introduce commonly used datasets and their evaluation protocols and analyze reported benchmark results. As a result, we emphasize common challenges in evaluation and identify the most promising current trends in this emerging field of few-shot object detection.

【3】 Convolutional neural network based on transfer learning for breast cancer screening Link: https://arxiv.org/abs/2112.11629

Authors: Hussin Ragb, Redha Ali, Elforjani Jera, Nagi Buaossa
Affiliations: Department of Electrical and Computer Engineering, Christian Brothers University, Memphis, TN; University of Dayton, Dayton, Ohio; Department of Electro Optics and Photonics
Notes: 9 pages, 7 figures. arXiv admin note: text overlap with arXiv:2009.08831
Abstract: Breast cancer is the most common cancer in the world and the most prevalent cause of death among women worldwide. Nevertheless, it is also one of the most treatable malignancies if detected early. In this paper, a deep convolutional neural network-based algorithm is proposed to aid in accurately identifying breast cancer from ultrasonic images. In this algorithm, several neural networks are fused in a parallel architecture to perform the classification process, and voting criteria are applied in the final classification decision between the candidate object classes, where the output of each neural network represents a single vote. Several experiments were conducted on the breast ultrasound dataset consisting of 537 benign, 360 malignant, and 133 normal images. These experiments show promising results and demonstrate the capability of the proposed model to outperform many state-of-the-art algorithms on several measures. Using k-fold cross-validation and a bagging classifier ensemble, we achieved an accuracy of 99.5% and a sensitivity of 99.6%.
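The parallel-fusion decision rule is ordinary majority voting, as in the sketch below: each network in the ensemble casts one vote per image, and the most-voted class wins. The stub models stand in for the actual fine-tuned CNN branches.

```python
import numpy as np

class StubCNN:                                    # stand-in for a fine-tuned CNN
    def __init__(self, seed):
        self.rng = np.random.default_rng(seed)
    def predict(self, images):                    # returns class probabilities
        return self.rng.random((len(images), 3))  # benign / malignant / normal

def ensemble_vote(models, images):
    """Each parallel network casts one vote; the majority class wins."""
    votes = np.stack([m.predict(images).argmax(axis=1) for m in models])
    return np.array([np.bincount(v, minlength=3).argmax() for v in votes.T])

models = [StubCNN(s) for s in range(5)]
print(ensemble_vote(models, np.zeros((4, 128, 128))))
```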

Reinforcement Learning (1 paper)

【1】 Do Androids Dream of Electric Fences? Safety-Aware Reinforcement Learning with Latent Shielding Link: https://arxiv.org/abs/2112.11490

Authors: Peter He, Borja G. Leon, Francesco Belardinelli
Affiliations: Department of Computing, Imperial College London; Wellcome EPSRC Centre for Interventional and Surgical Sciences, University College London; Apricity, London
Notes: Accepted at SafeAI 2022
Abstract: The growing trend of fledgling reinforcement learning systems making their way into real-world applications has been accompanied by growing concerns for their safety and robustness. In recent years, a variety of approaches have been put forward to address the challenges of safety-aware reinforcement learning; however, these methods often either require a handcrafted model of the environment to be provided beforehand, or that the environment is relatively simple and low-dimensional. We present a novel approach to safety-aware deep reinforcement learning in high-dimensional environments called latent shielding. Latent shielding leverages internal representations of the environment learnt by model-based agents to "imagine" future trajectories and avoid those deemed unsafe. We experimentally demonstrate that this approach leads to improved adherence to formally-defined safety specifications.
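The shielding step itself is simple to sketch: roll candidate actions forward through a dynamics model and discard any action whose imagined trajectory violates the safety specification. In latent shielding the rollouts happen in the latent space of a learned world model; everything below (the toy known dynamics, box constraint, and horizon) is an illustrative stand-in.

```python
import numpy as np

def model_step(state, action):          # stand-in for the agent's *learned*
    return state + 0.1 * action         # latent dynamics model

def violates(state):                    # formally-defined safety specification
    return np.abs(state).max() > 1.0    # here: stay inside the unit box

def shield(state, candidate_actions, horizon=5):
    """Keep only actions whose imagined trajectories stay safe."""
    safe = []
    for a in candidate_actions:
        x, ok = state.copy(), True
        for _ in range(horizon):        # "imagine" the future under action a
            x = model_step(x, a)
            if violates(x):
                ok = False
                break
        if ok:
            safe.append(a)
    return safe

actions = [np.array([1.0]), np.array([-1.0]), np.array([0.0])]
print(shield(np.array([0.8]), actions))   # the forward push gets shielded out
```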

Symbolic | Symbolic Learning (2 papers)

【1】 Neural-Symbolic Integration for Interactive Learning and Conceptual Grounding Link: https://arxiv.org/abs/2112.11805

Authors: Benedikt Wagner, Artur d'Avila Garcez
Affiliations: Department of Computer Science, City, University of London, UK
Abstract: We propose neural-symbolic integration for abstract concept explanation and interactive learning. Neural-symbolic integration and explanation allow users and domain experts to learn about the data-driven decision making process of large neural models. The models are queried using a symbolic logic language. Interaction with the user then confirms or rejects a revision of the neural model using logic-based constraints that can be distilled into the model architecture. The approach is illustrated using the Logic Tensor Network framework alongside Concept Activation Vectors and applied to a Convolutional Neural Network.

【2】 Analytical Modelling of Exoplanet Transit Spectroscopy with Dimensional Analysis and Symbolic Regression Link: https://arxiv.org/abs/2112.11600

Authors: Konstantin T. Matchev, Katia Matcheva, Alexander Roman
Affiliations: Physics Department, University of Florida, Gainesville, FL, USA
Notes: Submitted to AAS Journals, 24 pages, 7 figures
Abstract: The physical characteristics and atmospheric chemical composition of newly discovered exoplanets are often inferred from their transit spectra which are obtained from complex numerical models of radiative transfer. Alternatively, simple analytical expressions provide insightful physical intuition into the relevant atmospheric processes. The deep learning revolution has opened the door for deriving such analytical results directly with a computer algorithm fitting to the data. As a proof of concept, we successfully demonstrate the use of symbolic regression on synthetic data for the transit radii of generic hot Jupiter exoplanets to derive a corresponding analytical formula. As a preprocessing step, we use dimensional analysis to identify the relevant dimensionless combinations of variables and reduce the number of independent inputs, which improves the performance of the symbolic regression. The dimensional analysis also allowed us to mathematically derive and properly parametrize the most general family of degeneracies among the input atmospheric parameters which affect the characterization of an exoplanet atmosphere through transit spectroscopy.
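As a flavor of the pipeline, the sketch below runs genetic-programming symbolic regression (via the gplearn library, one possible stand-in for the authors' tooling) on dimensionless inputs to recover a simple analytic law from synthetic data. The target formula and variable meanings are hypothetical, not the paper's transit-radius expression.

```python
import numpy as np
from gplearn.genetic import SymbolicRegressor

# Dimensional analysis first: reduce raw inputs to dimensionless groups,
# then fit. Hypothetical synthetic target standing in for a transit formula.
rng = np.random.default_rng(0)
X = rng.uniform(0.1, 2.0, size=(500, 2))    # e.g. X0, X1 = dimensionless groups
y = X[:, 0] * np.log(X[:, 1]) + 1.0

est = SymbolicRegressor(population_size=1000, generations=20,
                        function_set=("add", "sub", "mul", "div", "log"),
                        parsimony_coefficient=0.01, random_state=0)
est.fit(X, y)
print(est._program)                         # the recovered analytical formula
```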

Medical (3 papers)

【1】 Community Detection in Medical Image Datasets: Using Wavelets and Spectral Methods Link: https://arxiv.org/abs/2112.12021

Authors: Roozbeh Yousefzadeh
Affiliations: Yale Center for Medical Informatics and VA Connecticut Healthcare System
Abstract: Medical image datasets can have large numbers of images representing patients with different health conditions and various disease severity. When dealing with raw unlabeled image datasets, the large number of samples often makes it hard for experts and non-experts to understand the variety of images present in a dataset. Supervised learning methods rely on labeled images, which requires a considerable effort by medical experts to first understand the communities of images present in the data and then label the images. Here, we propose an algorithm to facilitate the automatic identification of communities in medical image datasets. We further explain that such analysis can also be insightful in a supervised setting, when the images are already labeled. Such insights are useful because in reality, health and disease severity can be considered a continuous spectrum, and within each class, there usually are finer communities worthy of investigation, especially when they have similarities to communities in other classes. In our approach, we use wavelet decomposition of images in tandem with spectral methods. We show that the eigenvalues of a graph Laplacian can reveal the number of notable communities in an image dataset. In our experiments, we use a dataset of images labeled with different conditions for COVID patients. We detect 25 communities in the dataset and then observe that only 6 of those communities contain patients with pneumonia. We also investigate the contents of a colorectal cancer histopathology dataset.
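The two ingredients (wavelet features and Laplacian spectra) combine into a short recipe: represent each image by its wavelet approximation coefficients, build a similarity graph, and read the number of notable communities off the largest eigengap of the normalized Laplacian. The toy images, Haar wavelet, and RBF bandwidth below are illustrative choices, not the paper's exact settings.

```python
import numpy as np
import pywt

def count_communities(images, sigma=3.0, kmax=10):
    """Estimate the number of notable communities via the Laplacian eigengap."""
    # Wavelet decomposition as the image representation (level-2 approximation).
    feats = np.stack([pywt.wavedec2(im, "haar", level=2)[0].ravel()
                      for im in images])
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))             # RBF similarity graph
    np.fill_diagonal(W, 0)
    Dm = np.diag(1 / np.sqrt(W.sum(1).clip(1e-12)))
    L = np.eye(len(W)) - Dm @ W @ Dm               # normalized graph Laplacian
    ev = np.sort(np.linalg.eigvalsh(L))[:kmax]
    return int(np.argmax(np.diff(ev))) + 1         # largest-eigengap heuristic

imgs = [np.random.rand(32, 32) + (i % 3) for i in range(30)]   # 3 toy groups
print(count_communities(imgs))
```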

【2】 The Phonetic Footprint of Parkinson's Disease 标题:帕金森病的语音足迹 链接:https://arxiv.org/abs/2112.11514

作者:Philipp Klumpp,Tomás Arias-Vergara,Juan Camilo Vásquez-Correa,Paula Andrea Pérez-Toro,Juan Rafael Orozco-Arroyave,Anton Batliner,Elmar Nöth 机构:Friedrich-Alexander-University Erlangen-Nuremberg, Pattern Recognition Lab, Martensstrasse , Erlangen, Germany, Universidad de Antioquia, Medellín, Colombia, Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg 备注:None 摘要:帕金森病(Parkinson's disease,PD)是最常见的神经退行性疾病之一,对患者的精细运动技能有重要影响。言语产生过程中不同发音器官之间复杂的协同作用以及所需肌肉张力的实现变得越来越困难,从而导致构音障碍。在受影响的个体中,经常可以观察到元音不稳定、发音模糊和语速缓慢等特征模式,以往的研究曾分析这些模式以确定PD的存在和进展。在这项工作中,我们使用仅在健康语音数据上训练的音素识别器来研究PD如何影响患者的语音足迹。尽管我们的系统此前从未见过任何病理性语音,但我们重新发现了许多先前文献中描述过的模式。此外,我们还证明来自神经网络的中间激活可以作为特征向量,编码与个体疾病状态相关的信息。我们还能够将专家评定的说话人可懂度与音素预测的平均置信度直接关联起来。我们的结果支持这样的假设:训练能够分析PD语音的系统不一定需要病理数据。 摘要:As one of the most prevalent neurodegenerative disorders, Parkinson's disease (PD) has a significant impact on the fine motor skills of patients. The complex interplay of different articulators during speech production and realization of required muscle tension become increasingly difficult, thus leading to a dysarthric speech. Characteristic patterns such as vowel instability, slurred pronunciation and slow speech can often be observed in the affected individuals and were analyzed in previous studies to determine the presence and progression of PD. In this work, we used a phonetic recognizer trained exclusively on healthy speech data to investigate how PD affected the phonetic footprint of patients. We rediscovered numerous patterns that had been described in previous contributions although our system had never seen any pathological speech previously. Furthermore, we could show that intermediate activations from the neural network could serve as feature vectors encoding information related to the disease state of individuals. We were also able to directly correlate the expert-rated intelligibility of a speaker with the mean confidence of phonetic predictions. Our results support the assumption that pathological data is not necessarily required to train systems that are capable of analyzing PD speech.
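论文中"专家评定的可懂度与音素预测平均置信度直接相关"这一步,可以用如下极简示意来理解(数据与变量名均为假设性占位):

```python
# 示意:音素后验的平均置信度与专家可懂度评分的相关分析(假设性示例)
import numpy as np
from scipy.stats import pearsonr

# posteriors[i]: 第 i 位说话人所有帧的音素后验概率矩阵 (帧数 x 音素数)
posteriors = [np.random.dirichlet(np.ones(40), size=200) for _ in range(30)]
intelligibility = np.random.uniform(0, 100, size=30)   # 专家可懂度评分(占位)

mean_conf = np.array([p.max(axis=1).mean() for p in posteriors])  # 每人的平均置信度
r, pval = pearsonr(mean_conf, intelligibility)
print(f"Pearson r = {r:.3f}, p = {pval:.3g}")
```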

【3】 Multi-Coil MRI Reconstruction Challenge -- Assessing Brain MRI Reconstruction Models and their Generalizability to Varying Coil Configurations 标题:多线圈MRI重建的挑战--评价脑MRI重建模型及其对不同线圈结构的泛化能力 链接:https://arxiv.org/abs/2011.07952

作者:Youssef Beauferris,Jonas Teuwen,Dimitrios Karkalousos,Nikita Moriakov,Mattha Caan,George Yiasemis,Lívia Rodrigues,Alexandre Lopes,Hélio Pedrini,Letícia Rittner,Maik Dannecker,Viktor Studenyak,Fabian Gröger,Devendra Vyas,Shahrooz Faghih-Roohi,Amrit Kumar Jethi,Jaya Chandra Raju,Mohanasankar Sivaprakasam,Mike Lasby,Nikita Nogovitsyn,Wallace Loos,Richard Frayne,Roberto Souza 机构:Radiology and Clinical Neurosciences, University of Calgary, Canada, Hotchkiss Brain Institute, University of Calgary, Canada, Seaman Family MR Research Centre, Foothills Medical Center, Canada 摘要:基于深度学习的脑磁共振成像(MRI)重建方法有可能加速MRI采集过程。然而,科学界缺乏适当的基准来评估高分辨率脑图像的MRI重建质量,以及评估这些提出的算法在存在微小但预期的数据分布变化时的表现。多线圈磁共振图像(MC-MRI)重建挑战提供了一个基准,旨在解决这些问题,使用高分辨率、三维、T1加权MRI扫描的大型数据集。这项挑战有两个主要目标:1)比较此数据集上的不同MRI重建模型,2)评估这些模型与不同数量接收器线圈采集的数据的通用性。在本文中,我们描述了挑战性实验设计,并总结了一组基线和最先进的脑MRI重建模型的结果。我们提供了当前MRI重建最新技术的相关比较信息,并强调了在更广泛的临床应用之前获得通用模型的挑战。MC-MRI基准数据、评估代码和当前挑战排行榜可公开获取。它们为脑MRI重建领域的未来发展提供了客观的性能评估。 摘要:Deep-learning-based brain magnetic resonance imaging (MRI) reconstruction methods have the potential to accelerate the MRI acquisition process. Nevertheless, the scientific community lacks appropriate benchmarks to assess MRI reconstruction quality of high-resolution brain images, and evaluate how these proposed algorithms will behave in the presence of small, but expected data distribution shifts. The Multi-Coil Magnetic Resonance Image (MC-MRI) Reconstruction Challenge provides a benchmark that aims at addressing these issues, using a large dataset of high-resolution, three-dimensional, T1-weighted MRI scans. The challenge has two primary goals: 1) to compare different MRI reconstruction models on this dataset and 2) to assess the generalizability of these models to data acquired with a different number of receiver coils. In this paper, we describe the challenge experimental design, and summarize the results of a set of baseline and state of the art brain MRI reconstruction models. We provide relevant comparative information on the current MRI reconstruction state-of-the-art and highlight the challenges of obtaining generalizable models that are required prior to broader clinical adoption. The MC-MRI benchmark data, evaluation code and current challenge leaderboard are publicly available. They provide an objective performance assessment for future developments in the field of brain MRI reconstruction.

推荐(1篇)

【1】 Movie Recommender System using critic consensus 标题:基于评论家共识的电影推荐系统 链接:https://arxiv.org/abs/2112.11854

作者:A Nayan Varma,Kedareshwara Petluri 机构:Department of CSE, PES University, Bangalore, India 备注:4 pages, IEEE 2021 International Conference on Advances in Computing, Communication and Control (ICAC3'21) 7thEdition (3rd and 4th December 2021) 摘要:推荐系统可能是推动现代互联网世界中行业增长的最重要因素之一。以往的推荐系统方法包括协同过滤和基于内容的过滤。这两种方法本质上是割裂的,并且需要持续存储用户偏好才能获得更好的推荐。为了更好地整合这两个过程,我们提出了一个将协同过滤与基于内容的过滤相结合的混合推荐系统,同时考虑了顶级影评人的共识和电影评分。我们希望提出一个新的模型,根据用户偏好与影评人共识分数的组合来推荐电影。 摘要:Recommendation systems are perhaps one of the most important agents for industry growth through the modern Internet world. Previous approaches on recommendation systems include collaborative filtering and content based filtering recommendation systems. These 2 methods are disjointed in nature and require the continuous storage of user preferences for a better recommendation. To provide better integration of the two processes, we propose a hybrid recommendation system based on the integration of collaborative and content-based content, taking into account the top critic consensus and movie rating score. We would like to present a novel model that recommends movies based on the combination of user preferences and critical consensus scores.
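作为理解该混合思路的一个极简示意(非论文实现,权重与输入均为假设),可以把三种得分线性加权后排序:

```python
# 示意:协同过滤得分、内容相似度与影评人共识分数的加权融合(假设性示例)
import numpy as np

def hybrid_score(cf_score, content_sim, critic_consensus,
                 w_cf=0.5, w_content=0.3, w_critic=0.2):
    # 三个输入均假定已归一化到 [0, 1];权重为示例取值
    return w_cf * cf_score + w_content * content_sim + w_critic * critic_consensus

cf = np.array([0.8, 0.4, 0.6])    # 协同过滤预测评分
cs = np.array([0.7, 0.9, 0.2])    # 基于内容的相似度
cc = np.array([0.95, 0.5, 0.8])   # 影评人共识分数(例如新鲜度,假设量)
ranking = np.argsort(-hybrid_score(cf, cs, cc))
print("推荐顺序(电影索引):", ranking)
```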

自动驾驶|车辆|车道检测等(1篇)

【1】 Exploring Credibility Scoring Metrics of Perception Systems for Autonomous Driving 标题:自动驾驶感知系统可信度评分指标探讨 链接:https://arxiv.org/abs/2112.11643

作者:Viren Khandal,Arth Vidyarthi 机构:University of California, Berkeley, Berkeley, California, United States 备注:In 14th International Conference on COMmunication Systems & NETworkS (COMSNETS) Intelligent Transportation Systems 2022 摘要:自动和半自动车辆的感知算法可能遇到错误的目标检测情况,例如道路上的目标分类错误,这可能导致安全违规和潜在的致命后果。虽然在目标检测算法和在线度量学习的稳健性方面已经做了大量工作,但很少有关于基准评分度量的研究,以确定潜在错误分类的任何可能指标。重点放在探索在线获取这些评分指标的潜力,以便让AV在实时约束条件下做出基于感知的决策。在这项工作中,我们探讨了当感知算法和对象检测器出现故障时,哪些指标(如果有的话)可以作为在线指标。我们的工作提供了更好的设计原则和在线度量特征的见解,以准确评估对象检测器的可信度。我们的方法对图像采用非对抗性和真实的扰动,并在此基础上评估各种定量指标。我们发现,离线指标可以用来解释现实世界中的腐败,如恶劣的天气条件,对这些指标的分析可以为在线指标的设计提供一个步骤。这是一个明确的下一步,因为它可以实现无错误的自动车辆感知和更安全的时间关键和安全关键决策。 摘要:Autonomous and semi-autonomous vehicles' perception algorithms can encounter situations with erroneous object detection, such as misclassification of objects on the road, which can lead to safety violations and potentially fatal consequences. While there has been substantial work in the robustness of object detection algorithms and online metric learning, there is little research on benchmarking scoring metrics to determine any possible indicators of potential misclassification. An emphasis is put on exploring the potential of taking these scoring metrics online in order to allow the AV to make perception-based decisions given real-time constraints. In this work, we explore which, if any, metrics act as online indicators of when perception algorithms and object detectors are failing. Our work provides insight on better design principles and characteristics of online metrics to accurately evaluate the credibility of object detectors. Our approach employs non-adversarial and realistic perturbations to images, on which we evaluate various quantitative metrics. We found that offline metrics can be designed to account for real-world corruptions such as poor weather conditions and that the analysis of such metrics can provide a segue into designing online metrics. This is a clear next step as it can allow for error-free autonomous vehicle perception and safer time-critical and safety-critical decision-making.

联邦学习|隐私保护|加密(3篇)

【1】 FedLGA: Towards System-Heterogeneity of Federated Learning via Local Gradient Approximation 标题:FedLGA:通过局部梯度逼近应对联邦学习的系统异构性 链接:https://arxiv.org/abs/2112.11989

作者:Xingyu Li,Zhe Qu,Bo Tang,Zhuo Lu 机构:Xingyu Li and Bo Tang are with the Department of Electrical and Computer Engineering, Mississippi State University 摘要:联邦学习(FL)是一种分散的机器学习体系结构,它利用大量远程设备,基于分布式训练数据学习联合模型。然而,系统异构性是FL网络实现鲁棒分布式学习性能的一个主要挑战,它包括两个方面:i)由于设备之间计算能力不同而导致的设备异构性;ii)由于网络中数据分布不一致而导致的数据异构性。尽管已经有针对异构FL的基准方法,例如FedProx,但之前的研究缺乏形式化,该问题仍然悬而未决。在这项工作中,我们形式化了系统异构FL问题,并提出了一种称为FedLGA的新算法,该算法通过梯度逼近来弥合本地模型更新之间的分歧。为了实现这一点,FedLGA提供了一种交替的Hessian估计方法,只需在聚合器上增加额外的线性复杂度。理论上,我们证明了当设备异构比率为$\rho$时,对于非凸优化问题,FedLGA在非i.i.d分布的FL训练数据上,在完全参与和部分参与两种方案下分别达到$\mathcal{O}\left(\frac{1+\rho}{\sqrt{ENT}}+\frac{1}{T}\right)$和$\mathcal{O}\left(\frac{(1+\rho)\sqrt{E}}{\sqrt{TK}}+\frac{1}{T}\right)$的收敛速度,其中$E$是本地学习轮数,$T$是总通信轮数,$N$是设备总数,$K$是部分参与方案下单轮通信中所选设备的数量。在多个数据集上的综合实验结果表明,FedLGA在应对系统异构性方面优于当前的FL基准方法。 摘要:Federated Learning (FL) is a decentralized machine learning architecture, which leverages a large number of remote devices to learn a joint model with distributed training data. However, the system-heterogeneity is one major challenge in a FL network to achieve robust distributed learning performance, which is of two aspects: i) device-heterogeneity due to the diverse computational capacity among devices; ii) data-heterogeneity due to the non-identically distributed data across the network. Though there have been benchmarks against the heterogeneous FL, e.g., FedProx, the prior studies lack formalization and it remains an open problem. In this work, we formalize the system-heterogeneous FL problem and propose a new algorithm, called FedLGA, which addresses this problem by bridging the divergence of local model updates via gradient approximation. To achieve this, FedLGA provides an alternated Hessian estimation method, which only requires extra linear complexity on the aggregator. Theoretically, we show that with a device-heterogeneous ratio $\rho$, FedLGA achieves convergence rates on non-i.i.d distributed FL training data against non-convex optimization problems for $\mathcal{O} \left( \frac{(1+\rho)}{\sqrt{ENT}} + \frac{1}{T} \right)$ and $\mathcal{O} \left( \frac{(1+\rho)\sqrt{E}}{\sqrt{TK}} + \frac{1}{T} \right)$ for full and partial device participation respectively, where $E$ is the number of local learning epoch, $T$ is the number of total communication round, $N$ is the total device number and $K$ is the number of selected device in one communication round under partially participation scheme. The results of comprehensive experiments on multiple datasets show that FedLGA outperforms current FL benchmarks against the system-heterogeneity.

【2】 FLoBC: A Decentralized Blockchain-Based Federated Learning Framework 标题:FLoBC:一种基于区块链的去中心化联邦学习框架 链接:https://arxiv.org/abs/2112.11873

作者:Mohamed Ghanem,Fadi Dawoud,Habiba Gamal,Eslam Soliman,Hossam Sharara,Tamer El-Batt 机构: A Decentralized Blockchain-BasedFederated Learning FrameworkThe American University in Cairo – Computer Science and Engineering DepartmentMohamed Ghanemoscar 备注:BSc Computer Engineering Thesis [AUC, Spring 21] 摘要:数据在世界范围内的快速扩展需要更多的分布式解决方案,以便在更大范围内应用机器学习。由此产生的分布式学习系统可以有不同程度的集中。在这项工作中,我们展示了我们的解决方案FLoBC,用于使用区块链技术构建一个通用的分散联邦学习系统,适应任何与梯度下降优化兼容的机器学习模型。我们介绍了我们的系统设计,包括两个分散的参与者:训练师和验证器,以及我们确保所述系统可靠高效运行的方法。最后,我们利用FLoBC作为实验沙箱,比较和对比训练师与验证人比率、奖惩策略和模型同步方案对系统整体性能的影响,最终通过示例说明,分散的联邦学习系统确实是更集中的体系结构的可行替代方案。 摘要:The rapid expansion of data worldwide invites the need for more distributed solutions in order to apply machine learning on a much wider scale. The resultant distributed learning systems can have various degrees of centralization. In this work, we demonstrate our solution FLoBC for building a generic decentralized federated learning system using blockchain technology, accommodating any machine learning model that is compatible with gradient descent optimization. We present our system design comprising the two decentralized actors: trainer and validator, alongside our methodology for ensuring reliable and efficient operation of said system. Finally, we utilize FLoBC as an experimental sandbox to compare and contrast the effects of trainer-to-validator ratio, reward-penalty policy, and model synchronization schemes on the overall system performance, ultimately showing by example that a decentralized federated learning system is indeed a feasible alternative to more centralized architectures.

【3】 On-the-fly Resource-Aware Model Aggregation for Federated Learning in Heterogeneous Edge 标题:异构边缘联邦学习的即时资源感知模型聚合 链接:https://arxiv.org/abs/2112.11485

作者:Hung T. Nguyen,Roberto Morabito,Kwang Taik Kim,Mung Chiang 机构:∗Princeton University, †Purdue University 备注:6 pages, accepted to GLOBECOM 2021 摘要:边缘计算凭借其灵活、安全和高性能的特点,彻底改变了移动和无线网络领域。最近,我们看到人们越来越多地利用边缘计算来更好地部署联邦学习(FL)等机器学习(ML)技术。与传统的分布式机器学习(ML)相比,FL的提出旨在提高通信效率。最初的FL假设存在一个中央聚合服务器来聚合本地优化的参数,这可能带来可靠性和延迟问题。在本文中,我们深入研究了在每一轮FL优化中,根据当前参与者和/或可用资源动态选择一个"飞行主节点"(flying master)来取代中央服务器的策略。具体来说,我们比较了选择该飞行主节点的不同指标,并评估了执行选择的共识算法。基于在我们的EdgeAI试验台以及使用可运营边缘试验台的真实5G网络上进行的测量,我们的结果表明,与原始FL相比,使用飞行主节点FL框架可以显著减少运行时间。 摘要:Edge computing has revolutionized the world of mobile and wireless networks world thanks to its flexible, secure, and performing characteristics. Lately, we have witnessed the increasing use of it to make more performing the deployment of machine learning (ML) techniques such as federated learning (FL). FL was debuted to improve communication efficiency compared to conventional distributed machine learning (ML). The original FL assumes a central aggregation server to aggregate locally optimized parameters and might bring reliability and latency issues. In this paper, we conduct an in-depth study of strategies to replace this central server by a flying master that is dynamically selected based on the current participants and/or available resources at every FL round of optimization. Specifically, we compare different metrics to select this flying master and assess consensus algorithms to perform the selection. Our results demonstrate a significant reduction of runtime using our flying master FL framework compared to the original FL from measurements results conducted in our EdgeAI testbed and over real 5G networks using an operational edge testbed.

推理|分析|理解|解释(1篇)

【1】 A Unified Analysis Method for Online Optimization in Normed Vector Space 标题:赋范向量空间中在线优化的统一分析方法 链接:https://arxiv.org/abs/2112.12134

作者:Qingxin Meng,Jianwei Liu 机构:Department of Automation, College of Information Science and Engineering, China University of Petroleum-Beijing, Beijing, China 摘要:我们提出了一种基于广义余弦规则和$\phi$-凸的统一分析方法,在赋范向量空间中使用动态后悔作为性能度量进行在线优化。在结合更新规则时,我们从策略$S$(一种包含替代线性化损失的乐观FTRL的双参数变量策略)开始,通过松弛得到$S$-I(I型松弛变量形式$S$)和$S$-II(II型松弛变量形式$S$,即乐观MD)。$S$-I和$S$-II的遗憾界限是最严格的。作为实例,归一化指数次梯度和贪婪/懒惰投影的遗憾界优于目前已知的最优结果。我们将在线凸优化扩展到在线单调优化,用单调算子代替在线游戏的损失,扩展遗憾的定义,即遗憾$^n$,并扩展了$S$-I和$S$-II的应用范围。 摘要:We present a unified analysis method that relies on the generalized cosine rule and $\phi$-convex for online optimization in normed vector space using dynamic regret as the performance metric. In combing the update rules, we start with strategy $S$ (a two-parameter variant strategy covering Optimistic-FTRL with surrogate linearized losses), and obtain $S$-I (type-I relaxation variant form of $S$) and $S$-II (type-II relaxation variant form of $S$, which is Optimistic-MD) by relaxation. Regret bounds for $S$-I and $S$-II are the tightest possible. As instantiations, regret bounds of normalized exponentiated subgradient and greedy/lazy projection are better than the currently known optimal results. We extend online convex optimization to online monotone optimization, by replacing losses of online game with monotone operators and extending the definition of regret, namely regret$^n$, and expand the application scope of $S$-I and $S$-II.
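作为实例之一的归一化指数次梯度,其更新可写为 $x_{t+1,i} \propto x_{t,i}\exp(-\eta g_{t,i})$。下面是一个在概率单纯形上的最小数值示意(步长取值为示例假设):

```python
# 示意:单纯形上的归一化指数次梯度更新(镜面下降的一个实例;假设性示例)
import numpy as np

def exponentiated_subgradient_step(x, g, eta):
    # x: 单纯形上的当前点; g: 当前损失的(次)梯度; eta: 步长
    w = x * np.exp(-eta * g)
    return w / w.sum()          # 归一化回概率单纯形

x = np.full(4, 0.25)
for t in range(100):
    g = np.array([0.3, -0.1, 0.2, 0.0]) + 0.05 * np.random.randn(4)  # 线性损失梯度
    x = exponentiated_subgradient_step(x, g, eta=0.1 / np.sqrt(t + 1))
print("收敛到的混合权重:", np.round(x, 3))
```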

检测相关(3篇)

【1】 Two Stream Network for Stroke Detection in Table Tennis 标题:双流网络在乒乓球击球检测中的应用 链接:https://arxiv.org/abs/2112.12073

作者:Anam Zahra,Pierre-Etienne Martin 机构:CCP Department, Max Planck Institute for Evolutionary Anthropology, D-, Leipzig, Germany 备注:MediaEval 2021, Dec 2021, Online, Germany 摘要:本文提出了一种基于视频的乒乓球击球检测方法。该方法依赖于一个双流卷积神经网络,并行处理RGB流及其计算得到的光流。该方法是作为MediaEval 2021基准中体育任务的一部分开发的。我们的提交在测试集上没有超过官方提供的基线,但就mAP指标而言,在其他参与者中表现最好。 摘要:This paper presents a table tennis stroke detection method from videos. The method relies on a two-stream Convolutional Neural Network processing in parallel the RGB Stream and its computed optical flow. The method has been developed as part of the MediaEval 2021 benchmark for the Sport task. Our contribution did not outperform the provided baseline on the test set but has performed the best among the other participants with regard to the mAP metric.
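下面给出一个双流结构的极简 PyTorch 示意:两条卷积支路分别处理 RGB 帧与光流,再拼接后分类。网络规模、输入尺寸等均为演示假设,并非论文的实际结构。

```python
# 示意:并行处理 RGB 帧与光流的双流卷积网络(假设性简化结构)
import torch
import torch.nn as nn

def small_cnn(in_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten())

class TwoStreamNet(nn.Module):
    def __init__(self, n_classes=2):           # 例如: 击球 / 非击球
        super().__init__()
        self.rgb_stream  = small_cnn(3)         # RGB 帧支路
        self.flow_stream = small_cnn(2)         # 光流 (dx, dy) 支路
        self.head = nn.Linear(32 + 32, n_classes)

    def forward(self, rgb, flow):
        z = torch.cat([self.rgb_stream(rgb), self.flow_stream(flow)], dim=1)
        return self.head(z)

net = TwoStreamNet()
logits = net(torch.randn(4, 3, 112, 112), torch.randn(4, 2, 112, 112))
print(logits.shape)   # torch.Size([4, 2])
```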

【2】 Evaluating categorical encoding methods on a real credit card fraud detection database 标题:在真实信用卡欺诈检测数据库上评估分类编码方法 链接:https://arxiv.org/abs/2112.12024

作者:François de la Bourdonnaye,Fabrice Daniel 机构:Artificial Intelligence Department of Lusis, Paris, France 摘要:在有监督的学习环境中正确处理分类数据仍然是一个主要问题。此外,尽管一些机器学习方法包含了处理分类特征的内置方法,但不清楚它们是否带来了一些改进,以及它们与常用的分类编码方法相比如何。在本文中,我们描述了几种著名的基于目标统计和证据权重的分类编码方法。我们将其应用于大型真实的信用卡欺诈检测数据库。然后,我们使用最先进的梯度提升方法训练编码数据库,并评估其性能。我们表明,相对于没有编码,分类编码方法通常会带来实质性的改进。这项工作的贡献有两方面:(1)我们在大规模数据库上比较了许多最先进的“lite”分类编码方法;(2)我们使用了一个真实的信用卡欺诈检测数据库。 摘要:Correctly dealing with categorical data in a supervised learning context is still a major issue. Furthermore, though some machine learning methods embody builtin methods to deal with categorical features, it is unclear whether they bring some improvements and how do they compare with usual categorical encoding methods. In this paper, we describe several well-known categorical encoding methods that are based on target statistics and weight of evidence. We apply them on a large and real credit card fraud detection database. Then, we train the encoded databases using state-of-the-art gradient boosting methods and evaluate their performances. We show that categorical encoding methods generally bring substantial improvements with respect to the absence of encoding. The contribution of this work is twofold: (1) we compare many state-of-the-art "lite" categorical encoding methods on a large scale database and (2) we use a real credit card fraud detection database.
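下面用 pandas 给出目标统计编码与 WoE 编码的最小示意(玩具数据,平滑参数为示例取值,并非论文的完整流程):

```python
# 示意:基于目标统计的编码与证据权重(WoE)编码(假设性示例)
import numpy as np
import pandas as pd

df = pd.DataFrame({"merchant": ["A", "A", "B", "B", "B", "C"],
                   "fraud":    [1,   0,   0,   0,   1,   0]})

# 目标统计编码: 每个类别取带平滑的欺诈率,避免小类别过拟合
prior, m = df["fraud"].mean(), 10.0
stats = df.groupby("merchant")["fraud"].agg(["mean", "count"])
te = (stats["mean"] * stats["count"] + prior * m) / (stats["count"] + m)
df["merchant_te"] = df["merchant"].map(te)

# WoE 编码: log(该类别正例占比 / 负例占比),加 0.5 平滑
pos = df.groupby("merchant")["fraud"].sum() + 0.5
neg = df.groupby("merchant")["fraud"].apply(lambda s: (1 - s).sum()) + 0.5
woe = np.log((pos / pos.sum()) / (neg / neg.sum()))
df["merchant_woe"] = df["merchant"].map(woe)
print(df)
```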

【3】 AtteSTNet -- An attention and subword tokenization based approach for code-switched Hindi-English hate speech detection 标题:AtteSTNet--一种基于关注度和子词标记化的语码转换印英仇恨言语检测方法 链接:https://arxiv.org/abs/2112.11479

作者:Vedangi Wagh,Geet Shingi 机构:Pune Institute of Computer Technology, Maharashtra, India 摘要:最近的技术进步促进了社交媒体的使用,最终产生了大量用户生成的数据,其中也包括仇恨性和冒犯性言论。社交媒体中使用的语言通常是英语和当地语言的结合。在印度,印地语被广泛使用,并经常与英语进行语码转换,从而产生了Hinglish(印地语+英语)。过去,人们使用不同的机器学习和基于深度学习的技术对语码混合的Hinglish仇恨言论进行分类。然而,这些技术依赖递归或卷积机制,计算代价高且内存需求大。以往的技术还依赖复杂的数据处理,使得现有方法非常复杂,难以随数据变化而持续使用。我们提出了一种简单得多的方法:将BPE和Unigram等子词切分算法与基于多头注意力的技术相结合,不仅可与这些复杂网络相媲美,而且性能更优,在标准数据集上取得了87.41%的准确率和0.851的F1得分。有效使用BPE和Unigram算法有助于处理非常规的Hinglish词汇,使我们的技术简单、高效,并能在现实世界中持续使用。 摘要:Recent advancements in technology have led to a boost in social media usage which has ultimately led to large amounts of user-generated data which also includes hateful and offensive speech. The language used in social media is often a combination of English and the native language in the region. In India, Hindi is used predominantly and is often code-switched with English, giving rise to the Hinglish (Hindi+English) language. Various approaches have been made in the past to classify the code-mixed Hinglish hate speech using different machine learning and deep learning-based techniques. However, these techniques make use of recurrence on convolution mechanisms which are computationally expensive and have high memory requirements. Past techniques also make use of complex data processing making the existing techniques very complex and non-sustainable to change in data. We propose a much simpler approach which is not only at par with these complex networks but also exceeds performance with the use of subword tokenization algorithms like BPE and Unigram along with multi-head attention-based technique giving an accuracy of 87.41% and F1 score of 0.851 on standard datasets. Efficient use of BPE and Unigram algorithms help handle the non-conventional Hinglish vocabulary making our technique simple, efficient and sustainable to use in the real world.
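下面用 sentencepiece 演示在一个玩具 Hinglish 语料上分别训练 BPE 与 Unigram 子词切分模型;语料内容、词表大小等均为演示假设,并非论文所用数据与配置。

```python
# 示意:BPE 与 Unigram 子词切分处理非常规 Hinglish 词汇(假设性示例)
import sentencepiece as spm

lines = [
    "tum kya kar rahe ho yaar",
    "yeh movie toh full paisa vasool hai",
    "kal office mein meeting hai boss",
    "this song is ekdum mast",
    "mujhe yeh bakwaas bilkul pasand nahi",
    "chalo coffee peete hain after work",
]
with open("hinglish_corpus.txt", "w", encoding="utf-8") as f:   # 玩具语料
    f.write("\n".join(lines * 30))

for mtype in ("bpe", "unigram"):
    spm.SentencePieceTrainer.train(input="hinglish_corpus.txt",
                                   model_prefix=f"hinglish_{mtype}",
                                   vocab_size=100, model_type=mtype)
    sp = spm.SentencePieceProcessor(model_file=f"hinglish_{mtype}.model")
    print(mtype, sp.encode("yeh gaana bakwaas hai yaar", out_type=str))
```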

分类|识别(2篇)

【1】 Towards Malicious address identification in Bitcoin 标题:论比特币中的恶意地址识别 链接:https://arxiv.org/abs/2112.11721

作者:Deepesh Chaudhari,Rachit Agarwal,Sandeep Kumar Shukla 机构:CSE, IIT Kanpur, India 摘要:区块链交易的时间方面使我们能够研究地址的行为并检测其是否涉及任何非法活动。然而,由于更改地址(用于阻止重放攻击)这一概念的存在,时间方面并不能直接应用于比特币区块链。在利用这些时间方面之前,应执行若干预处理步骤。我们因此着手研究比特币交易网络,并利用突发性、吸引力、事件间时间等时间特征,以及节点度和聚类系数等基于图的属性,来验证已知适用于其他加密货币区块链的现有方法在比特币区块链上的适用性。我们生成时间和非时间特征集,并在不同的时间粒度上训练机器学习(ML)算法,以验证最新的方法。我们研究了地址在数据集的不同时间粒度上的行为。我们发现,在应用更改地址聚类后,在比特币中可以提取现有的时间特征,并且可以应用ML方法。对结果的对比分析表明,以太坊和比特币中的地址行为在入度、出度和事件间时间方面是相似的。此外,我们还识别出3个在不同时间粒度上表现出恶意行为的嫌疑地址。这些嫌疑地址在比特币中并未被标记为恶意。 摘要:The temporal aspect of blockchain transactions enables us to study the address's behavior and detect if it is involved in any illicit activity. However, due to the concept of change addresses (used to thwart replay attacks), temporal aspects are not directly applicable in the Bitcoin blockchain. Several pre-processing steps should be performed before such temporal aspects are utilized. We are motivated to study the Bitcoin transaction network and use the temporal features such as burst, attractiveness, and inter-event time along with several graph-based properties such as the degree of node and clustering coefficient to validate the applicability of already existing approaches known for other cryptocurrency blockchains on the Bitcoin blockchain. We generate the temporal and non-temporal feature set and train the Machine Learning (ML) algorithm over different temporal granularities to validate the state-of-the-art methods. We study the behavior of the addresses over different time granularities of the dataset. We identify that after applying change-address clustering, in Bitcoin, existing temporal features can be extracted and ML approaches can be applied. A comparative analysis of results show that the behavior of addresses in Ethereum and Bitcoin is similar with respect to in-degree, out-degree and inter-event time. Further, we identify 3 suspects that showed malicious behavior across different temporal granularities. These suspects are not marked as malicious in Bitcoin.
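下面的示意演示如何从地址的交易时间戳中提取事件间时间与突发性等时间特征并训练分类器;此处采用 Goh-Barabási 突发性度量 B=(σ-μ)/(σ+μ) 作为近似,数据与标签均为随机占位,仅用于说明流程:

```python
# 示意:地址交易时间戳 -> 时间特征 -> 分类器(假设性示例)
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def temporal_features(timestamps):
    ts = np.sort(np.asarray(timestamps, dtype=float))
    iet = np.diff(ts)                        # 事件间时间 (inter-event time)
    if len(iet) == 0:
        return [len(ts), 0.0, 0.0]
    mu, sigma = iet.mean(), iet.std()
    burst = (sigma - mu) / (sigma + mu) if (sigma + mu) > 0 else 0.0  # 突发性 B
    return [len(ts), mu, burst]

rng = np.random.default_rng(0)
X = np.array([temporal_features(np.cumsum(rng.exponential(scale, 50)))
              for scale in rng.uniform(1, 100, 200)])   # 模拟 200 个地址的历史
y = rng.integers(0, 2, 200)                  # 恶意/正常标签(随机占位)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("训练集准确率:", clf.score(X, y))
```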

【2】 An Attention Score Based Attacker for Black-box NLP Classifier 标题:一种基于注意力得分的黑盒NLP分类器攻击者 链接:https://arxiv.org/abs/2112.11660

作者:Yueyang Liu,Hunmin Lee,Zhipeng Cai 机构:Georgia State University, Atlanta, GA, USA 摘要:深度神经网络在解决各种现实任务方面有着广泛的应用,并在计算机视觉、图像分类和自然语言处理等领域取得了令人满意的结果。同时,神经网络的安全性和鲁棒性已成为当务之急,因为各种研究表明了神经网络的脆弱性。例如,在自然语言处理任务中,神经网络可能会被精心修改的文本所愚弄,这些文本与原始文本具有高度的相似性。根据以往的研究,大多数研究集中在图像领域;与图像对抗攻击不同,文本是以离散序列表示的,传统的图像攻击方法不适用于NLP领域。在本文中,我们提出了一个词级NLP情感分类器攻击模型,该模型包括基于自我注意机制的词选择方法和用于词替换的贪婪搜索算法。我们通过在IMDB数据集上攻击GRU和1D-CNN受害者模型来试验我们的攻击模型。实验结果表明,由于采用了高效的单词选择算法,并最小化了单词替换数,因此我们的模型比以前的方法获得了更高的攻击成功率和效率。此外,我们的模型是可转移的,经过多次修改后,可以在图像域中使用。 摘要:Deep neural networks have a wide range of applications in solving various real-world tasks and have achieved satisfactory results, in domains such as computer vision, image classification, and natural language processing. Meanwhile, the security and robustness of neural networks have become imperative, as diverse researches have shown the vulnerable aspects of neural networks. Case in point, in Natural language processing tasks, the neural network may be fooled by an attentively modified text, which has a high similarity to the original one. As per previous research, most of the studies are focused on the image domain; Different from image adversarial attacks, the text is represented in a discrete sequence, traditional image attack methods are not applicable in the NLP field. In this paper, we propose a word-level NLP sentiment classifier attack model, which includes a self-attention mechanism-based word selection method and a greedy search algorithm for word substitution. We experiment with our attack model by attacking GRU and 1D-CNN victim models on IMDB datasets. Experimental results demonstrate that our model achieves a higher attack success rate and more efficient than previous methods due to the efficient word selection algorithms are employed and minimized the word substitute number. Also, our model is transferable, which can be used in the image domain with several modifications.

表征(1篇)

【1】 Investigating Neighborhood Modeling and Asymmetry Preservation in Digraph Representation Learning 标题:有向图表示学习中的邻域建模与非对称性保持研究 链接:https://arxiv.org/abs/2112.11734

作者:Honglu Zhou,Advith Chegu,Samuel Sohn,Mubbasir Kapadia 机构:Department of Computer Science, Rutgers University, Piscataway, NJ, USA. 摘要:传统上,图神经网络(GNN)对于有向图(digraph)表现出较差的性能,这是因为在1)建模邻域和2)保持不对称性方面存在显著的挑战。在本文中,我们通过利用来自多阶与分区邻域的双曲协作学习,以及受社会心理因素启发的正则化器,解决了传统GNN中的这些挑战。由此得到的形式化方法,即有向图双曲网络(D-HYPR),在双曲空间中学习节点表示,以避免真实世界有向图的结构和语义失真。我们在4个任务上进行了综合实验:链接预测、节点分类、符号预测和嵌入可视化。在大多数任务和数据集上,D-HYPR在统计上显著优于当前技术水平,而在其余情况下也具有竞争力。我们的代码和数据将会公开。 摘要:Graph Neural Networks (GNNs) traditionally exhibit poor performance for directed graphs (digraphs) due to notable challenges in 1) modeling neighborhoods and 2) preserving asymmetry. In this paper, we address these challenges in traditional GNNs by leveraging hyperbolic collaborative learning from multi-ordered and partitioned neighborhoods, and regularizers inspired by socio-psychological factors. Our resulting formalism, Digraph Hyperbolic Network (D-HYPR) learns node representations in hyperbolic space to avoid structural and semantic distortion of real-world digraphs. We conduct comprehensive experimentation on 4 tasks: link prediction, node classification, sign prediction, and embedding visualization. D-HYPR statistically significantly outperforms the current state of the art on a majority of tasks and datasets, while achieving competitive performance otherwise. Our code and data will be available.

编码器(1篇)

【1】 List Autoencoder: Towards Deep Learning Based Reliable Transmission Over Noisy Channels 标题:列表自动编码器:基于深度学习的噪声信道可靠传输 链接:https://arxiv.org/abs/2112.11920

作者:Hamid Saber,Homayoon Hatami,Jung Hyun Bae 机构:SOC Lab, Samsung Semiconductor Inc., San Diego, CA, USA 备注:8 pages with references and 5 figures 摘要:近年来,人们对在自动编码器(AE)框架中自动化设计信道编码器和解码器、以在噪声信道上可靠传输数据越来越感兴趣。在本文中,我们为此提出了一个设计AE的新框架。特别地,我们提出了一个称为listAE的AE框架,其中解码器网络输出一个解码消息字候选列表。假设解码器输出端存在一个genie(精灵),并提出了特定的损失函数来优化genie辅助(GA)listAE的性能。listAE是一个通用的AE框架,可用于任何网络体系结构。我们提出了一种特定的端到端网络结构,该结构在一系列速率递减的分量码上对接收字进行解码。基于所提出架构的listAE称为增量冗余listAE(IR-listAE),在GA解码下,在低误块率时将最先进的AE性能提高1 dB。然后,我们使用循环冗余校验(CRC)码取代解码器端的genie,得到CRC辅助(CA)listAE,与GA-listAE相比性能损失可以忽略不计。CA-listAE显示出有意义的编码增益,代价是由于在消息字上附加CRC而使速率略有降低。 摘要:There has been a growing interest in automating the design of channel encoders and decoders in an auto-encoder(AE) framework in recent years for reliable transmission of data over noisy channels. In this paper we present a new framework for designing AEs for this purpose. In particular, we present an AE framework, namely listAE, in which the decoder network outputs a list of decoded message word candidates. A genie is assumed to be available at the output of the decoder and specific loss functions are proposed to optimize the performance of the genie-aided (GA)-listAE. The listAE is a general AE framework and can be used with any network architecture. We propose a specific end-to-end network architecture which decodes the received word on a sequence of component codes with decreasing rates. The listAE based on the proposed architecture, referred to as incremental redundancy listAE (IR-listAE), improves the state-of-the-art AE performance by 1 dB at low block error rates under GA decoding. We then employ cyclic redundancy check (CRC) codes to replace the genie at the decoder, giving CRC-aided (CA)-listAE with negligible performance loss compared to the GA-listAE. The CA-listAE shows meaningful coding gain at the price of a slight decrease in the rate due to appending CRC to the message word.
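用 CRC 取代 genie 的候选选择逻辑可以用如下极简示意理解:解码器给出候选列表后,返回第一个通过 CRC 校验的消息字。CRC 位宽、消息长度等均为演示假设:

```python
# 示意:在候选列表中用 CRC 校验取代 genie 选出消息字(假设性示例)
import zlib
import numpy as np

def attach_crc(msg_bits):
    crc = zlib.crc32(np.packbits(msg_bits).tobytes()) & 0xFF   # 截短为 8 位 CRC(示例)
    return np.concatenate([msg_bits, np.unpackbits(np.array([crc], np.uint8))])

def crc_select(candidates):
    # 返回第一个通过 CRC 校验的候选;都不通过则退回首个候选
    for c in candidates:
        msg, tail = c[:-8], c[-8:]
        crc = zlib.crc32(np.packbits(msg).tobytes()) & 0xFF
        if np.array_equal(tail, np.unpackbits(np.array([crc], np.uint8))):
            return msg
    return candidates[0][:-8]

rng = np.random.default_rng(1)
truth = attach_crc(rng.integers(0, 2, 32).astype(np.uint8))
wrong = truth.copy(); wrong[3] ^= 1                      # 含一位错误的候选
print((crc_select([wrong, truth]) == truth[:-8]).all())  # True: CRC 选中正确候选
```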

优化|敛散性(2篇)

【1】 Latent Space Simulation for Carbon Capture Design Optimization 标题:碳捕集设计优化的潜在空间模拟 链接:https://arxiv.org/abs/2112.11656

作者:Brian Bartoldson,Rui Wang,Yucheng Fu,David Widemann,Sam Nguyen,Jie Bao,Zhijie Xu,Brenda Ng 机构:Lawrence Livermore National Laboratory; ,UCSD; ,Pacific Northwest National Laboratory 备注:Extended version of a paper appearing in the Proceedings of the 34th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-22) 摘要:溶剂型碳捕集系统(CCS)的CO2捕集效率主要取决于气体-溶剂界面面积(IA),使IA的最大化成为CCS设计中的一个基本挑战。虽然与特定CCS设计相关的IA可通过计算流体力学(CFD)模拟进行估算,但使用CFD推导与众多CCS设计相关的IAs成本过高。幸运的是,以前的工作,如Deep Fluids(DF)(Kim等人,2019年)表明,通过将CFD模拟器替换为真实模拟CFD模拟过程的神经网络(NN)替代物,可以实现较大的模拟加速。这提高了快速、准确地替代CFD模拟器的可能性,从而有效地近似CCS设计优化所需的IAs。因此,在这里,我们以DF方法为基础,开发可成功应用于复杂碳捕获CFD模拟的替代物。我们优化的DF样式代理产生了较大的加速比(4000x),同时在训练配置范围内的未知CCS配置上获得了低至4%的IA相对误差。这暗示了神经网络代理在CCS设计优化问题中的应用前景。尽管如此,DF在CCS设计方面存在固有的局限性(例如,将经过训练的模型转移到新CCS包装的能力有限)。最后,我们提出了应对这些挑战的想法。 摘要:The CO2 capture efficiency in solvent-based carbon capture systems (CCSs) critically depends on the gas-solvent interfacial area (IA), making maximization of IA a foundational challenge in CCS design. While the IA associated with a particular CCS design can be estimated via a computational fluid dynamics (CFD) simulation, using CFD to derive the IAs associated with numerous CCS designs is prohibitively costly. Fortunately, previous works such as Deep Fluids (DF) (Kim et al., 2019) show that large simulation speedups are achievable by replacing CFD simulators with neural network (NN) surrogates that faithfully mimic the CFD simulation process. This raises the possibility of a fast, accurate replacement for a CFD simulator and therefore efficient approximation of the IAs required by CCS design optimization. Thus, here, we build on the DF approach to develop surrogates that can successfully be applied to our complex carbon-capture CFD simulations. Our optimized DF-style surrogates produce large speedups (4000x) while obtaining IA relative errors as low as 4% on unseen CCS configurations that lie within the range of training configurations. This hints at the promise of NN surrogates for our CCS design optimization problem. Nonetheless, DF has inherent limitations with respect to CCS design (e.g., limited transferability of trained models to new CCS packings). We conclude with ideas to address these challenges.

【2】 On Asymptotic Linear Convergence of Projected Gradient Descent for Constrained Least Squares 标题:关于约束最小二乘投影梯度下降的渐近线性收敛性 链接:https://arxiv.org/abs/2112.11760

作者:Trung Vu,Raviv Raich 机构:School of Electrical Engineering and Computer Science, Oregon State University 备注:16 pages 摘要:信号处理和机器学习中的许多新问题,如压缩感知、图像恢复、矩阵/张量恢复和非负矩阵分解,都可以归结为约束优化。投影梯度下降法是求解此类约束优化问题的一种简单而有效的方法。局部收敛分析进一步加深了我们对其在解附近渐近行为的理解,与全局收敛分析相比,它提供了更精确的收敛速度界。然而,局部保证常常零散地分布在机器学习和信号处理的特定问题领域中。本文提出了一个统一的框架,用于在约束最小二乘的背景下分析投影梯度下降的局部收敛性。所提出的分析揭示了关键的局部收敛特性,如线性收敛的条件、收敛区域、精确的渐近收敛速率,以及达到给定精度所需迭代次数的界。为了证明所提方法的适用性,我们给出了PGD收敛性分析的一套流程,并通过将其完整地应用于四个基本问题(线性约束最小二乘、稀疏恢复、单位范数约束最小二乘和矩阵补全)加以演示。 摘要:Many recent problems in signal processing and machine learning such as compressed sensing, image restoration, matrix/tensor recovery, and non-negative matrix factorization can be cast as constrained optimization. Projected gradient descent is a simple yet efficient method for solving such constrained optimization problems. Local convergence analysis furthers our understanding of its asymptotic behavior near the solution, offering sharper bounds on the convergence rate compared to global convergence analysis. However, local guarantees often appear scattered in problem-specific areas of machine learning and signal processing. This manuscript presents a unified framework for the local convergence analysis of projected gradient descent in the context of constrained least squares. The proposed analysis offers insights into pivotal local convergence properties such as the condition of linear convergence, the region of convergence, the exact asymptotic rate of convergence, and the bound on the number of iterations needed to reach a certain level of accuracy. To demonstrate the applicability of the proposed approach, we present a recipe for the convergence analysis of PGD and demonstrate it via a beginning-to-end application of the recipe on four fundamental problems, namely, linearly constrained least squares, sparse recovery, least squares with the unit norm constraint, and matrix completion.
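以非负约束最小二乘为例,下面的示意展示了投影梯度下降(PGD)及其渐近线性收敛行为:相邻迭代的目标函数间隙之比趋于一个小于 1 的常数。问题规模与步数均为演示假设:

```python
# 示意:非负约束最小二乘的投影梯度下降及其线性收敛(假设性示例)
import numpy as np

def pgd_nonneg_ls(A, b, steps=300):
    # min_x 0.5*||Ax - b||^2  s.t. x >= 0;投影为逐元素取非负部分
    L = np.linalg.norm(A, 2) ** 2            # 梯度的 Lipschitz 常数 ||A||_2^2
    x, fs = np.zeros(A.shape[1]), []
    for _ in range(steps):
        x = np.maximum(x - (A.T @ (A @ x - b)) / L, 0.0)
        fs.append(0.5 * np.linalg.norm(A @ x - b) ** 2)
    return x, np.array(fs)

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
b = A @ np.abs(rng.standard_normal(20)) + 0.01 * rng.standard_normal(100)
x, fs = pgd_nonneg_ls(A, b)
gap = fs - fs.min() + 1e-16                  # 目标函数间隙
ratios = gap[100:105] / gap[99:104]          # 相邻间隙比值
print("渐近收缩因子(应近似为常数 < 1):", np.round(ratios, 4))
```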

预测|估计(4篇)

【1】 Automatic Estimation of Anthropometric Human Body Measurements 标题:人体测量尺寸的自动估算 链接:https://arxiv.org/abs/2112.11992

作者:Dana Škorvánková,Adam Riečický,Martin Madaras 机构:Physics and Informatics, Comenius University Bratislava, Slovakia, Skeletex Research, Slovakia 摘要:考虑到人体分析对我们日常生活的潜在益处,在过去几十年中,与人体分析相关的研究任务在计算机视觉领域受到了广泛关注。人体测量学是一个定义人体大小、形态和功能能力的物理测量领域。具体而言,从可视人体数据准确估计人体测量值是一个具有挑战性的问题,其解决方案将便利许多不同的应用领域,包括人类工效学、服装制造等。本文在深度学习和神经网络领域展开研究,以应对从各种类型的视觉输入数据(如2D图像或3D点云)估计人体测量值的挑战。此外,我们还通过生成各种人体形状的合成数据集并执行骨架驱动的标注,来应对缺乏带有训练和评估所需真值人体测量标注的真实人体数据的问题。 摘要:Research tasks related to human body analysis have been drawing a lot of attention in computer vision area over the last few decades, considering its potential benefits on our day-to-day life. Anthropometry is a field defining physical measures of a human body size, form, and functional capacities. Specifically, the accurate estimation of anthropometric body measurements from visual human body data is one of the challenging problems, where the solution would ease many different areas of applications, including ergonomics, garment manufacturing, etc. This paper formulates a research in the field of deep learning and neural networks, to tackle the challenge of body measurements estimation from various types of visual input data (such as 2D images or 3D point clouds). Also, we deal with the lack of real human data annotated with ground truth body measurements required for training and evaluation, by generating a synthetic dataset of various human body shapes and performing a skeleton-driven annotation.

【2】 An Alternate Policy Gradient Estimator for Softmax Policies 标题:一种适用于softmax策略的替代策略梯度估计器 链接:https://arxiv.org/abs/2112.11622

作者:Shivam Garg,Samuele Tosatto,Yangchen Pan,Martha White,A. Rupam Mahmood 机构: 1University of Alberta, Alberta MachineIntelligence Institute (Amii) 摘要:softmax策略的策略梯度(PG)估计在次优饱和初始化时是无效的,这种情况发生在密度集中于次优动作时。次优策略饱和可能由错误的策略初始化或策略已经收敛后环境中发生的突然变化引起,softmax PG估计器需要大量更新才能恢复有效的策略。这一严重问题导致样本效率高,对新情况的适应性差。为了缓解这个问题,我们提出了一种新的softmax策略梯度估计器,该估计器利用临界估计中的偏差和奖励信号中存在的噪声来避开策略参数空间的饱和区域。我们在bandits和经典MDP基准测试任务上进行的分析和实验表明,我们的估计器对策略饱和更具鲁棒性。 摘要:Policy gradient (PG) estimators for softmax policies are ineffective with sub-optimally saturated initialization, which happens when the density concentrates on a sub-optimal action. Sub-optimal policy saturation may arise from bad policy initialization or sudden changes in the environment that occur after the policy has already converged, and softmax PG estimators require a large number of updates to recover an effective policy. This severe issue causes high sample inefficiency and poor adaptability to new situations. To mitigate this problem, we propose a novel policy gradient estimator for softmax policies that utilizes the bias in the critic estimate and the noise present in the reward signal to escape the saturated regions of the policy parameter space. Our analysis and experiments, conducted on bandits and classical MDP benchmarking tasks, show that our estimator is more robust to policy saturation.

【3】 Predicting treatment effects from observational studies using machine learning methods: A simulation study 标题:用机器学习方法预测观察性研究的治疗效果:一项模拟研究 链接:https://arxiv.org/abs/2112.12083

作者:Bevan I. Smith,Charles Chimedza 机构:School of Statistics and Actuarial Science, University of the Witwatersrand, Johannesburg 摘要:在观察性研究中测量处理效应具有挑战性,因为存在混杂偏差。当一个变量同时影响处理与结果时,就会发生混杂。倾向评分匹配等传统方法通过对混杂因素取条件来估计处理效应。最近的文献提出了使用机器学习预测观察性研究中的反事实、进而估计处理效应的新方法。然而,这些研究都应用于真实处理效应未知的真实世界数据。本研究旨在通过模拟两种主要情景(有混杂与无混杂)来考察这种反事实预测方法的有效性。每种情景又分别包括输入与输出数据之间的线性和非线性关系。模拟的关键在于我们生成了已知的真实因果效应。我们使用线性回归、套索回归和随机森林模型来预测反事实和处理效应,并将其与真实处理效应以及朴素(naive)处理效应估计进行比较。结果表明,影响这种机器学习方法表现的最重要因素是数据的非线性程度。令人惊讶的是,无论有无混杂,机器学习模型在线性数据集上都表现良好。然而,一旦引入非线性,模型的表现就非常差。因此,在本模拟研究的条件下,机器学习方法在线性条件下表现良好(即使存在混杂),但现阶段在引入非线性时不应信任该方法。 摘要:Measuring treatment effects in observational studies is challenging because of confounding bias. Confounding occurs when a variable affects both the treatment and the outcome. Traditional methods such as propensity score matching estimate treatment effects by conditioning on the confounders. Recent literature has presented new methods that use machine learning to predict the counterfactuals in observational studies which then allow for estimating treatment effects. These studies however, have been applied to real world data where the true treatment effects have not been known. This study aimed to study the effectiveness of this counterfactual prediction method by simulating two main scenarios: with and without confounding. Each type also included linear and non-linear relationships between input and output data. The key item in the simulations was that we generated known true causal effects. Linear regression, lasso regression and random forest models were used to predict the counterfactuals and treatment effects. These were compared with the true treatment effect as well as a naive treatment effect. The results show that the most important factor in whether this machine learning method performs well, is the degree of non-linearity in the data. Surprisingly, for both non-confounding and confounding, the machine learning models all performed well on the linear dataset. However, when non-linearity was introduced, the models performed very poorly. Therefore under the conditions of this simulation study, the machine learning method performs well under conditions of linearity, even if confounding is present, but at this stage should not be trusted when non-linearity is introduced.
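该模拟思路可以用如下极简示意复现:生成已知真实处理效应的混杂数据,用机器学习模型预测两种反事实,再与真实效应及朴素估计对比。数据生成过程与模型选择均为演示假设:

```python
# 示意:模拟混杂数据并用反事实预测估计处理效应(假设性示例)
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 5000
c = rng.normal(size=n)                          # 混杂变量: 同时影响处理与结果
t = (c + rng.normal(size=n) > 0).astype(int)    # 处理分配依赖混杂
tau = 2.0                                       # 已知的真实处理效应
y = 3 * c + tau * t + rng.normal(size=n)        # 线性结果模型

X = np.column_stack([c, t])
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
y1 = model.predict(np.column_stack([c, np.ones(n)]))    # 反事实: 全部受处理
y0 = model.predict(np.column_stack([c, np.zeros(n)]))   # 反事实: 全部未受处理
print("估计 ATE:", (y1 - y0).mean(), " 真实:", tau,
      " 朴素差值:", y[t == 1].mean() - y[t == 0].mean())  # 朴素估计因混杂而偏大
```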

【4】 Neural Echo State Network using oscillations of gas bubbles in water: Computational validation by Mackey-Glass time series forecasting 标题:利用水中气泡振荡的神经回声状态网络:Mackey-Glass时间序列预测的计算验证 链接:https://arxiv.org/abs/2112.11592

作者:Ivan S. Maksymov,Andrey Pototsky,Sergey A. Suslov 机构:Optical Sciences Centre, Swinburne University of Technology, Hawthorn, VIC , Australia, Department of Mathematics, Swinburne University of Technology, Hawthorn, Victoria , Australia 备注:5 pages, 3 figures 摘要:物理水库计算(RC)是一种计算框架,其中为数字计算机设计的机器学习算法使用类似模拟计算机的非线性物理系统执行,该系统可提供较高的计算能力,用于预测可使用非线性微分方程发现的时间相关量。在这里,我们提出了一种RC系统,该系统将水中振动气泡群的声学响应的非线性与标准回声状态网络(ESN)算法相结合,该算法非常适合预测非线性和混沌时间序列。我们通过证明所提出的RC系统能够以ESN的效率预测混沌Mackey-Glass时间序列,从而在计算上证实了该系统的合理性。 摘要:Physical reservoir computing (RC) is a computational framework, where machine learning algorithms designed for digital computers are executed using analog computer-like nonlinear physical systems that can provide high computational power for predicting time-dependent quantities that can be found using nonlinear differential equations. Here we suggest an RC system that combines the nonlinearity of an acoustic response of a cluster of oscillating gas bubbles in water with a standard Echo State Network (ESN) algorithm that is well-suited to forecast nonlinear and chaotic time series. We computationally confirm the plausibility of the proposed RC system by demonstrating its ability to forecast a chaotic Mackey-Glass time series with the efficiency of ESN.
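下面给出一个标准 ESN 预测 Mackey-Glass 序列的极简 NumPy 示意;库大小、谱半径、岭回归系数等均为常见的示例取值,并非论文中气泡储备池的实现:

```python
# 示意:回声状态网络(ESN)对 Mackey-Glass 序列做一步超前预测(假设性简化实现)
import numpy as np

def mackey_glass(n, tau=17, beta=0.2, gamma=0.1, dt=1.0):
    x = np.full(n + tau, 1.2)
    for t in range(tau, n + tau - 1):
        x[t + 1] = x[t] + dt * (beta * x[t - tau] / (1 + x[t - tau] ** 10) - gamma * x[t])
    return x[tau:]

rng = np.random.default_rng(0)
N, rho = 300, 0.9
W_in = rng.uniform(-0.5, 0.5, (N, 1))
W = rng.standard_normal((N, N))
W *= rho / np.max(np.abs(np.linalg.eigvals(W)))      # 缩放库矩阵谱半径

u = mackey_glass(3000)
states = np.zeros((len(u) - 1, N)); s = np.zeros(N)
for t in range(len(u) - 1):
    s = np.tanh(W @ s + W_in @ u[t:t + 1])           # 库状态更新
    states[t] = s

ridge = 1e-6                                          # 岭回归训练读出层
W_out = np.linalg.solve(states.T @ states + ridge * np.eye(N), states.T @ u[1:])
pred = states @ W_out
print("训练 NRMSE:", np.linalg.norm(pred - u[1:]) / np.linalg.norm(u[1:] - u[1:].mean()))
```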

其他神经网络|深度学习|模型|建模(12篇)

【1】 Deeper Learning with CoLU Activation 标题:利用CoLU激活进行更深层的学习 链接:https://arxiv.org/abs/2112.12078

作者:Advait Vagerwal 备注:7 pages, 4 figures, 4 tables 摘要:在神经网络中,非线性由激活函数引入。一种常用的激活函数是修正线性单元(ReLU)。ReLU一直是流行的激活函数选择,但也有缺陷。Swish和Mish等最先进的函数如今作为更好的选择受到关注,因为它们克服了其他激活函数的许多缺陷。CoLU是一种性质上与Swish和Mish类似的激活函数,定义为f(x)=x/(1-xe^{-(x+e^x)})。它光滑、连续可微、上无界、下有界、非饱和且非单调。基于对CoLU与不同激活函数进行的实验,可以观察到CoLU在较深的神经网络上通常比其他函数表现更好。在MNIST上以递增的卷积层数训练不同的神经网络时,CoLU在层数更多时保持了最高的精度。在具有8个卷积层的较小网络上,CoLU具有最高的平均精度,紧随其后的是ReLU。在Fashion-MNIST上训练的VGG-13上,CoLU的准确率比Mish高4.20%,比ReLU高3.31%。在Cifar-10上训练的ResNet-9上,CoLU的准确率比Swish高0.05%,比Mish高0.09%,比ReLU高0.29%。可以观察到,激活函数的相对表现取决于多种因素,包括层数、层类型、参数数量、学习率、优化器等;可以围绕这些因素和激活函数开展进一步研究,以获得更优的激活函数以及关于其行为的更多认识。 摘要:In neural networks, non-linearity is introduced by activation functions. One commonly used activation function is Rectified Linear Unit (ReLU). ReLU has been a popular choice as an activation but has flaws. State-of-the-art functions like Swish and Mish are now gaining attention as a better choice as they combat many flaws presented by other activation functions. CoLU is an activation function similar to Swish and Mish in properties. It is defined as f(x)=x/(1-xe^-(x+e^x)). It is smooth, continuously differentiable, unbounded above, bounded below, non-saturating, and non-monotonic. Based on experiments done with CoLU with different activation functions, it is observed that CoLU usually performs better than other functions on deeper neural networks. While training different neural networks on MNIST on an incrementally increasing number of convolutional layers, CoLU retained the highest accuracy for more layers. On a smaller network with 8 convolutional layers, CoLU had the highest mean accuracy, closely followed by ReLU. On VGG-13 trained on Fashion-MNIST, CoLU had a 4.20% higher accuracy than Mish and 3.31% higher accuracy than ReLU. On ResNet-9 trained on Cifar-10, CoLU had 0.05% higher accuracy than Swish, 0.09% higher accuracy than Mish, and 0.29% higher accuracy than ReLU. It is observed that activation functions may behave better than other activation functions based on different factors including the number of layers, types of layers, number of parameters, learning rate, optimizer, etc. Further research can be done on these factors and activation functions for more optimal activation functions and more knowledge on their behavior.
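按摘要给出的定义,CoLU 可以像 ReLU 一样作为即插即用模块使用;下面是一个假设性的 PyTorch 实现示意:

```python
# 示意:CoLU 激活函数 f(x) = x / (1 - x * e^{-(x + e^x)}) 的实现(假设性示例)
import torch
import torch.nn as nn

class CoLU(nn.Module):
    def forward(self, x):
        return x / (1 - x * torch.exp(-(x + torch.exp(x))))

act = CoLU()
x = torch.linspace(-3, 3, 7)
print(act(x))          # 光滑、下有界、上无界、非单调
# 像 ReLU 一样即插即用地放进网络:
net = nn.Sequential(nn.Linear(10, 32), CoLU(), nn.Linear(32, 1))
```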

【2】 Machine Learning for Computational Science and Engineering -- a brief introduction and some critical questions 标题:计算科学与工程中的机器学习--简介及若干关键问题 链接:https://arxiv.org/abs/2112.12054

作者:Chennakesava Kadapa 机构:University of Bolton, Bolton BL,AB, United Kingdom., Version ,.,. 备注:16 pages 摘要:人工智能(AI)正在进入科学、技术、工程、艺术和管理的每个子领域。在炒作和研究经费可得性的推动下,它正被许多领域不加深究地采用。计算科学与工程(CS&E)就是这样一个子领域。通过强调围绕将机器学习(ML)应用于CS&E的问题与挑战的一些关键问题(其中大多数在期刊论文中常被忽略),本文希望对ML在CS&E及相关领域的应用提供一些见解。这是一篇面向普通读者以及刚接触ML和/或CS&E领域的研究人员的通识性文章。这项工作只关注计算科学与工程中的正问题(forward problems)。文中还提供了一些基本方程和MATLAB代码,以帮助读者理解基础知识。 摘要:Artificial Intelligence (AI) is now entering every sub-field of science, technology, engineering, arts, and management. Thanks to the hype and availability of research funds, it is being adapted in many fields without much thought. Computational Science and Engineering (CS&E) is one such sub-field. By highlighting some critical questions around the issues and challenges in adapting Machine Learning (ML) for CS&E, most of which are often overlooked in journal papers, this contribution hopes to offer some insights into the adaptation of ML for applications in CS&E and related fields. This is a general-purpose article written for a general audience and researchers new to the fields of ML and/or CS&E. This work focuses only on the forward problems in computational science and engineering. Some basic equations and MATLAB code are also provided to help the reader understand the basics.

【3】 Continual learning of longitudinal health records 标题:纵向健康记录的持续学习 链接:https://arxiv.org/abs/2112.11944

作者:J. Armstrong,D. Clifton 机构:Institute of Biomedical Engineering, Oxford University 备注:15 pages, 5 figures 摘要:持续学习指的是能够适应新环境、同时保留并重用从过去经验中获得的知识的机器学习方法。这类方法解决了模型在非平稳环境中遇到的两个问题:无法泛化到新数据,以及重新训练时对先前知识的灾难性遗忘。这在临床环境中是一个普遍存在的问题:患者数据不仅在人群之间表现出协变量偏移,而且还随时间持续变化。然而,尽管持续学习方法在成像领域已初见成效,但它们很少被应用于危重监护患者记录所特有的多变量序列数据。在这里,我们在一系列有代表性的医疗场景中,针对纵向ICU数据评估了多种持续学习方法。我们发现,虽然有几种方法可以缓解短期遗忘,但在长任务序列中,域偏移仍然是一个具有挑战性的问题,只有基于回放的方法才能实现稳定的长期性能。复现所有实验的代码见 https://github.com/iacobo/continual 摘要:Continual learning denotes machine learning methods which can adapt to new environments while retaining and reusing knowledge gained from past experiences. Such methods address two issues encountered by models in non-stationary environments: ungeneralisability to new data, and the catastrophic forgetting of previous knowledge when retrained. This is a pervasive problem in clinical settings where patient data exhibits covariate shift not only between populations, but also continuously over time. However, while continual learning methods have seen nascent success in the imaging domain, they have been little applied to the multi-variate sequential data characteristic of critical care patient recordings. Here we evaluate a variety of continual learning methods on longitudinal ICU data in a series of representative healthcare scenarios. We find that while several methods mitigate short-term forgetting, domain shift remains a challenging problem over large series of tasks, with only replay based methods achieving stable long-term performance. Code for reproducing all experiments can be found at https://github.com/iacobo/continual
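文中表现最稳定的基于回放的方法,其核心可以用如下极简示意理解:维护一个水库采样的回放缓冲区,训练新任务时把旧任务样本混入批次(纯演示代码,与论文实现无关):

```python
# 示意:基于回放(replay)的持续学习中的水库采样缓冲区(假设性示例)
import random

class ReplayBuffer:
    def __init__(self, capacity=1000):
        self.capacity, self.data, self.seen = capacity, [], 0
    def add(self, sample):
        # 水库采样: 缓冲区对全部历史样本保持近似均匀分布
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = sample
    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

buf = ReplayBuffer()
for task in range(3):                    # 依次到来的任务(如不同时期的ICU数据)
    for i in range(500):
        buf.add((task, i))
    batch = [(task, i) for i in range(8)] + buf.sample(8)   # 新旧样本混合成批
print("缓冲区中各任务样本数:", {t: sum(1 for s in buf.data if s[0] == t) for t in range(3)})
```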

【4】 The Importance of the Current Input in Sequence Modeling 标题:当前输入在序列建模中的重要性 链接:https://arxiv.org/abs/2112.11776

作者:Christian Oliva,Luis F. Lago-Fernández 机构:Departamento de Ingeniería Informática, Universidad Autónoma de Madrid, Madrid, Spain 备注:11 pages, 2 appendix pages 摘要:序列建模的最新进展主要基于深度学习方法。目前的技术水平包括使用标准LSTM体系结构的变体,并结合一些技巧来提高所训练神经网络的最终预测率。然而,在某些情况下,这些调整可能过度拟合于所解决的特定问题。在本文中,我们展示了一个非常简单的想法:在输入和输出之间添加直接连接,跳过循环模块,从而在与自然语言处理相关的序列建模问题中提高预测精度。在不同问题上进行的实验表明,无论结构和训练的具体细节如何,将这种连接添加到循环网络总是能够改善结果。当这一思想被引入该领域的领先模型中时,由此产生的网络在语言建模问题上达到了新的最先进困惑度(perplexity)。 摘要:The last advances in sequence modeling are mainly based on deep learning approaches. The current state of the art involves the use of variations of the standard LSTM architecture, combined with several tricks that improve the final prediction rates of the trained neural networks. However, in some cases, these adaptations might be too much tuned to the particular problems being addressed. In this article, we show that a very simple idea, to add a direct connection between the input and the output, skipping the recurrent module, leads to an increase of the prediction accuracy in sequence modeling problems related to natural language processing. Experiments carried out on different problems show that the addition of this kind of connection to a recurrent network always improves the results, regardless of the architecture and training-specific details. When this idea is introduced into the models that lead the field, the resulting networks achieve a new state-of-the-art perplexity in language modeling problems.
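这一"输入直连输出、绕过循环模块"的想法可以用如下 PyTorch 示意表达(结构与超参数均为演示假设):

```python
# 示意:在 LSTM 之外增加当前输入到输出层的直连(假设性示例)
import torch
import torch.nn as nn

class LSTMWithInputSkip(nn.Module):
    def __init__(self, vocab, emb=128, hid=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True)
        self.out = nn.Linear(hid + emb, vocab)   # 拼接 LSTM 输出与当前输入嵌入

    def forward(self, tokens):
        e = self.embed(tokens)                       # (B, T, emb)
        h, _ = self.lstm(e)                          # (B, T, hid)
        return self.out(torch.cat([h, e], dim=-1))   # 当前输入直连输出,绕过循环模块

model = LSTMWithInputSkip(vocab=10000)
logits = model(torch.randint(0, 10000, (8, 35)))
print(logits.shape)    # torch.Size([8, 35, 10000])
```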

【5】 Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex Minimax Machine Learning 标题:非凸极大极小机器学习的加速近端交替梯度下降-上升算法 链接:https://arxiv.org/abs/2112.11663

作者:Ziyi Chen,Shaocong Ma,Yi Zhou 机构:Electrical & Computer Engineering, University of Utah, Salt Lake City, US 备注:12 pages, 1 figure. arXiv admin note: text overlap with arXiv:2102.04653 摘要:交替梯度下降-上升(AltGDA)算法是一种优化算法,广泛应用于各种机器学习应用中的模型训练,旨在解决非凸极大极小优化问题。然而,现有的研究表明,在非凸极大极小优化中,它具有很高的计算复杂度。在本文中,我们开发了一种单循环快速AltGDA型算法,该算法利用近端梯度更新和动量加速来解决正则化非凸极大极小优化问题。通过识别该算法的内在李雅普诺夫函数,我们证明了它收敛到非凸极大极小优化问题的一个临界点,并获得了计算复杂度$\mathcal{O}(\kappa^{1.5}\epsilon^{-2})$,其中$\epsilon$是所需的精度水平,$\kappa$是问题的条件数。这种计算复杂性提高了单循环GDA和AltGDA算法的最新复杂性(参见表1中的比较摘要)。通过对抗式深度学习的实验,我们证明了算法的有效性。 摘要:Alternating gradient-descent-ascent (AltGDA) is an optimization algorithm that has been widely used for model training in various machine learning applications, which aim to solve a nonconvex minimax optimization problem. However, the existing studies show that it suffers from a high computation complexity in nonconvex minimax optimization. In this paper, we develop a single-loop and fast AltGDA-type algorithm that leverages proximal gradient updates and momentum acceleration to solve regularized nonconvex minimax optimization problems. By identifying the intrinsic Lyapunov function of this algorithm, we prove that it converges to a critical point of the nonconvex minimax optimization problem and achieves a computation complexity $\mathcal{O}(\kappa^{1.5}\epsilon^{-2})$, where $\epsilon$ is the desired level of accuracy and $\kappa$ is the problem's condition number. Such a computation complexity improves the state-of-the-art complexities of single-loop GDA and AltGDA algorithms (see the summary of comparison in Table 1). We demonstrate the effectiveness of our algorithm via an experiment on adversarial deep learning.
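AltGDA 的"交替"更新结构可以在一个强凸-强凹的玩具问题上直观演示如下;为简洁起见,此示意省略了论文中的近端步与动量加速,步长为示例取值:

```python
# 示意:交替梯度下降-上升(AltGDA)求解强凸-强凹极小极大问题(假设性示例)
# f(x, y) = 0.5*x^2 + x*y - 0.5*y^2,唯一鞍点为 (0, 0)
x, y, eta = 1.0, -1.0, 0.05
for t in range(400):
    x = x - eta * (x + y)      # 先对 x 做下降: ∂f/∂x = x + y
    y = y + eta * (x - y)      # 再用更新后的 x 对 y 做上升: ∂f/∂y = x - y
print(f"x = {x:.2e}, y = {y:.2e}")   # 收敛到鞍点 (0, 0) 附近
```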

【6】 A Convergent ADMM Framework for Efficient Neural Network Training 标题:一种高效神经网络训练的收敛ADMM框架 链接:https://arxiv.org/abs/2112.11619

作者:Junxiang Wang,Hongyi Li,Liang Zhao 机构:Emory University; Xidian University 备注:This work is in progress, a journal extension of the conference paper: arXiv:1905.13611 摘要:作为一个著名的优化框架,交替方向乘子法(ADMM)在许多分类和回归应用中取得了巨大的成功。最近,它引起了深度学习研究者的注意,被认为是梯度下降(GD)的潜在替代品。然而,作为一个新兴领域,一些挑战仍未解决,包括1)缺乏全局收敛保证,2)收敛速度缓慢,以及3)特征维度的立方时间复杂性。在本文中,我们提出了一个新的优化框架,通过ADMM(dlADMM)解决一般神经网络训练问题,以同时解决这些挑战。具体地说,每个层中的参数先向后再向前更新,以便有效地交换每个层中的参数信息。当dlADMM应用于特定的体系结构时,通过使用二次近似和回溯技术的专用算法设计,子问题的时间复杂度从三次降低到二次。最后,我们给出了ADMM型方法(dlADMM)在温和条件下次线性收敛到临界点的第一个证明。在七个基准数据集上的实验证明了我们提出的dlADMM算法的收敛性、效率和有效性。 摘要:As a well-known optimization framework, the Alternating Direction Method of Multipliers (ADMM) has achieved tremendous success in many classification and regression applications. Recently, it has attracted the attention of deep learning researchers and is considered to be a potential substitute to Gradient Descent (GD). However, as an emerging domain, several challenges remain unsolved, including 1) The lack of global convergence guarantees, 2) Slow convergence towards solutions, and 3) Cubic time complexity with regard to feature dimensions. In this paper, we propose a novel optimization framework to solve a general neural network training problem via ADMM (dlADMM) to address these challenges simultaneously. Specifically, the parameters in each layer are updated backward and then forward so that parameter information in each layer is exchanged efficiently. When the dlADMM is applied to specific architectures, the time complexity of subproblems is reduced from cubic to quadratic via a dedicated algorithm design utilizing quadratic approximations and backtracking techniques. Last but not least, we provide the first proof of convergence to a critical point sublinearly for an ADMM-type method (dlADMM) under mild conditions. Experiments on seven benchmark datasets demonstrate the convergence, efficiency, and effectiveness of our proposed dlADMM algorithm.

【7】 Identifying Mixtures of Bayesian Network Distributions 标题:贝叶斯网络分布的混合识别 链接:https://arxiv.org/abs/2112.11602

作者:Spencer L. Gordon,Bijan Mazaheri,Yuval Rabani,Leonard J. Schulman 摘要:贝叶斯网络是一组$n$随机变量(用顶点标识)上的有向无环图(DAG);贝叶斯网络分布(BND)是rv上的概率分布,在图上是马尔可夫分布。此类模型的有限混合是BND在较大图上的这些变量上的投影,该图具有额外的“隐藏”(或“潜在”)随机变量$U$,范围为$\{1、\ldots、k\}$,以及从$U$到每个其他顶点的有向边。这种类型的模型是因果推理研究的基础,其中$U$模型是一种混杂效应。理论文献中有一个非常特殊的例子:空图。这样的分布只是$k$产品分布的混合。一个长期存在的问题是,给定$k$乘积分布的混合分布,确定每个乘积分布及其混合权重。我们的结果是:(1)我们改进了从$\exp(O(k^2))$到$\exp(O(k\log k))$识别$k$产品分布混合的样本复杂性(和运行时)。考虑到已知的$\exp(\Omega(k))$下限,这几乎是最好的选择。(2) 我们给出了非空图情况下的第一个算法。最大度为$\Delta$的图的复杂度为$\exp(O(k(\Delta^2+\log k)))$。(上述复杂性是近似的,不依赖于次要参数。) 摘要:A Bayesian Network is a directed acyclic graph (DAG) on a set of $n$ random variables (identified with the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the rv's that is Markovian on the graph. A finite mixture of such models is the projection on these variables of a BND on the larger graph which has an additional "hidden" (or "latent") random variable $U$, ranging in $\{1,\ldots,k\}$, and a directed edge from $U$ to every other vertex. Models of this type are fundamental to research in Causal Inference, where $U$ models a confounding effect. One extremely special case has been of longstanding interest in the theory literature: the empty graph. Such a distribution is simply a mixture of $k$ product distributions. A longstanding problem has been, given the joint distribution of a mixture of $k$ product distributions, to identify each of the product distributions, and their mixture weights. Our results are: (1) We improve the sample complexity (and runtime) for identifying mixtures of $k$ product distributions from $\exp(O(k^2))$ to $\exp(O(k \log k))$. This is almost best possible in view of a known $\exp(\Omega(k))$ lower bound. (2) We give the first algorithm for the case of non-empty graphs. The complexity for a graph of maximum degree $\Delta$ is $\exp(O(k(\Delta^2 + \log k)))$. (The above complexities are approximate and suppress dependence on secondary parameters.)

【8】 Learning Positional Embeddings for Coordinate-MLPs 标题:学习坐标-MLP的位置嵌入 链接:https://arxiv.org/abs/2112.11577

作者:Sameera Ramasinghe,Simon Lucey 机构:Australian Institute for Machine Learning, University of Adelaide 摘要:我们提出了一种通过学习特定于实例的位置嵌入来提高坐标MLP性能的新方法。位置嵌入参数和网络权值的端到端优化导致泛化性能较差。相反,我们开发了一个通用框架来学习基于经典图拉普拉斯正则化的位置嵌入,它可以隐式地平衡记忆和泛化之间的平衡。然后利用该框架提出了一种新的位置嵌入方案,在该方案中,每个坐标(即实例)学习超参数以提供最佳性能。结果表明,与已有的随机傅立叶特征(RFF)相比,该嵌入方法具有更好的性能和更高的稳定性。此外,我们证明了所提出的嵌入方案产生稳定的梯度,能够无缝集成到作为中间层的深层体系结构中。 摘要:We propose a novel method to enhance the performance of coordinate-MLPs by learning instance-specific positional embeddings. End-to-end optimization of positional embedding parameters along with network weights leads to poor generalization performance. Instead, we develop a generic framework to learn the positional embedding based on the classic graph-Laplacian regularization, which can implicitly balance the trade-off between memorization and generalization. This framework is then used to propose a novel positional embedding scheme, where the hyperparameters are learned per coordinate (i.e, instance) to deliver optimal performance. We show that the proposed embedding achieves better performance with higher stability compared to the well-established random Fourier features (RFF). Further, we demonstrate that the proposed embedding scheme yields stable gradients, enabling seamless integration into deep architectures as intermediate layers.
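作为论文所提可学习嵌入的常见对照,随机傅里叶特征(RFF)位置嵌入可以写成如下示意(频率尺度 sigma 等为示例取值):

```python
# 示意:坐标-MLP 常用的随机傅里叶特征(RFF)位置嵌入(假设性示例)
import numpy as np

def rff_embed(coords, n_feats=256, sigma=10.0, seed=0):
    # coords: (N, d) 归一化坐标;返回 (N, 2*n_feats) 的嵌入
    rng = np.random.default_rng(seed)
    B = rng.normal(0.0, sigma, size=(coords.shape[1], n_feats))  # 随机频率
    proj = 2 * np.pi * coords @ B
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=1)

xy = np.stack(np.meshgrid(np.linspace(0, 1, 32), np.linspace(0, 1, 32)), -1).reshape(-1, 2)
print(rff_embed(xy).shape)   # (1024, 512)
```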

【9】 On the Compression of Natural Language Models 标题:论自然语言模型的压缩 链接:https://arxiv.org/abs/2112.11480

作者:Saeed Damadi 机构:Department of Computer Science and Electrical Engineering Department, University of Maryland, Baltimore County 摘要:深度神经网络是有效的特征提取器,但对于部署场景来说,它们的规模太大了。由于参数数量巨大,不同层参数的可解释性并不直接。这就是为什么神经网络有时被认为是黑箱。虽然简单的模型更容易解释,但找到它们并不容易。若能找到一个稀疏网络,它可以从零开始拟合数据,这将有助于解释神经网络的参数。为此,彩票假设指出,典型的稠密神经网络包含一个小的稀疏子网络,该子网络可以经过训练,在相同的步骤数内达到相似的测试精度。这项工作的目标是评估自然语言模型(NLM)是否存在这样一个可训练的子网络。为了实现这一目标,我们将回顾最先进的压缩技术,如量化、知识提取和剪枝。 摘要:Deep neural networks are effective feature extractors but they are prohibitively large for deployment scenarios. Due to the huge number of parameters, interpretability of parameters in different layers is not straight-forward. This is why neural networks are sometimes considered black boxes. Although simpler models are easier to explain, finding them is not easy. If found, a sparse network that can fit to a data from scratch would help to interpret parameters of a neural network. To this end, lottery ticket hypothesis states that typical dense neural networks contain a small sparse sub-network that can be trained to a reach similar test accuracy in an equal number of steps. The goal of this work is to assess whether such a trainable subnetwork exists for natural language models (NLM)s. To achieve this goal we will review state-of-the-art compression techniques such as quantization, knowledge distillation, and pruning.
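将要回顾的剪枝技术中,最常见的幅值剪枝(也是寻找彩票子网络的常用起点)可以用 PyTorch 的剪枝工具示意如下(网络与剪枝比例均为演示假设):

```python
# 示意:L1 幅值剪枝得到稀疏子网络(假设性示例)
import torch.nn as nn
import torch.nn.utils.prune as prune

net = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
for m in net:
    if isinstance(m, nn.Linear):
        prune.l1_unstructured(m, name="weight", amount=0.8)  # 剪掉 80% 最小幅值权重

total = sum(m.weight_mask.numel() for m in net if hasattr(m, "weight_mask"))
alive = sum(int(m.weight_mask.sum()) for m in net if hasattr(m, "weight_mask"))
print(f"剩余参数比例: {alive / total:.2f}")
```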

【10】 Machine learning nonequilibrium electron forces for adiabatic spin dynamics 标题:绝热自旋动力学的机器学习非平衡电子力 链接:https://arxiv.org/abs/2112.12124

作者:Puhan Zhang,Gia-Wei Chern 机构:Department of Physics, University of Virginia, Charlottesville, VA , USA 备注:6 pages, 5 figures 摘要:我们提出了朗道-利夫希兹方程非平衡力矩的广义势理论。用两个势能表示交换力的一般公式允许实现非平衡巡回磁系统绝热自旋动力学的精确机器学习模型。为了证明我们的方法,我们开发了一个深度学习神经网络,成功地学习了由非平衡格林函数方法计算的驱动s-d模型中的力。我们表明,使用神经网络模型预测的力进行Landau-Lifshitz动力学模拟可以准确地再现电压驱动的畴壁传播。我们的工作为基于机器学习模型的巡回磁铁和自旋电子学中非平衡动力学现象的多尺度建模开辟了一条新途径。 摘要:We present a generalized potential theory of nonequilibrium torques for the Landau-Lifshitz equation. The general formulation of exchange forces in terms of two potential energies allows for the implementation of accurate machine learning models for adiabatic spin dynamics of out-of-equilibrium itinerant magnetic systems. To demonstrate our approach, we develop a deep-learning neural network that successfully learns the forces in a driven s-d model computed from the nonequilibrium Green's function method. We show that the Landau-Lifshitz dynamics simulations with forces predicted from the neural-net model accurately reproduce the voltage-driven domain-wall propagation. Our work opens a new avenue for multi-scale modeling of nonequilibrium dynamical phenomena in itinerant magnets and spintronics based on machine-learning models.

【11】 Robust learning of data anomalies with analytically-solvable entropic outlier sparsification 标题:基于解析可解的熵离群点稀疏化的数据异常鲁棒学习 链接:https://arxiv.org/abs/2112.11768

作者:Illia Horenko 机构:Università della Svizzera italiana (USI), Institute of Computing, Via, G. Buffi , TI-, Lugano, Switzerland 备注:9 pages, 1 figure 摘要:本文提出熵离群点稀疏化(EOS),作为在一大类学习方法中检测数据异常的稳健计算策略,既适用于无监督问题(如在以高斯为主的数据中检测非高斯离群点),也适用于带错误标签数据的监督学习。EOS 的核心是香农熵正则化下(加权)期望误差最小化问题的解析闭式解。与计算代价随数据维数呈多项式增长的常见正则化策略不同,所得闭式解带来的额外迭代代价被证明仅随样本量线性增长,且与数据维数无关。所得解析结果还解释了为什么许多流行数据分析算法中启发式使用的球对称高斯混合,在使用平方欧几里德距离时是非参数概率分布的最优选择:它同时兼具期望误差最小性、最大熵/无偏性与线性代价缩放。我们在合成问题以及生物医学中部分错误标记的监督分类问题上,将 EOS 的性能与一系列常用工具进行了比较。 摘要:Entropic Outlier Sparsification (EOS) is proposed as a robust computational strategy for the detection of data anomalies in a broad class of learning methods, including the unsupervised problems (like detection of non-Gaussian outliers in mostly-Gaussian data) and in the supervised learning with mislabeled data. EOS dwells on the derived analytic closed-form solution of the (weighted) expected error minimization problem subject to the Shannon entropy regularization. In contrast to common regularization strategies requiring computational costs that scale polynomially with the data dimension, the identified closed-form solution is proven to impose additional iteration costs that depend linearly on statistics size and are independent of data dimension. Obtained analytic results also explain why the mixtures of spherically-symmetric Gaussians - used heuristically in many popular data analysis algorithms - represent an optimal choice for the non-parametric probability distributions when working with squared Euclidean distances, combining expected error minimality, maximal entropy/unbiasedness, and a linear cost scaling. The performance of EOS is compared to a range of commonly-used tools on synthetic problems and on partially-mislabeled supervised classification problems from biomedicine.
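作为示意:在香农熵正则化下最小化加权期望误差,其解具有 softmax 闭式形式。下面的极简实现只演示这一通用结论,并非对原文 EOS 算法的复现:

```python
import numpy as np

def entropic_weights(errors: np.ndarray, alpha: float) -> np.ndarray:
    """求解 min_w Σ w_i ε_i + α Σ w_i log w_i, s.t. Σ w_i = 1。
    由拉格朗日条件可得闭式解 w_i ∝ exp(-ε_i / α)。"""
    z = -errors / alpha
    z -= z.max()                        # 数值稳定
    w = np.exp(z)
    return w / w.sum()

errors = np.array([0.10, 0.20, 0.15, 5.00])    # 最后一个样本疑似离群点
print(entropic_weights(errors, alpha=0.5))     # 离群点的权重被自动压低
```

权重的计算只需对每个样本做一次指数运算与归一化,这与摘要中"额外代价随样本量线性增长、与数据维数无关"的性质一致。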

【12】 Noise-injected analog Ising machines enable ultrafast statistical sampling and machine learning 标题:注入噪声的模拟伊辛机实现超快统计采样和机器学习 链接:https://arxiv.org/abs/2112.11534

作者:Fabian Böhm,Diego Alonso-Urquijo,Guy Verschaffelt,Guy Van der Sande 机构:Applied Physics Research Group, Vrije Universiteit Brussel, Pleinlaan , Brussels, Belgium 摘要:伊辛机是一种很有前途的非冯·诺依曼计算概念,可用于神经网络训练和组合优化。然而,尽管各种神经网络都可以用伊辛机实现,但与数字计算机相比,伊辛机无法执行快速统计采样,这使其在训练这些神经网络时效率低下。在这里,我们介绍了一个通用概念:通过注入模拟噪声,使伊辛机实现超快统计采样。借助一台光电伊辛机,我们证明了该方法可用于精确采样玻尔兹曼分布以及神经网络的无监督训练,精度与基于软件的训练相当。通过仿真我们发现,伊辛机执行统计采样的速度可以比基于软件的方法快若干个数量级。这使伊辛机成为机器学习以及组合优化之外其他应用的高效工具。 摘要:Ising machines are a promising non-von-Neumann computational concept for neural network training and combinatorial optimization. However, while various neural networks can be implemented with Ising machines, their inability to perform fast statistical sampling makes them inefficient for training these neural networks compared to digital computers. Here, we introduce a universal concept to achieve ultrafast statistical sampling with Ising machines by injecting analog noise. With an opto-electronic Ising machine, we demonstrate that this can be used for accurate sampling of Boltzmann distributions and unsupervised training of neural networks, with equal accuracy as software-based training. Through simulations, we find that Ising machines can perform statistical sampling orders-of-magnitudes faster than software-based methods. This makes Ising machines into efficient tools for machine learning and other applications beyond combinatorial optimization.
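作为示意:在软件中对伊辛模型做玻尔兹曼分布采样,常用做法是带随机性的 Gibbs 采样。下面的极简实现仅用于说明"噪声驱动统计采样"的思想,与文中的光电硬件实现无关:

```python
import numpy as np

def gibbs_ising_sample(J, h, beta, n_sweeps, rng):
    """对能量 E(s) = -0.5 sᵀJs - hᵀs 的伊辛模型做 Gibbs 采样。"""
    n = len(h)
    s = rng.choice([-1.0, 1.0], size=n)
    for _ in range(n_sweeps):
        for i in range(n):
            local_field = J[i] @ s + h[i]       # J 对角线为零
            p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * local_field))
            s[i] = 1.0 if rng.random() < p_up else -1.0   # 随机性(噪声)在此注入
    return s

rng = np.random.default_rng(0)
n = 8
J = rng.normal(scale=0.5, size=(n, n))
J = (J + J.T) / 2
np.fill_diagonal(J, 0)
h = rng.normal(size=n)
print(gibbs_ising_sample(J, h, beta=1.0, n_sweeps=200, rng=rng))
```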

其他(15篇)

【1】 Spatio-Temporal CNN baseline method for the Sports Video Task of MediaEval 2021 benchmark 标题:面向MediaEval 2021基准体育视频任务的时空CNN基线方法 链接:https://arxiv.org/abs/2112.12074

作者:Pierre-Etienne Martin 机构:CCP Department, Max Planck Institute for Evolutionary Anthropology, D-, Leipzig, Germany 备注:None 摘要:本文介绍了为MediaEval 2021基准体育视频任务提出的基线方法。该任务包含击球检测与击球分类两个子任务,此基线同时处理这两个子任务。模型的时空CNN结构和训练过程根据所处理的子任务进行定制。该方法旨在帮助参与者解决任务,并不以达到最先进性能为目标。尽管如此,在检测子任务上,该基线的表现仍好于其他参与者,这也凸显了此类任务的难度。 摘要:This paper presents the baseline method proposed for the Sports Video task part of the MediaEval 2021 benchmark. This task proposes a stroke detection and a stroke classification subtask. This baseline addresses both subtasks. The spatio-temporal CNN architecture and the training process of the model are tailored according to the addressed subtask. The method has the purpose of helping the participants to solve the task and is not meant to reach state-of-the-art performance. Still, for the detection task, the baseline is performing better than the other participants, which stresses the difficulty of such a task.
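作为示意:这类基线的骨架通常是"3D 卷积 + 池化 + 全连接 + SoftMax"。下面用 PyTorch 给出一个极简版本,层数与超参数均为演示用假设,并非原文模型:

```python
import torch
import torch.nn as nn

class TinySpatioTemporalCNN(nn.Module):
    """在 (C, T, H, W) 视频张量上做时空卷积的最小分类骨架。"""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):                   # x: (N, 3, T, H, W)
        z = self.features(x).flatten(1)
        return self.classifier(z)           # 训练时交叉熵损失内部含 SoftMax

model = TinySpatioTemporalCNN(num_classes=21)
clip = torch.randn(2, 3, 16, 64, 64)        # 两段 16 帧、64x64 的视频片段
print(model(clip).shape)                    # torch.Size([2, 21])
```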

【2】 SOLIS -- The MLOps journey from data acquisition to actionable insights 标题:SOLIS--MLOps从数据采集到可操作洞察之旅 链接:https://arxiv.org/abs/2112.11925

作者:Razvan Ciobanu,Alexandru Purdila,Laurentiu Piciu,Andrei Damian 机构:Lummetry.AI, Bucharest 摘要:机器学习运维(MLOps)无疑非常重要,也是近来人工智能领域最热门的话题之一。为机器学习模型能够解决的实际问题定义清晰的假设,收集和整理大量数据用于模型训练和验证,然后进行模型架构搜索和实际优化,最后呈现结果,这非常符合数据科学实验的场景。然而,这种方法并没有提供在实际生产级系统中部署机器学习能力所需的流程和管道。自动化的实时配置机制,动态适应实时或离线的数据采集与消费,在边缘或云架构上并行服务多个模型,应对GPU内存或算力的特定限制,对推断或预测结果进行后处理,并在同一端到端管道中以API或基于物联网的通信栈对外提供服务,这些才是我们在本文中试图解决的真正挑战。在本文中,我们提出了一种统一的部署管道和操作自由(freedom-to-operate)方法,在使用基本的跨平台张量框架和脚本语言引擎的同时满足上述所有需求。 摘要:Machine Learning operations is unarguably a very important and also one of the hottest topics in Artificial Intelligence lately. Being able to define very clear hypotheses for actual real-life problems that can be addressed by machine learning models, collecting and curating large amounts of data for model training and validation followed by model architecture search and actual optimization and finally presenting the results fits very well the scenario of Data Science experiments. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems. Automating live configuration mechanisms, on the fly adapting to live or offline data capture and consumption, serving multiple models in parallel either on edge or cloud architectures, addressing specific limitations of GPU memory or compute power, post-processing inference or prediction results and serving those either as APIs or with IoT based communication stacks in the same end-to-end pipeline are the real challenges that we try to address in this particular paper. In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all above requirements while using basic cross-platform tensor framework and script language engines.

【3】 End to End Software Engineering Research 标题:端到端软件工程研究 链接:https://arxiv.org/abs/2112.11858

作者:Idan Amit 机构:Department of Computer Science, Hebrew University of Jerusalem and Acumen Labs 摘要:端到端学习是指从原始数据出发、自动完成所有步骤、直接预测目标概念的机器学习。在软件工程背景下,我们将其视为从源代码出发预测过程度量。该框架可用于预测缺陷、代码质量、生产率等。端到端学习无需领域专家,且能够提取新知识,从而改进了基于特征的机器学习。我们描述了为此目标构建的数据集,包含来自15k个项目的500万个文件。数据集的构建方式不仅支持预测概念,还支持探究其成因。 摘要:End to end learning is machine learning starting in raw data and predicting a desired concept, with all steps done automatically. In software engineering context, we see it as starting from the source code and predicting process metrics. This framework can be used for predicting defects, code quality, productivity and more. End-to-end improves over features based machine learning by not requiring domain experts and being able to extract new knowledge. We describe a dataset of 5M files from 15k projects constructed for this goal. The dataset is constructed in a way that enables not only predicting concepts but also investigating their causes.

【4】 Decentralized Task Offloading in Edge Computing: A Multi-User Multi-Armed Bandit Approach 标题:边缘计算中的去中心化任务卸载:一种多用户多臂bandit方法 链接:https://arxiv.org/abs/2112.11818

作者:Xiong Wang,Jiancheng Ye,John C. S. Lui 机构:∗ National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China 备注:INFOCOM 2022 摘要:移动边缘计算使用户可以将计算任务卸载到边缘服务器,以满足严格的延迟要求。以往工作主要研究系统侧信息(如服务器处理速度、蜂窝数据速率)已知时的任务卸载,或系统不确定性下的集中式卸载,但二者通常都难以处理动态且不确定环境中多个共存用户的任务放置问题。在本文中,我们开发了一个考虑未知但随机的系统侧信息的多用户卸载框架,以实现去中心化、由用户发起的服务放置。具体地说,我们将动态任务放置建模为在线多用户多臂bandit过程,并提出了一种基于轮次的去中心化卸载算法(DEBO),在网络延迟约束下优化用户奖励。我们证明了DEBO可以推导出最优的用户-服务器分配,从而实现接近最优的服务性能和紧致的O(log T)卸载遗憾。此外,我们将DEBO推广到各种常见场景,如未知的奖励差距、客户端的动态加入或离开以及公平的奖励分配,并进一步探讨用户卸载的任务需要异构计算资源的情形。特别地,我们在上述每种情形下都取得了次线性遗憾。基于真实测量的评估证实,我们的卸载方案在优化延迟敏感奖励方面优于最先进的方法。 摘要:Mobile edge computing facilitates users to offload computation tasks to edge servers for meeting their stringent delay requirements. Previous works mainly explore task offloading when system-side information is given (e.g., server processing speed, cellular data rate), or centralized offloading under system uncertainty. But both generally fall short to handle task placement involving many coexisting users in a dynamic and uncertain environment. In this paper, we develop a multi-user offloading framework considering unknown yet stochastic system-side information to enable a decentralized user-initiated service placement. Specifically, we formulate the dynamic task placement as an online multi-user multi-armed bandit process, and propose a decentralized epoch based offloading (DEBO) to optimize user rewards which are subjected under network delay. We show that DEBO can deduce the optimal user-server assignment, thereby achieving a close-to-optimal service performance and tight O(log T) offloading regret. Moreover, we generalize DEBO to various common scenarios such as unknown reward gap, dynamic entering or leaving of clients, and fair reward distribution, while further exploring when users' offloaded tasks require heterogeneous computing resources. Particularly, we accomplish a sub-linear regret for each of these instances. Real measurements based evaluations corroborate the superiority of our offloading schemes over state-of-the-art approaches in optimizing delay-sensitive rewards.
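作为示意:这类在线卸载决策的基本构件是 bandit 式的"探索与利用"权衡。下面用 NumPy 演示经典的 UCB1 服务器选择(奖励取负延迟;场景与数值均为假设,并非 DEBO 算法本身):

```python
import numpy as np

def ucb_select(rewards_sum, counts, t):
    """UCB1:在经验平均奖励上加置信半径,未尝试过的臂优先。"""
    untried = np.where(counts == 0)[0]
    if len(untried) > 0:
        return int(untried[0])
    means = rewards_sum / counts
    bonus = np.sqrt(2.0 * np.log(t) / counts)
    return int(np.argmax(means + bonus))

rng = np.random.default_rng(0)
true_latency = np.array([0.8, 0.5, 0.9])        # 三台边缘服务器的期望延迟
rewards_sum = np.zeros(3)
counts = np.zeros(3)
for t in range(1, 2001):
    a = ucb_select(rewards_sum, counts, t)
    reward = -true_latency[a] + rng.normal(scale=0.1)   # 奖励 = 负延迟 + 噪声
    rewards_sum[a] += reward
    counts[a] += 1
print(counts)    # 绝大多数请求被分配给延迟最低的 1 号服务器
```

UCB1 的累计遗憾为 O(log T) 量级,与摘要中 DEBO 的遗憾界同阶。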

【5】 Lifting Symmetry Breaking Constraints with Inductive Logic Programming 标题:用归纳逻辑编程提升对称破缺约束 链接:https://arxiv.org/abs/2112.11806

作者:Alice Tarzariol,Martin Gebser,Konstantin Schekotihin 摘要:有效地省略对称的候选解是组合问题求解的关键。现有的大多数方法都是针对具体实例的,侧重于为每个给定问题实例自动计算对称破缺约束(SBC)。然而,将此类方法应用于大规模实例或高级问题编码可能存在问题:计算出的SBC是命题层面的,因此既不能被有意义地解释,也不能迁移到其他实例。其结果是,每次调用求解器之前都必须重新进行耗时的SBC计算。为了克服这些限制,我们为回答集编程引入了一种新的面向模型的方法,利用归纳逻辑编程范式,将小问题实例的SBC提升为一组可解释的一阶约束。实验证明了我们的框架能够从一系列组合问题的实例特定SBC中学习一般约束。所得结果表明,我们的方法明显优于最先进的实例特定方法以及直接应用求解器的方法。 摘要:Efficient omission of symmetric solution candidates is essential for combinatorial problem-solving. Most of the existing approaches are instance-specific and focus on the automatic computation of Symmetry Breaking Constraints (SBCs) for each given problem instance. However, the application of such approaches to large-scale instances or advanced problem encodings might be problematic since the computed SBCs are propositional and, therefore, can neither be meaningfully interpreted nor transferred to other instances. As a result, a time-consuming recomputation of SBCs must be done before every invocation of a solver. To overcome these limitations, we introduce a new model-oriented approach for Answer Set Programming that lifts the SBCs of small problem instances into a set of interpretable first-order constraints using the Inductive Logic Programming paradigm. Experiments demonstrate the ability of our framework to learn general constraints from instance-specific SBCs for a collection of combinatorial problems. The obtained results indicate that our approach significantly outperforms a state-of-the-art instance-specific method as well as the direct application of a solver.

【6】 DRF Codes: Deep SNR-Robust Feedback Codes 标题:DRF码:深信噪比鲁棒反馈码 链接:https://arxiv.org/abs/2112.11789

作者:Mahdi Boloursaz Mashhadi,Deniz Gunduz,Alberto Perotti,Branislav Popovic 机构:Institute for Communication Systems (ICS), University of Surrey 摘要:针对具有输出反馈的衰落信道,我们提出了一种新的基于深度神经网络(DNN)的纠错码,称为深度SNR鲁棒反馈(DRF)码。在编码器处,奇偶校验符号由长短期记忆(LSTM)网络基于消息以及发射机以含噪方式观察到的过去前向信道输出生成。解码器使用双向LSTM架构以及信噪比(SNR)感知的注意力神经网络对消息进行解码。所提出的码克服了先前基于DNN的码在被动输出反馈信道上的两个主要缺点:(i)解码器处的SNR感知注意力机制使同一个训练好的网络能够在大范围的SNR值上可靠应用;(ii)采用批量大小调度的课程式训练,以加速并稳定训练,同时提高所得码的SNR鲁棒性。我们发现,在带反馈的加性高斯白噪声(AWGN)信道中,DRF码在SNR鲁棒性和误码率方面都显著优于现有技术。在接收端具有完美相位补偿的衰落信道中,DRF码学会高效利用瞬时衰落幅度的知识(编码器可通过反馈获得),以减少解码器处与信道估计相关的开销和复杂性。最后,我们证明了DRF码在具有反馈的多播信道中的有效性,而线性反馈码在此类信道中已知是严格次优的。 摘要:We present a new deep-neural-network (DNN) based error correction code for fading channels with output feedback, called deep SNR-robust feedback (DRF) code. At the encoder, parity symbols are generated by a long short term memory (LSTM) network based on the message as well as the past forward channel outputs observed by the transmitter in a noisy fashion. The decoder uses a bi-directional LSTM architecture along with a signal to noise ratio (SNR)-aware attention NN to decode the message. The proposed code overcomes two major shortcomings of the previously proposed DNN-based codes over channels with passive output feedback: (i) the SNR-aware attention mechanism at the decoder enables reliable application of the same trained NN over a wide range of SNR values; (ii) curriculum training with batch-size scheduling is used to speed up and stabilize training while improving the SNR-robustness of the resulting code. We show that the DRF codes significantly outperform state-of-the-art in terms of both the SNR-robustness and the error rate in additive white Gaussian noise (AWGN) channel with feedback. In fading channels with perfect phase compensation at the receiver, DRF codes learn to efficiently exploit knowledge of the instantaneous fading amplitude (which is available to the encoder through feedback) to reduce the overhead and complexity associated with channel estimation at the decoder. Finally, we show the effectiveness of DRF codes in multicast channels with feedback, where linear feedback codes are known to be strictly suboptimal.

【7】 Simple and Effective Balance of Contrastive Losses 标题:简单有效的对比损失平衡 链接:https://arxiv.org/abs/2112.11743

作者:Arnaud Sors,Rafael Sampaio de Rezende,Sarah Ibrahimi,Jean-Marc Andreoli 机构:† NAVER LABS Europe, ‡ University of Amsterdam 备注:15 pages, 10 figures 摘要:对比损失长期以来一直是深度度量学习的关键要素,如今因自监督学习的成功而愈发流行。最近的研究表明,将这类损失分解为两个在学习表示网络时互补的子损失(一个正项和一个熵项)是有益的。尽管总体损失因此被定义为两项的组合,但这两项之间的平衡往往隐藏在实现细节之后,在实践中大多被忽略且处于次优状态。在这项工作中,我们将对比损失的平衡视为一个超参数优化问题,并提出了一种基于坐标下降的搜索方法,能够高效地找到优化评估性能的超参数。在此过程中,我们将现有的平衡分析扩展到对比边距(margin)损失,将批量大小纳入平衡分析,并说明如何聚合批内的损失元素,以便在更大的批量范围内保持接近最优的性能。在深度度量学习和自监督学习基准上的大量实验表明,与其他常用搜索方法相比,我们的方法能更快找到最优超参数。 摘要:Contrastive losses have long been a key ingredient of deep metric learning and are now becoming more popular due to the success of self-supervised learning. Recent research has shown the benefit of decomposing such losses into two sub-losses which act in a complementary way when learning the representation network: a positive term and an entropy term. Although the overall loss is thus defined as a combination of two terms, the balance of these two terms is often hidden behind implementation details and is largely ignored and sub-optimal in practice. In this work, we approach the balance of contrastive losses as a hyper-parameter optimization problem, and propose a coordinate descent-based search method that efficiently find the hyper-parameters that optimize evaluation performance. In the process, we extend existing balance analyses to the contrastive margin loss, include batch size in the balance, and explain how to aggregate loss elements from the batch to maintain near-optimal performance over a larger range of batch sizes. Extensive experiments with benchmarks from deep metric learning and self-supervised learning show that optimal hyper-parameters are found faster with our method than with other common search methods.
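作为示意:把两个子损失的权重平衡当作超参数,用坐标下降逐维搜索是一种通用做法。下面给出一个与具体损失无关的极简坐标搜索骨架(评估函数与网格均为演示用假设):

```python
import numpy as np

def coordinate_search(evaluate, grids, n_rounds=3):
    """坐标下降式超参搜索:每次固定其余坐标,在单个坐标的网格上取最优。"""
    params = [g[len(g) // 2] for g in grids]        # 从各网格中点出发
    for _ in range(n_rounds):
        for i, grid in enumerate(grids):
            scores = []
            for v in grid:
                trial = list(params)
                trial[i] = v
                scores.append(evaluate(trial))
            params[i] = grid[int(np.argmax(scores))]
    return params

# 玩具评估函数:假设验证指标在 (α*, β*) = (1.0, 0.3) 处最优
evaluate = lambda p: -((p[0] - 1.0) ** 2 + (p[1] - 0.3) ** 2)
grids = [np.linspace(0.0, 2.0, 21), np.linspace(0.0, 1.0, 21)]
print(coordinate_search(evaluate, grids))           # ≈ [1.0, 0.3]
```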

【8】 Squareplus: A Softplus-Like Algebraic Rectifier 标题:Squareplus:一种类似Softplus的代数整流器 链接:https://arxiv.org/abs/2112.11687

作者:Jonathan T. Barron 备注:this https URL 摘要:我们介绍了squareplus,一个与softplus相似的激活函数,但它仅需代数运算(加法、乘法和平方根)即可计算。由于squareplus在CPU上的计算速度比softplus快约6倍,且不需要访问超越函数,因此它在资源受限的深度学习应用中可能具有实用价值。 摘要:We present squareplus, an activation function that resembles softplus, but which can be computed using only algebraic operations: addition, multiplication, and square-root. Because squareplus is ~6x faster to evaluate than softplus on a CPU and does not require access to transcendental functions, it may have practical value in resource-limited deep learning applications.
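作为示意:squareplus 的常见代数形式是 (x + √(x² + b)) / 2,其中 b 为超参数(下例取 b = 4,仅作演示;具体默认值以原文为准):

```python
import numpy as np

def squareplus(x, b=4.0):
    """softplus 的代数替代:仅用加法、乘法与平方根。"""
    return 0.5 * (x + np.sqrt(x * x + b))

def softplus(x):
    """数值稳定的 softplus,用于对比。"""
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

x = np.linspace(-5.0, 5.0, 5)
print(squareplus(x))    # 负半轴趋近 0,正半轴趋近 x,整体形状与 softplus 接近
print(softplus(x))
```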

【9】 MECATS: Mixture-of-Experts for Quantile Forecasts of Aggregated Time Series 标题:MECATS:聚合时间序列分位数预测的混合专家 链接:https://arxiv.org/abs/2112.11669

作者:Xing Han,Jing Hu,Joydeep Ghosh 机构:University of Texas at Austin, Intuit AI 备注:41 pages, 13 figures, 20 tables 摘要:我们介绍了一个称为MECATS的异构混合专家框架,它同时预测一组通过聚合层次结构相互关联的时间序列的值。不同类型的预测模型可作为独立专家使用,因此每个模型的形式可根据相应时间序列的性质进行调整。MECATS在训练阶段学习层次关系,以帮助更好地泛化到所建模的全部时间序列,并缓解由层次结构约束引起的一致性问题。我们进一步在点预测之上构建多个分位数估计器。由此产生的概率预测近乎一致、不依赖特定分布,并且与预测模型的选择无关。我们对点预测和概率预测进行了综合评估,并针对序列数据中存在变化点的情形给出了扩展。总体而言,我们的方法稳健、能适应具有不同属性的数据集,并且对大规模预测管道而言高度可配置且高效。 摘要:We introduce a mixture of heterogeneous experts framework called MECATS, which simultaneously forecasts the values of a set of time series that are related through an aggregation hierarchy. Different types of forecasting models can be employed as individual experts so that the form of each model can be tailored to the nature of the corresponding time series. MECATS learns hierarchical relationships during the training stage to help generalize better across all the time series being modeled and also mitigates coherency issues that arise due to constraints imposed by the hierarchy. We further build multiple quantile estimators on top of the point forecasts. The resulting probabilistic forecasts are nearly coherent, distribution-free, and independent of the choice of forecasting models. We conduct a comprehensive evaluation on both point and probabilistic forecasts and also formulate an extension for situations where change points exist in sequential data. In general, our method is robust, adaptive to datasets with different properties, and highly configurable and efficient for large-scale forecasting pipelines.
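作为示意:在点预测之上构建分位数估计,通常以分位数(pinball)损失训练。下面是该损失通用定义的极简实现,与 MECATS 的具体结构无关:

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """分位数损失:低估惩罚权重为 q,高估惩罚权重为 (1 - q)。"""
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1.0) * diff))

y = np.array([10.0, 12.0, 9.0, 15.0])
print(pinball_loss(y, y - 1.0, q=0.9))   # 0.9(低估时代价更高)
print(pinball_loss(y, y + 1.0, q=0.9))   # 0.1(高估时代价较低)
```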

【10】 Decompose the Sounds and Pixels, Recompose the Events 标题:分解声音和像素,重组事件 链接:https://arxiv.org/abs/2112.11547

作者:Varshanth R. Rao,Md Ibrahim Khalil,Haoda Li,Peng Dai,Juwei Lu 机构:Huawei Noah’s Ark Lab, University of Waterloo, Canada, University of Toronto, Canada 备注:Accepted at AAAI 2022 摘要:在本文中,我们提出了一个以事件分解-重组网络(EDRNet)为核心的框架,以解决有监督和弱监督设置下的视听事件(AVE)定位问题。现实世界中的AVE呈现出常见的展开模式(称为事件进度检查点,EPC),人类可以通过听觉和视觉感官的协作来感知这些模式。与早期尝试识别整个事件序列的方法不同,EDRNet使用堆叠时间卷积对EPC及EPC之间的关系进行建模。基于"EPC表示在理论上对同一事件类别保持一致"的假设,我们引入了基于状态机的视频融合,这是一种使用不同EPC模板序列混合源视频的新型数据增强技术。此外,我们还设计了一个称为"陆-岸-海"损失的新损失函数,使连续的前景和背景表示更加紧凑。最后,为了缓解弱监督下事件混淆的问题,我们提出了一种称为"包到实例标签校正"的预测稳定方法。在AVE数据集上的实验表明,我们的整体框架以明显优势超越了最先进的方法。 摘要:In this paper, we propose a framework centering around a novel architecture called the Event Decomposition Recomposition Network (EDRNet) to tackle the Audio-Visual Event (AVE) localization problem in the supervised and weakly supervised settings. AVEs in the real world exhibit common unravelling patterns (termed as Event Progress Checkpoints (EPC)), which humans can perceive through the cooperation of their auditory and visual senses. Unlike earlier methods which attempt to recognize entire event sequences, the EDRNet models EPCs and inter-EPC relationships using stacked temporal convolutions. Based on the postulation that EPC representations are theoretically consistent for an event category, we introduce the State Machine Based Video Fusion, a novel augmentation technique that blends source videos using different EPC template sequences. Additionally, we design a new loss function called the Land-Shore-Sea loss to compactify continuous foreground and background representations. Lastly, to alleviate the issue of confusing events during weak supervision, we propose a prediction stabilization method called Bag to Instance Label Correction. Experiments on the AVE dataset show that our collective framework outperforms the state-of-the-art by a sizable margin.

【11】 Off Environment Evaluation Using Convex Risk Minimization 标题:基于凸风险最小化的环境外(off-environment)评估 链接:https://arxiv.org/abs/2112.11532

作者:Pulkit Katdare,Shuijing Liu,Katherine Driggs-Campbell 机构:Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign 备注:7 pages, 3 figures (with sub-figures) 摘要:在机器人上应用强化学习(RL)方法,通常是先在仿真中训练策略,再将其部署到真实世界的机器人上。由于真实世界与仿真器之间的模型失配,以这种方式部署的RL智能体的性能往往次优。为了解决这个问题,研究人员开发了依赖合成噪声扰动的鲁棒策略学习算法,但此类方法无法保证在目标环境中的性能。我们提出了一种凸风险最小化算法,利用来自两种环境的轨迹数据估计仿真器与目标域之间的模型失配。我们表明,该估计器可与仿真器一起用于评估RL智能体在目标域中的性能,从而有效弥合这两种环境之间的差距。我们还证明了该估计器的收敛速度为 $n^{-1/4}$ 阶,其中 $n$ 是训练样本数。在仿真中,我们演示了该方法如何在 Gridworld、Cartpole 和 Reacher 环境中对一系列策略有效地近似和评估性能。我们还表明,利用仿真器以及从真实世界机器人远程采集的数据,我们的方法能够估计 7 自由度机械臂的性能。 摘要:Applying reinforcement learning (RL) methods on robots typically involves training a policy in simulation and deploying it on a robot in the real world. Because of the model mismatch between the real world and the simulator, RL agents deployed in this manner tend to perform suboptimally. To tackle this problem, researchers have developed robust policy learning algorithms that rely on synthetic noise disturbances. However, such methods do not guarantee performance in the target environment. We propose a convex risk minimization algorithm to estimate the model mismatch between the simulator and the target domain using trajectory data from both environments. We show that this estimator can be used along with the simulator to evaluate performance of an RL agent in the target domain, effectively bridging the gap between these two environments. We also show the convergence rate of our estimator to be of the order of $n^{-1/4}$, where $n$ is the number of training samples. In simulation, we demonstrate how our method effectively approximates and evaluates performance on Gridworld, Cartpole, and Reacher environments on a range of policies. We also show that our method is able to estimate the performance of a 7 DOF robotic arm using the simulator and remotely collected data from the robot in the real world.

【12】 Sentence Embeddings and High-speed Similarity Search for Fast Computer Assisted Annotation of Legal Documents 标题:利用句子嵌入与高速相似度搜索实现法律文件的快速计算机辅助标注 链接:https://arxiv.org/abs/2112.11494

作者:Hannes Westermann,Jaromir Savelka,Vern R. Walker,Kevin D. Ashley,Karim Benyekhlef 机构:Cyberjustice Laboratory, Faculté de droit, Université de Montréal, School of Computer Science, Carnegie Mellon University, LLT Lab, Maurice A. Deane School of Law, Hofstra University, School of Computing and Information, University of Pittsburgh 备注:None 摘要:对法律文件中的句子进行人工标注,是许多支持法律任务的机器学习系统的重要前提。通常,标注是逐句顺序进行的,往往非常耗时,因而成本高昂。在本文中,我们介绍了一个"横向"标注句子的概念验证系统。该方法基于这样一个观察:在特定类型体系下,语义相似的句子往往具有相同的标签。我们利用这一观察,允许标注者在整个文档语料库中快速查看并标注与给定句子语义相似的句子。本文展示了系统的界面,并对该方法进行了实证评估。实验表明,横向标注有望使标注过程更快、更一致。 摘要:Human-performed annotation of sentences in legal documents is an important prerequisite to many machine learning based systems supporting legal tasks. Typically, the annotation is done sequentially, sentence by sentence, which is often time consuming and, hence, expensive. In this paper, we introduce a proof-of-concept system for annotating sentences "laterally." The approach is based on the observation that sentences that are similar in meaning often have the same label in terms of a particular type system. We use this observation in allowing annotators to quickly view and annotate sentences that are semantically similar to a given sentence, across an entire corpus of documents. Here, we present the interface of the system and empirically evaluate the approach. The experiments show that lateral annotation has the potential to make the annotation process quicker and more consistent.
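作为示意:"横向标注"的核心是按语义相似度检索候选句子。下面用 NumPy 演示基于余弦相似度的近邻检索,嵌入用随机向量占位(实际系统会用句子编码器生成嵌入):

```python
import numpy as np

def top_k_similar(query_vec, corpus_vecs, k=3):
    """返回与查询向量余弦相似度最高的 k 个句子的下标与相似度。"""
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    sims = c @ q
    order = np.argsort(-sims)[:k]
    return order, sims[order]

rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 384))                   # 1000 个句子的占位嵌入
query = corpus[42] + rng.normal(scale=0.1, size=384)    # 与第 42 句语义相近
idx, sims = top_k_similar(query, corpus)
print(idx)    # 第 42 句应排在最前
```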

【13】 LSH methods for data deduplication in a Wikipedia artificial dataset 标题:用于维基百科人工数据集中重复数据删除的LSH方法 链接:https://arxiv.org/abs/2112.11478

作者:Juan Ciro,Daniel Galvez,Tim Schlippe,David Kanter 机构:Factored, NVIDIA, IU University of Applied Sciences, MLCommons 摘要:本文阐述了用于识别和删除文本数据集中近似冗余数据的局部敏感哈希(LSH)模型。为了评估不同的模型,我们使用英文维基百科文章创建了一个用于重复数据删除的人工数据集。大多数模型的曲线下面积(AUC)超过0.9,最佳模型达到0.96。重复数据删除可防止模型因重复数据而学到与真实分布不同的分布,从而实现更有效的模型训练。 摘要:This paper illustrates locality sensitive hashing (LSH) models for the identification and removal of nearly redundant data in a text dataset. To evaluate the different models, we create an artificial dataset for data deduplication using English Wikipedia articles. Area-Under-Curve (AUC) over 0.9 were observed for most models, with the best model reaching 0.96. Deduplication enables more effective model training by preventing the model from learning a distribution that differs from the real one as a result of the repeated data.
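作为示意:文本去重常用 MinHash 近似集合间的 Jaccard 相似度,再配合 LSH 分桶加速检索。下面是一个自包含的 MinHash 签名极简实现(shingle 长度、哈希个数均为演示用假设):

```python
import hashlib

def shingles(text: str, n: int = 3) -> set:
    """把文本切成长度为 n 的字符片段集合。"""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def minhash_signature(items: set, num_hashes: int = 64) -> list:
    """对每个(加盐的)哈希函数取集合元素的最小哈希值,拼成签名。"""
    return [
        min(int(hashlib.md5(f"{seed}-{x}".encode()).hexdigest(), 16) for x in items)
        for seed in range(num_hashes)
    ]

def signature_similarity(a: list, b: list) -> float:
    """签名逐位相等的比例,是 Jaccard 相似度的无偏估计。"""
    return sum(x == y for x, y in zip(a, b)) / len(a)

s1 = minhash_signature(shingles("the quick brown fox jumps over the lazy dog"))
s2 = minhash_signature(shingles("the quick brown fox jumped over the lazy dog"))
print(signature_similarity(s1, s2))   # 接近两段文本 shingle 集合的 Jaccard 相似度
```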

【14】 Towards a Science of Human-AI Decision Making: A Survey of Empirical Studies 标题:走向人类-人工智能决策科学:实证研究综述 链接:https://arxiv.org/abs/2112.11471

作者:Vivian Lai,Chacha Chen,Q. Vera Liao,Alison Smith-Renner,Chenhao Tan 机构: University of Colorado Boulder, University of Chicago 备注:36 pages, 2 figures, see this https URL for website 摘要:随着人工智能系统展现出越来越强的预测性能,它们在许多领域的应用也日益增多。然而,在刑事司法和医疗保健等高风险领域,出于安全、伦理和法律方面的考虑,完全自动化通常并不可取,而完全人工的方法又可能不准确且耗时。因此,研究界对利用人工智能辅助人类决策的兴趣与日俱增。除了为此开发人工智能技术外,新兴的人类-AI决策领域还必须采用实证方法,以形成对人类如何与人工智能交互并共同做出决策的基础性理解。为了推动并帮助组织旨在理解和改进人类-AI决策的科学研究,我们调查了该主题上近期的实证人类受试者研究文献。我们从三个重要方面总结了100多篇论文的研究设计选择:(1)决策任务,(2)AI模型与AI辅助元素,以及(3)评估指标。对于每个方面,我们总结当前趋势,讨论该领域当前实践中的差距,并为未来研究给出建议。我们的调查强调,需要建立通用框架来刻画人类-AI决策的设计与研究空间,使研究人员能够在研究设计中做出严谨选择,并使研究社区能够在彼此工作的基础上积累可泛化的科学知识。我们也希望这项调查能成为HCI和AI社区合作的桥梁,共同塑造人类-AI决策的实证科学与计算技术。 摘要:As AI systems demonstrate increasingly strong predictive performance, their adoption has grown in numerous domains. However, in high-stakes domains such as criminal justice and healthcare, full automation is often not desirable due to safety, ethical, and legal concerns, yet fully manual approaches can be inaccurate and time consuming. As a result, there is growing interest in the research community to augment human decision making with AI assistance. Besides developing AI technologies for this purpose, the emerging field of human-AI decision making must embrace empirical approaches to form a foundational understanding of how humans interact and work with AI to make decisions. To invite and help structure research efforts towards a science of understanding and improving human-AI decision making, we survey recent literature of empirical human-subject studies on this topic. We summarize the study design choices made in over 100 papers in three important aspects: (1) decision tasks, (2) AI models and AI assistance elements, and (3) evaluation metrics. For each aspect, we summarize current trends, discuss gaps in current practices of the field, and make a list of recommendations for future research. Our survey highlights the need to develop common frameworks to account for the design and research spaces of human-AI decision making, so that researchers can make rigorous choices in study design, and the research community can build on each other's work and produce generalizable scientific knowledge. We also hope this survey will serve as a bridge for HCI and AI communities to work together to mutually shape the empirical science and computational technologies for human-AI decision making.

【15】 Variational Quantum Soft Actor-Critic 标题:变分量子软演员-批评家 链接:https://arxiv.org/abs/2112.11921

作者:Qingfeng Lan 机构:Department of Computing Science, University of Alberta, Edmonton, Canada 备注:A course project paper 摘要:量子计算在解决整数分解和Simon问题等特定问题上具有优越性。对于机器学习中更一般的任务,通过应用变分量子电路,近年来提出了越来越多的量子算法,尤其是在监督学习和无监督学习方面。然而,在强化学习方面的工作很少,而强化学习可以说更重要、也更具挑战性。此前的量子强化学习工作主要集中于动作空间离散的离散控制任务。在这项工作中,我们开发了一种基于软演员-批评家(soft actor-critic,连续控制的最新方法之一)的量子强化学习算法。具体来说,我们使用由变分量子电路和经典人工神经网络组成的混合量子-经典策略网络。在标准强化学习基准上的测试表明,这一量子版软演员-批评家在使用少得多的可调参数的情况下,可与原始软演员-批评家相媲美。此外,我们还分析了不同超参数和策略网络结构的影响,指出了结构设计对量子强化学习的重要性。 摘要:Quantum computing has a superior advantage in tackling specific problems, such as integer factorization and Simon's problem. For more general tasks in machine learning, by applying variational quantum circuits, more and more quantum algorithms have been proposed recently, especially in supervised learning and unsupervised learning. However, little work has been done in reinforcement learning, arguably more important and challenging. Previous work in quantum reinforcement learning mainly focuses on discrete control tasks where the action space is discrete. In this work, we develop a quantum reinforcement learning algorithm based on soft actor-critic -- one of the state-of-the-art methods for continuous control. Specifically, we use a hybrid quantum-classical policy network consisting of a variational quantum circuit and a classical artificial neural network. Tested in a standard reinforcement learning benchmark, we show that this quantum version of soft actor-critic is comparable with the original soft actor-critic, using much less adjustable parameters. Furthermore, we analyze the effect of different hyper-parameters and policy network architectures, pointing out the importance of architecture design for quantum reinforcement learning.

机器翻译,仅供参考
