
Machine Learning arXiv Daily Digest [7.28]

arXiv Daily Academic Digest (WeChat official account) · Published 2021-07-29 14:10:26

Visit www.arxivdaily.com for the full digest with abstracts, covering CS, Physics, Math, Economics, Statistics, Finance, Biology, and Electrical Engineering, plus search, favorites, and posting features!

cs.LG: 84 papers today.

Graph-related (graph learning | graph neural networks | graph optimization, etc.) (3 papers)

【1】 Short-Term Electricity Price Forecasting based on Graph Convolution Network and Attention Mechanism

Authors: Yuyun Yang, Zhenfei Tan, Haitao Yang, Guangchun Ruan, Haiwang Zhong
Affiliations: State Key Laboratory of Power Systems, Department of Electrical Engineering, Tsinghua University, Beijing, China
Note: Submitted to IET RPG. 9 pages, 15 figures, 6 tables
Link: https://arxiv.org/abs/2107.12794
Abstract: In electricity markets, locational marginal price (LMP) forecasting is particularly important for market participants in making reasonable bidding strategies, managing potential trading risks, and supporting efficient system planning and operation. Unlike existing methods that only consider LMPs' temporal features, this paper tailors a spectral graph convolutional network (GCN) to greatly improve the accuracy of short-term LMP forecasting. A three-branch network structure is then designed to match the structure of LMPs' compositions. Such a network can extract the spatial-temporal features of LMPs and provide fast, high-quality predictions for all nodes simultaneously. An attention mechanism is also implemented to assign varying importance weights between different nodes and time slots. Case studies based on the IEEE-118 test system and real-world data from PJM validate that the proposed model outperforms existing forecasting models in accuracy and maintains robust performance by avoiding extreme errors.
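
The entry gives no code; below is a minimal PyTorch sketch of the two building blocks the abstract names — a first-order spectral graph convolution and attention over time slots. All class names, shapes, and the specific normalization are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SpectralGCNLayer(nn.Module):
    """First-order spectral graph convolution: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        a_hat = adj + torch.eye(adj.size(0))              # add self-loops
        d_inv_sqrt = torch.diag(a_hat.sum(dim=1).pow(-0.5))
        self.register_buffer("norm_adj", d_inv_sqrt @ a_hat @ d_inv_sqrt)
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):                                 # x: (nodes, features)
        return torch.relu(self.linear(self.norm_adj @ x))

class TemporalAttention(nn.Module):
    """Learned importance weights over historical time slots."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, h):                                 # h: (time, nodes, dim)
        w = torch.softmax(self.score(h), dim=0)           # weight per time slot
        return (w * h).sum(dim=0)                         # fused node features
```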

【2】 The Graph Neural Networking Challenge: A Worldwide Competition for Education in AI/ML for Networks

Authors: José Suárez-Varela, Miquel Ferriol-Galmés, Albert López, Paul Almasan, Guillermo Bernárdez, David Pujol-Perich, Krzysztof Rusek, Loïck Bonniot, Christoph Neumann, François Schnitzler, François Taïani, Martin Happ, Christian Maier, Jia Lei Du, Matthias Herlich, Peter Dorfinger, Nick Vincent Hainke, Stefan Venz, Johannes Wegener, Henrike Wissing, Bo Wu, Shihan Xiao, Pere Barlet-Ros, Albert Cabellos-Aparicio
Affiliations: Barcelona Neural Networking Center, Universitat Politècnica de Catalunya, Spain; AGH University of Science and Technology, Department of Telecommunications, Poland; InterDigital, France; Univ. Rennes, Inria, CNRS, IRISA, France
Link: https://arxiv.org/abs/2107.12433
Abstract: During the last decade, Machine Learning (ML) has increasingly become a hot topic in the field of Computer Networks and is expected to be gradually adopted for a plethora of control, monitoring and management tasks in real-world deployments. This poses the need to count on new generations of students, researchers and practitioners with a solid background in ML applied to networks. During 2020, the International Telecommunication Union (ITU) organized the "ITU AI/ML in 5G Challenge", an open global competition that introduced to a broad audience some of the current main challenges in ML for networks. This large-scale initiative gathered 23 different challenges proposed by network operators, equipment manufacturers and academia, and attracted a total of 1300+ participants from 60+ countries. This paper narrates our experience organizing one of the proposed challenges: the "Graph Neural Networking Challenge 2020". We describe the problem presented to participants, the tools and resources provided, some organization aspects and participation statistics, an outline of the top-3 awarded solutions, and a summary of lessons learned along the way. As a result, this challenge leaves a curated set of educational resources openly available to anyone interested in the topic.

【3】 Graph Autoencoders for Embedding Learning in Brain Networks and Major Depressive Disorder Identification

Authors: Fuad Noman, Chee-Ming Ting, Hakmook Kang, Raphael C.-W. Phan, Brian D. Boyd, Warren D. Taylor, Hernando Ombao
Affiliations: Kang is with the Department of Biostatistics
Link: https://arxiv.org/abs/2107.12838
Abstract: Brain functional connectivity (FC) reveals biomarkers for identification of various neuropsychiatric disorders. Recent application of deep neural networks (DNNs) to connectome-based classification mostly relies on traditional convolutional neural networks using input connectivity matrices on a regular Euclidean grid. We propose a graph deep learning framework to incorporate the non-Euclidean information about graph structure for classifying functional magnetic resonance imaging (fMRI)-derived brain networks in major depressive disorder (MDD). We design a novel graph autoencoder (GAE) architecture based on graph convolutional networks (GCNs) to embed the topological structure and node content of large-sized fMRI networks into low-dimensional latent representations. In network construction, we employ the Ledoit-Wolf (LDW) shrinkage method to estimate the high-dimensional FC metrics efficiently from fMRI data. We consider both supervised and unsupervised approaches for the graph embedded learning. The learned embeddings are then used as feature inputs for a deep fully-connected neural network (FCNN) to discriminate MDD from healthy controls. Evaluated on a resting-state fMRI MDD dataset with 43 subjects, results show that the proposed GAE-FCNN model significantly outperforms several state-of-the-art DNN methods for brain connectome classification, achieving accuracy of 72.50% using the LDW-FC metrics as node features. The graph embeddings of fMRI FC networks learned by the GAE also reveal apparent group differences between MDD and HC. Our new framework demonstrates feasibility of learning graph embeddings on brain networks to provide discriminative information for diagnosis of brain disorders.
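
The Ledoit-Wolf shrinkage step is available off the shelf in scikit-learn; here is a small sketch of how a shrinkage-regularized FC (correlation) matrix per subject could be computed (the 200-volume, 90-ROI shapes are placeholder assumptions):

```python
import numpy as np
from sklearn.covariance import LedoitWolf

def ldw_fc_matrix(ts):
    """ts: fMRI time series, shape (timepoints, regions).
    Returns a shrinkage-regularized functional-connectivity (correlation) matrix."""
    cov = LedoitWolf().fit(ts).covariance_
    d = np.sqrt(np.diag(cov))
    corr = cov / np.outer(d, d)                  # covariance -> correlation
    np.fill_diagonal(corr, 1.0)
    return corr

fc = ldw_fc_matrix(np.random.randn(200, 90))     # placeholder: 200 volumes, 90 ROIs
```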

Transformers (2 papers)

【1】 Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers

Authors: Wen Wang, Yang Cao, Jing Zhang, Fengxiang He, Zheng-Jun Zha, Yonggang Wen, Dacheng Tao
Affiliations: University of Science and Technology of China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Note: Accepted by ACM MM2021. Source code is available at: https://github.com/encounter1997/SFA
Link: https://arxiv.org/abs/2107.12636
Abstract: Detection transformers have recently shown promising object detection results and attracted increasing attention. However, how to develop effective domain adaptation techniques to improve their cross-domain performance remains unexplored. In this paper, we delve into this topic and empirically find that direct feature distribution alignment on the CNN backbone only brings limited improvements, as it does not guarantee domain-invariant sequence features in the transformer for prediction. To address this issue, we propose a novel Sequence Feature Alignment (SFA) method that is specially designed for the adaptation of detection transformers. Technically, SFA consists of a domain query-based feature alignment (DQFA) module and a token-wise feature alignment (TDA) module. In DQFA, a novel domain query is used to aggregate and align global context from the token sequence of both domains. DQFA reduces the domain discrepancy in global feature representations and object relations when deployed in the transformer encoder and decoder, respectively. Meanwhile, TDA aligns token features in the sequence from both domains, which reduces the domain gaps in local and instance-level feature representations in the transformer encoder and decoder, respectively. Besides, a novel bipartite matching consistency loss is proposed to enhance the feature discriminability for robust object detection. Experiments on three challenging benchmarks show that SFA outperforms state-of-the-art domain adaptive object detection methods. Code has been made available at: https://github.com/encounter1997/SFA.

【2】 Don't Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers

Authors: Danielle Rothermel, Margaret Li, Tim Rocktäschel, Jakob Foerster
Affiliations: Facebook AI Research; University of Washington; University College London
Note: Accepted to ICML 2021 Workshop: Self-Supervised Learning for Reasoning and Perception
Link: https://arxiv.org/abs/2107.12460
Abstract: Self-supervised pre-training of large-scale transformer models on text corpora followed by finetuning has achieved state-of-the-art results on a number of natural language processing tasks. Recently, Lu et al. (2021, arXiv:2103.05247) claimed that frozen pretrained transformers (FPTs) match or outperform training from scratch as well as unfrozen (fine-tuned) pretrained transformers in a set of transfer tasks to other modalities. In our work, we find that this result is, in fact, an artifact of not tuning the learning rates. After carefully redesigning the empirical setup, we find that when tuning learning rates properly, pretrained transformers do outperform or match training from scratch in all of our tasks, but only as long as the entire model is finetuned. Thus, while transfer from pretrained language models to other modalities does indeed provide gains and hints at exciting possibilities for future work, properly tuning hyperparameters is important for arriving at robust findings.
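
The practical takeaway is to sweep the learning rate per task rather than reuse one from prior work. A hedged sketch of such a sweep, where train_fn and eval_fn stand in for whatever training and validation routine the task uses:

```python
def tune_learning_rate(train_fn, eval_fn, lrs=(3e-5, 1e-4, 3e-4, 1e-3, 3e-3)):
    """train_fn(lr) -> trained model; eval_fn(model) -> validation score (higher = better).
    Returns the best learning rate and its score -- the sweep the paper argues for."""
    results = {lr: eval_fn(train_fn(lr)) for lr in lrs}
    best = max(results, key=results.get)
    return best, results[best]
```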

GAN | Adversarial | Attacks | Generation (10 papers)

【1】 Adversarial Stacked Auto-Encoders for Fair Representation Learning

Authors: Patrik Joslin Kenfack, Adil Mehmood Khan, Rasheed Hussain, S. M. Ahsan Kazmi
Affiliations: Innopolis University
Note: ICML 2021 ML4data Workshop Paper
Link: https://arxiv.org/abs/2107.12826
Abstract: Training machine learning models with accuracy as the only goal may promote prejudices and discriminatory behaviors embedded in the data. One solution is to learn latent representations that fulfill specific fairness metrics. Different types of learning methods are employed to map data into the fair representational space. The main purpose is to learn a latent representation of data that scores well on a fairness metric while maintaining usability for the downstream task. In this paper, we propose a new fair representation learning approach that leverages different levels of representation of data to tighten the fairness bounds of the learned representation. Our results show that stacking different auto-encoders and enforcing fairness at different latent spaces results in an improvement of fairness compared to other existing approaches.

【2】 MFAGAN: A Compression Framework for Memory-Efficient On-Device Super-Resolution GAN

Authors: Wenlong Cheng, Mingbo Zhao, Zhiling Ye, Shuhang Gu
Affiliations: City University of Hong Kong; Donghua University; Tencent Computer System Co., Ltd.; The University of Sydney
Link: https://arxiv.org/abs/2107.12679
Abstract: Generative adversarial networks (GANs) have promoted remarkable advances in single-image super-resolution (SR) by recovering photo-realistic images. However, the high memory consumption of GAN-based SR (usually of the generator) causes performance degradation and more energy consumption, hindering the deployment of GAN-based SR into resource-constrained mobile devices. In this paper, we propose a novel compression framework, Multi-scale Feature Aggregation Net based GAN (MFAGAN), for reducing the memory access cost of the generator. First, to overcome the memory explosion of dense connections, we utilize a memory-efficient multi-scale feature aggregation net as the generator. Second, for faster and more stable training, our method introduces the PatchGAN discriminator. Third, to balance the student discriminator and the compressed generator, we distill both the generator and the discriminator. Finally, we perform a hardware-aware neural architecture search (NAS) to find a specialized SubGenerator for the target mobile phone. Benefiting from these improvements, the proposed MFAGAN achieves up to 8.3× memory saving and 42.9× computation reduction, with only minor visual quality degradation, compared with ESRGAN. Empirical studies also show ~70 milliseconds latency on the Qualcomm Snapdragon 865 chipset.

【3】 Toward Co-creative Dungeon Generation via Transfer Learning

Authors: Zisen Zhou, Matthew Guzdial
Affiliations: University of Alberta, Edmonton, Canada
Link: https://arxiv.org/abs/2107.12533
Abstract: Co-creative Procedural Content Generation via Machine Learning (PCGML) refers to systems where a PCGML agent and a human work together to produce output content. One of the limitations of co-creative PCGML is that it requires co-creative training data for a PCGML agent to learn to interact with humans. However, acquiring this data is a difficult and time-consuming process. In this work, we propose approximating human-AI interaction data and employing transfer learning to adapt learned co-creative knowledge from one game to a different game. We explore this approach for co-creative Zelda dungeon room generation.

【4】 Generating Lode Runner Levels by Learning Player Paths with LSTMs

Authors: Kynan Sorochan, Jerry Chen, Yakun Yu, Matthew Guzdial
Affiliations: University of Alberta, Edmonton, Canada
Link: https://arxiv.org/abs/2107.12532
Abstract: Machine learning has been a popular tool in many different fields, including procedural content generation. However, procedural content generation via machine learning (PCGML) approaches can struggle with controllability and coherence. In this paper, we attempt to address these problems by learning to generate human-like paths, and then generating levels based on these paths. We extract player path data from gameplay video, train an LSTM to generate new paths based on this data, and then generate game levels based on this path data. We demonstrate that our approach leads to more coherent levels for the game Lode Runner in comparison to an existing PCGML approach.
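
As a rough illustration of the path-generation half of this pipeline, here is a minimal PyTorch next-move LSTM that can be sampled to produce new paths; the action vocabulary, sizes, and sampling scheme are assumptions, not the paper's setup:

```python
import torch
import torch.nn as nn

MOVES = ["left", "right", "up", "down", "dig_left", "dig_right"]  # assumed action set

class PathLSTM(nn.Module):
    """Predicts the next player move given the moves so far."""
    def __init__(self, n_moves=len(MOVES), emb_dim=32, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(n_moves, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_moves)

    def forward(self, tokens):                  # tokens: (batch, seq)
        out, _ = self.lstm(self.emb(tokens))
        return self.head(out)                   # logits: (batch, seq, n_moves)

@torch.no_grad()
def sample_path(model, start=0, length=50, temperature=1.0):
    """Autoregressively sample a new path from the trained model."""
    path = [start]
    for _ in range(length - 1):
        logits = model(torch.tensor([path]))[0, -1] / temperature
        path.append(torch.multinomial(torch.softmax(logits, dim=-1), 1).item())
    return [MOVES[i] for i in path]
```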

【5】 Ensemble Learning For Mega Man Level Generation

Authors: Bowei Li, Ruohan Chen, Yuqing Xue, Ricky Wang, Wenwen Li, Matthew Guzdial
Affiliations: University of Alberta, Edmonton, Canada
Link: https://arxiv.org/abs/2107.12524
Abstract: Procedural content generation via machine learning (PCGML) is the process of procedurally generating game content using models trained on existing game content. PCGML methods can struggle to capture the true variance present in underlying data with a single model. In this paper, we investigated the use of ensembles of Markov chains for procedurally generating Mega Man levels. We conduct an initial investigation of our approach and evaluate it on measures of playability and stylistic similarity in comparison to a non-ensemble, existing Markov chain approach.
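
A first-order Markov chain over level columns, plus one plausible way to combine an ensemble of such chains, can be sketched in a few lines (the combination rule below — picking a random member chain per step — is an assumption, not necessarily the paper's):

```python
import random
from collections import Counter, defaultdict

def train_chain(levels):
    """levels: list of levels, each a list of column strings.
    Returns a map: column -> Counter of observed next columns."""
    chain = defaultdict(Counter)
    for level in levels:
        for cur, nxt in zip(level, level[1:]):
            chain[cur][nxt] += 1
    return chain

def sample_next(chain, col):
    options = chain.get(col)
    if not options:                                   # unseen column: restart anywhere
        return random.choice(list(chain))
    cols, counts = zip(*options.items())
    return random.choices(cols, weights=counts)[0]

def generate(chains, start, length=64):
    """Ensemble generation: a randomly chosen member chain proposes each next column."""
    level = [start]
    for _ in range(length - 1):
        level.append(sample_next(random.choice(chains), level[-1]))
    return level
```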

【6】 TaikoNation: Patterning-focused Chart Generation for Rhythm Action Games

Authors: Emily Halina, Matthew Guzdial
Affiliations: University of Alberta, Edmonton, Canada
Link: https://arxiv.org/abs/2107.12506
Abstract: Generating rhythm game charts from songs via machine learning has been a problem of increasing interest in recent years. However, all existing systems struggle to replicate human-like patterning: the placement of game objects in relation to each other to form congruent patterns based on events in the song. Patterning is a key identifier of high quality rhythm game content, seen as a necessary component in human rankings. We establish a new approach for chart generation that produces charts with more congruent, human-like patterning than seen in prior work.

【7】 Adversarial Random Forest Classifier for Automated Game Design

Authors: Thomas Maurer, Matthew Guzdial
Affiliations: University of Alberta, Edmonton, Canada
Link: https://arxiv.org/abs/2107.12501
Abstract: Autonomous game design, generating games algorithmically, has been a longtime goal within the technical games research field. However, existing autonomous game design systems have relied in large part on human-authoring for game design knowledge, such as fitness functions in search-based methods. In this paper, we describe an experiment to attempt to learn a human-like fitness function for autonomous game design in an adversarial manner. While our experimental work did not meet our expectations, we present an analysis of our system and results that we hope will be informative to future autonomous game design research.

【8】 LEGATO: A LayerwisE Gradient AggregaTiOn Algorithm for Mitigating Byzantine Attacks in Federated Learning

Authors: Kamala Varma, Yi Zhou, Nathalie Baracaldo, Ali Anwar
Affiliations: University of Maryland, College Park; IBM Research, Almaden Research Center
Link: https://arxiv.org/abs/2107.12490
Abstract: Federated learning has arisen as a mechanism to allow multiple participants to collaboratively train a model without sharing their data. In these settings, participants (workers) may not trust each other fully; for instance, a set of competitors may collaboratively train a machine learning model to detect fraud. The workers provide local gradients that a central server uses to update a global model. This global model can be corrupted when Byzantine workers send malicious gradients, which necessitates robust methods for aggregating gradients that mitigate the adverse effects of Byzantine inputs. Existing robust aggregation algorithms are often computationally expensive and only effective under strict assumptions. In this paper, we introduce LayerwisE Gradient AggregaTiOn (LEGATO), an aggregation algorithm that is, by contrast, scalable and generalizable. Informed by a study of layer-specific responses of gradients to Byzantine attacks, LEGATO employs a dynamic gradient reweighing scheme that is novel in its treatment of gradients based on layer-specific robustness. We show that LEGATO is more computationally efficient than multiple state-of-the-art techniques and more generally robust across a variety of attack settings in practice. We also demonstrate LEGATO's benefits for gradient descent convergence in the absence of an attack.
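
The paper's exact reweighing rule is not reproduced here; the sketch below only illustrates the general shape of layerwise aggregation — estimate each layer's cross-worker dispersion and weight robust layers more — with the inverse-dispersion rule as my own stand-in:

```python
import numpy as np

def layerwise_aggregate(worker_grads):
    """worker_grads: list over workers; each entry is a list of per-layer gradient arrays.
    Down-weights layers whose gradients disperse most across workers; the inverse-
    dispersion weighting below is illustrative, not LEGATO's exact rule."""
    n_layers = len(worker_grads[0])
    means, weights = [], []
    for l in range(n_layers):
        layer = np.stack([g[l] for g in worker_grads])       # (workers, ...) for layer l
        means.append(layer.mean(axis=0))
        weights.append(1.0 / (layer.std(axis=0).mean() + 1e-8))
    scale = n_layers / sum(weights)                          # keep overall update magnitude
    return [w * scale * m for w, m in zip(weights, means)]
```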

【9】 High-Dimensional Distribution Generation Through Deep Neural Networks

Authors: Dmytro Perekrestenko, Léandre Eberhard, Helmut Bölcskei
Link: https://arxiv.org/abs/2107.12466
Abstract: We show that every d-dimensional probability distribution of bounded support can be generated through deep ReLU networks out of a 1-dimensional uniform input distribution. What is more, this is possible without incurring a cost - in terms of approximation error measured in Wasserstein distance - relative to generating the d-dimensional target distribution from d independent random variables. This is enabled by a vast generalization of the space-filling approach discovered in (Bailey & Telgarsky, 2018). The construction we propose elicits the importance of network depth in driving the Wasserstein distance between the target distribution and its neural network approximation to zero. Finally, we find that, for histogram target distributions, the number of bits needed to encode the corresponding generative network equals the fundamental limit for encoding probability distributions as dictated by quantization theory.

【10】 Realistic Ultrasound Image Synthesis for Improved Classification of Liver Disease

Authors: Hui Che, Sumana Ramanathan, David Foran, John L Nosher, Vishal M Patel, Ilker Hacihaliloglu
Affiliations: Department of Biomedical Engineering, Rutgers University, NJ, USA; Department of Radiology, Rutgers Robert Wood Johnson Medical School, NJ, USA; Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
Note: Accepted for presentation at the 2021 MICCAI-International Workshop of Advances in Simplifying Medical UltraSound (ASMUS2021)
Link: https://arxiv.org/abs/2107.12775
Abstract: With the success of deep learning-based methods applied in medical image analysis, convolutional neural networks (CNNs) have been investigated for classifying liver disease from ultrasound (US) data. However, the scarcity of available large-scale labeled US data has hindered the success of CNNs for classifying liver disease from US data. In this work, we propose a novel generative adversarial network (GAN) architecture for realistic diseased and healthy liver US image synthesis. We adopt the concept of stacking to synthesize realistic liver US data. Quantitative and qualitative evaluation is performed on 550 in-vivo B-mode liver US images collected from 55 subjects. We also show that the synthesized images, together with real in vivo data, can be used to significantly improve the performance of traditional CNN architectures for Nonalcoholic fatty liver disease (NAFLD) classification.

Semi-/Weakly-/Un-/Supervised | Uncertainty | Active Learning (4 papers)

【1】 Energy-Based Open-World Uncertainty Modeling for Confidence Calibration

Authors: Yezhen Wang, Bo Li, Tong Che, Kaiyang Zhou, Dongsheng Li, Ziwei Liu
Affiliations: Microsoft Research Asia; MILA; S-Lab, Nanyang Technological University
Note: ICCV 2021 (Poster)
Link: https://arxiv.org/abs/2107.12628
Abstract: Confidence calibration is of great importance to the reliability of decisions made by machine learning systems. However, discriminative classifiers based on deep neural networks are often criticized for producing overconfident predictions that fail to reflect the true correctness likelihood of classification accuracy. We argue that such an inability to model uncertainty is mainly caused by the closed-world nature of softmax: a model trained by the cross-entropy loss will be forced to classify input into one of K pre-defined categories with high probability. To address this problem, we for the first time propose a novel K+1-way softmax formulation, which incorporates the modeling of open-world uncertainty as the extra dimension. To unify the learning of the original K-way classification task and the extra dimension that models uncertainty, we propose a novel energy-based objective function, and moreover, theoretically prove that optimizing such an objective essentially forces the extra dimension to capture the marginal data distribution. Extensive experiments show that our approach, Energy-based Open-World Softmax (EOW-Softmax), is superior to existing state-of-the-art methods in improving confidence calibration.
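
A K+1-way head is easy to picture; the sketch below shows only the layout the abstract describes (K class logits plus one open-world dimension), not the paper's energy-based training objective:

```python
import torch
import torch.nn as nn

class OpenWorldHead(nn.Module):
    """K+1-way classifier head: K task classes plus one extra dimension that
    absorbs open-world uncertainty."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes + 1)

    def forward(self, feats):
        logits = self.fc(feats)                       # (batch, K+1)
        p = torch.softmax(logits, dim=-1)
        return p[:, :-1], p[:, -1]                    # class probs, uncertainty mass
```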

【2】 Unsupervised Deep Anomaly Detection for Multi-Sensor Time-Series Signals

Authors: Yuxin Zhang, Yiqiang Chen, Jindong Wang, Zhiwen Pan
Affiliations: Pan is with the Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences
Note: Accepted to IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE) as a regular paper; 14 pages
Link: https://arxiv.org/abs/2107.12626
Abstract: Nowadays, multi-sensor technologies are applied in many fields, e.g., Health Care (HC), Human Activity Recognition (HAR), and Industrial Control Systems (ICS). These sensors can generate a substantial amount of multivariate time-series data. Unsupervised anomaly detection on multi-sensor time-series data has been proven critical in machine learning research. The key challenge is to discover generalized normal patterns by capturing spatial-temporal correlation in multi-sensor data. Beyond this challenge, the noisy data is often intertwined with the training data, which is likely to mislead the model by making it hard to distinguish between the normal, abnormal, and noisy data. Few previous studies can jointly address these two challenges. In this paper, we propose a novel deep learning-based anomaly detection algorithm called Deep Convolutional Autoencoding Memory network (CAE-M). We first build a Deep Convolutional Autoencoder to characterize spatial dependence of multi-sensor data with a Maximum Mean Discrepancy (MMD) to better distinguish between the noisy, normal, and abnormal data. Then, we construct a Memory Network consisting of linear (Autoregressive Model) and non-linear predictions (Bidirectional LSTM with Attention) to capture temporal dependence from time-series data. Finally, CAE-M jointly optimizes these two subnetworks. We empirically compare the proposed approach with several state-of-the-art anomaly detection methods on HAR and HC datasets. Experimental results demonstrate that our proposed model outperforms these existing methods.
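
The MMD term used to shape the latent space has a standard form; here is a minimal Gaussian-kernel estimator (biased, diagonal terms included, single bandwidth — all simplifications relative to whatever variant the paper uses):

```python
import torch

def gaussian_mmd(x, y, sigma=1.0):
    """Squared Maximum Mean Discrepancy between batches x: (n, d) and y: (m, d)."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()
```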

【3】 Combining Probabilistic Logic and Deep Learning for Self-Supervised Learning

Authors: Hoifung Poon, Hai Wang, Hunter Lang
Note: Book chapter. arXiv admin note: substantial text overlap with arXiv:2012.12474, arXiv:1808.08485, arXiv:2008.12878
Link: https://arxiv.org/abs/2107.12591
Abstract: Deep learning has proven effective for various application tasks, but its applicability is limited by the reliance on annotated examples. Self-supervised learning has emerged as a promising direction to alleviate the supervision bottleneck, but existing work focuses on leveraging co-occurrences in unlabeled data for task-agnostic representation learning, as exemplified by masked language model pretraining. In this chapter, we explore task-specific self-supervision, which leverages domain knowledge to automatically annotate noisy training examples for end applications, either by introducing labeling functions for annotating individual instances, or by imposing constraints over interdependent label decisions. We first present deep probabilistic logic (DPL), which offers a unifying framework for task-specific self-supervision by composing probabilistic logic with deep learning. DPL represents unknown labels as latent variables and incorporates diverse self-supervision using probabilistic logic to train a deep neural network end-to-end using variational EM. Next, we present self-supervised self-supervision (S4), which adds to DPL the capability to learn new self-supervision automatically. Starting from an initial seed self-supervision, S4 iteratively uses the deep neural network to propose new self-supervision. These are either added directly (a form of structured self-training) or verified by a human expert (as in feature-based active learning). Experiments on real-world applications such as biomedical machine reading and various text classification tasks show that task-specific self-supervision can effectively leverage domain expertise and often match the accuracy of supervised methods with a tiny fraction of human effort.

【4】 CFLOW-AD: Real-Time Unsupervised Anomaly Detection with Localization via Conditional Normalizing Flows

Authors: Denis Gudovskiy, Shun Ishizaka, Kazuki Kozuka
Affiliations: Panasonic AI Lab, USA; Panasonic Technology Division, Japan
Note: Accepted to WACV 2022. Preprint
Link: https://arxiv.org/abs/2107.12571
Abstract: Unsupervised anomaly detection with localization has many practical applications when labeling is infeasible and, moreover, when anomaly examples are completely missing in the train data. While recently proposed models for such data setup achieve high accuracy metrics, their complexity is a limiting factor for real-time processing. In this paper, we propose a real-time model and analytically derive its relationship to prior methods. Our CFLOW-AD model is based on a conditional normalizing flow framework adopted for anomaly detection with localization. In particular, CFLOW-AD consists of a discriminatively pretrained encoder followed by multi-scale generative decoders, where the latter explicitly estimate the likelihood of the encoded features. Our approach results in a computationally and memory-efficient model: CFLOW-AD is faster and smaller by a factor of 10x than the prior state-of-the-art with the same input setting. Our experiments on the MVTec dataset show that CFLOW-AD outperforms previous methods by 0.36% AUROC in the detection task, and by 1.12% AUROC and 2.5% AUPRO in the localization task. We open-source our code with fully reproducible experiments.

Transfer | Zero/Few/One-Shot | Adaptation (5 papers)

【1】 A Physiologically-adapted Gold Standard for Arousal During a Stress Induced Scenario

Authors: Alice Baird, Lukas Stappen, Lukas Christ, Lea Schumann, Eva-Maria Meßner, Björn W. Schuller
Affiliations: Chair EIHW, University of Augsburg, Augsburg, Germany; KPP, University of Ulm, Ulm, Germany; GLAM, Imperial College London, London, United Kingdom
Link: https://arxiv.org/abs/2107.12964
Abstract: Emotion is an inherently subjective psychophysiological human-state, and to produce an agreed-upon representation (gold standard) for continuous emotion requires a time-consuming and costly training procedure of multiple human annotators. There is strong evidence in the literature that physiological signals are sufficient objective markers for states of emotion, particularly arousal. In this contribution, we utilise a dataset which includes continuous emotion and physiological signals - Heartbeats per Minute (BPM), Electrodermal Activity (EDA), and Respiration-rate - captured during a stress-induced scenario (Trier Social Stress Test). We utilise a Long Short-Term Memory, Recurrent Neural Network to explore the benefit of fusing these physiological signals with arousal as the target, learning from various audio, video, and textual based features. We utilise the state-of-the-art MuSe-Toolbox to consider both annotation delay and inter-rater agreement weighting when fusing the target signals. An improvement in Concordance Correlation Coefficient (CCC) is seen across feature sets when fusing EDA with arousal, compared to the arousal-only gold standard results. Additionally, BERT-based textual features' results improved for arousal plus all physiological signals, obtaining up to 0.3344 CCC compared to 0.2118 CCC for arousal only. Multimodal fusion also improves overall CCC, with audio plus video features obtaining up to 0.6157 CCC to recognize arousal plus EDA and BPM.

【2】 Finding Failures in High-Fidelity Simulation using Adaptive Stress Testing and the Backward Algorithm

Authors: Mark Koren, Ahmed Nassar, Mykel J. Kochenderfer
Affiliations: Stanford University
Note: Accepted to IROS 2021
Link: https://arxiv.org/abs/2107.12940
Abstract: Validating the safety of autonomous systems generally requires the use of high-fidelity simulators that adequately capture the variability of real-world scenarios. However, it is generally not feasible to exhaustively search the space of simulation scenarios for failures. Adaptive stress testing (AST) is a method that uses reinforcement learning to find the most likely failure of a system. AST with a deep reinforcement learning solver has been shown to be effective in finding failures across a range of different systems. This approach generally involves running many simulations, which can be very expensive when using a high-fidelity simulator. To improve efficiency, we present a method that first finds failures in a low-fidelity simulator. It then uses the backward algorithm, which trains a deep neural network policy using a single expert demonstration, to adapt the low-fidelity failures to high-fidelity. We have created a series of autonomous vehicle validation case studies that represent some of the ways low-fidelity and high-fidelity simulators can differ, such as time discretization. We demonstrate in a variety of case studies that this new AST approach is able to find failures with significantly fewer high-fidelity simulation steps than are needed when just running AST directly in high-fidelity. As a proof of concept, we also demonstrate AST on NVIDIA's DriveSim simulator, an industry state-of-the-art high-fidelity simulator for finding failures in autonomous vehicles.

【3】 Transfer Learning in Electronic Health Records through Clinical Concept Embedding

Authors: Jose Roberto Ayala Solares, Yajie Zhu, Abdelaali Hassaine, Shishir Rao, Yikuan Li, Mohammad Mamouei, Dexter Canoy, Kazem Rahimi, Gholamreza Salimi-Khorshidi
Affiliations: University of Oxford
Link: https://arxiv.org/abs/2107.12919
Abstract: Deep learning models have shown tremendous potential in learning representations, which are able to capture some key properties of the data. This makes them great candidates for transfer learning: Exploiting commonalities between different learning tasks to transfer knowledge from one task to another. Electronic health records (EHR) research is one of the domains that has witnessed a growing number of deep learning techniques employed for learning clinically-meaningful representations of medical concepts (such as diseases and medications). Despite this growth, the approaches to benchmark and assess such learned representations (or embeddings) are under-investigated; this can be a big issue when such embeddings are shared to facilitate transfer learning. In this study, we aim to (1) train some of the most prominent disease embedding techniques on comprehensive EHR data from 3.1 million patients, (2) employ qualitative and quantitative evaluation techniques to assess these embeddings, and (3) provide pre-trained disease embeddings for transfer learning. This study can be the first comprehensive approach for clinical concept embedding evaluation and can be applied to any embedding techniques and for any EHR concept.

【4】 A Low-Cost Neural ODE with Depthwise Separable Convolution for Edge Domain Adaptation on FPGAs

Authors: Hiroki Kawakami, Hirohisa Watanabe, Keisuke Sugiura, Hiroki Matsutani
Affiliations: Keio University, Hiyoshi, Kohoku-ku, Yokohama, Japan
Link: https://arxiv.org/abs/2107.12824
Abstract: Although high-performance deep neural networks are in high demand in edge environments, computation resources are strictly limited in edge devices, and light-weight neural network techniques, such as Depthwise Separable Convolution (DSC), have been developed. ResNet is one of the conventional deep neural network models that stack a lot of layers and parameters for a higher accuracy. To reduce the parameter size of ResNet, by utilizing a similarity to ODEs (Ordinary Differential Equations), Neural ODE repeatedly reuses most of its weight parameters instead of having a lot of different parameters. Thus, Neural ODE becomes significantly small compared to ResNet, so that it can be implemented in resource-limited edge devices. In this paper, a combination of Neural ODE and DSC, called dsODENet, is designed and implemented for FPGAs (Field-Programmable Gate Arrays). dsODENet is then applied to edge domain adaptation as a practical use case and evaluated with image classification datasets. It is implemented on a Xilinx ZCU104 board and evaluated in terms of domain adaptation accuracy, training speed, FPGA resource utilization, and speedup rate compared to a software execution. The results demonstrate that dsODENet is comparable to or slightly better than our baseline Neural ODE implementation in terms of domain adaptation accuracy, while the total parameter size without pre- and post-processing layers is reduced by 54.2% to 79.8%. The FPGA implementation accelerates the prediction tasks by 27.9 times faster than a software implementation.
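
The core idea — reuse one depthwise-separable convolution as the ODE function over several integration steps instead of stacking distinct ResNet layers — can be sketched as follows; channel counts, step count, and fixed-step Euler integration are assumptions:

```python
import torch
import torch.nn as nn

class DSC(nn.Module):
    """Depthwise separable convolution: per-channel 3x3 followed by pointwise 1x1."""
    def __init__(self, ch):
        super().__init__()
        self.depthwise = nn.Conv2d(ch, ch, 3, padding=1, groups=ch)
        self.pointwise = nn.Conv2d(ch, ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class NeuralODEBlock(nn.Module):
    """One parameter set f reused over N Euler steps of dx/dt = f(x),
    replacing N distinct residual layers."""
    def __init__(self, ch, steps=4):
        super().__init__()
        self.f = nn.Sequential(DSC(ch), nn.BatchNorm2d(ch), nn.ReLU())
        self.steps = steps

    def forward(self, x):
        h = 1.0 / self.steps
        for _ in range(self.steps):         # fixed-step Euler integration
            x = x + h * self.f(x)           # the same weights are reused each step
        return x
```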

【5】 Parallel Surrogate-assisted Optimization Using Mesh Adaptive Direct Search

Authors: Bastien Talgorn, Stéphane Alarie, Michael Kokkolaras
Affiliations: McGill University, GERAD, Montréal, Québec, Canada; Hydro-Québec's Research Institute, GERAD, Montréal, Québec, Canada
Link: https://arxiv.org/abs/2107.12421
Abstract: We consider computationally expensive blackbox optimization problems and present a method that employs surrogate models and concurrent computing at the search step of the mesh adaptive direct search (MADS) algorithm. Specifically, we solve a surrogate optimization problem using locally weighted scatterplot smoothing (LOWESS) models to find promising candidate points to be evaluated by the blackboxes. We consider several methods for selecting promising points from a large number of points. We conduct numerical experiments to assess the performance of the modified MADS algorithm with respect to available CPU resources by means of five engineering design problems.
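
To illustrate the surrogate-screening idea — predict many cheap candidates, send only the most promising to the blackbox — here is a hand-rolled locally weighted linear regression in the LOWESS spirit; the kernel, neighborhood fraction, and data shapes are assumptions, not the paper's model:

```python
import numpy as np

def lowess_predict(X, y, x0, frac=0.5):
    """Locally weighted linear regression at query x0, using the nearest
    `frac` of previously evaluated points with tricube weights."""
    d = np.linalg.norm(X - x0, axis=1)
    k = max(2, int(frac * len(X)))
    idx = np.argsort(d)[:k]
    w = (1 - (d[idx] / (d[idx].max() + 1e-12)) ** 3) ** 3
    A = np.hstack([np.ones((k, 1)), X[idx]])            # affine local model
    sw = np.sqrt(w)[:, None]                            # sqrt weights for WLS via lstsq
    beta = np.linalg.lstsq(sw * A, sw[:, 0] * y[idx], rcond=None)[0]
    return np.concatenate([[1.0], x0]) @ beta

# Screen many cheap candidates; evaluate only the most promising with the blackbox.
X = np.random.rand(30, 2); y = (X ** 2).sum(axis=1)     # past blackbox evaluations
candidates = np.random.rand(200, 2)
best = candidates[np.argmin([lowess_predict(X, y, c) for c in candidates])]
```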

Reinforcement Learning (4 papers)

【1】 Reinforcement Learning with Formal Performance Metrics for Quadcopter Attitude Control under Non-nominal Contexts

Authors: Nicola Bernini, Mikhail Bessa, Rémi Delmas, Arthur Gold, Eric Goubault, Romain Pennec, Sylvie Putot, François Sillion
Affiliations: Uber ATCP, Paris, France; LIX, École Polytechnique, CNRS, IP Paris, Palaiseau, France
Link: https://arxiv.org/abs/2107.12942
Abstract: We explore the reinforcement learning approach to designing controllers by extensively discussing the case of a quadcopter attitude controller. We provide all the details needed to reproduce our approach, starting with a model of the dynamics of a Crazyflie 2.0 under various nominal and non-nominal conditions, including partial motor failures and wind gusts. We develop a robust form of a signal temporal logic to quantitatively evaluate the vehicle's behavior and measure the performance of controllers. The paper thoroughly describes the choices in training algorithms, neural net architecture, hyperparameters, and observation space in view of the different performance metrics we have introduced. We discuss the robustness of the obtained controllers, both to partial loss of power for one rotor and to wind gusts, and finish by drawing conclusions on practical controller design by reinforcement learning.

【2】 Persistent Reinforcement Learning via Subgoal Curricula

Authors: Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
Affiliations: Stanford University; Google Brain; UC Berkeley
Link: https://arxiv.org/abs/2107.12931
Abstract: Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents. However, the success of current reinforcement learning algorithms is predicated on an often under-emphasised requirement -- each trial needs to start from a fixed initial state distribution. Unfortunately, resetting the environment to its initial state after each trial requires a substantial amount of human supervision and extensive instrumentation of the environment, which defeats the purpose of autonomous reinforcement learning. In this work, we propose Value-accelerated Persistent Reinforcement Learning (VaPRL), which generates a curriculum of initial states such that the agent can bootstrap on the success of easier tasks to efficiently learn harder tasks. The agent also learns to reach the initial states proposed by the curriculum, minimizing the reliance on human interventions into the learning. We observe that VaPRL reduces the interventions required by three orders of magnitude compared to episodic RL while outperforming prior state-of-the-art methods for reset-free RL, both in terms of sample efficiency and asymptotic performance, on a variety of simulated robotics problems.

【3】 Deep Reinforcement Learning for L3 Slice Localization in Sarcopenia Assessment

Authors: Othmane Laousy, Guillaume Chassagnon, Edouard Oyallon, Nikos Paragios, Marie-Pierre Revel, Maria Vakalopoulou
Link: https://arxiv.org/abs/2107.12800
Abstract: Sarcopenia is a medical condition characterized by a reduction in muscle mass and function. A quantitative diagnosis technique consists of localizing the CT slice passing through the middle of the third lumbar area (L3) and segmenting muscles at this level. In this paper, we propose a deep reinforcement learning method for accurate localization of the L3 CT slice. Our method trains a reinforcement learning agent by incentivizing it to discover the right position. Specifically, a Deep Q-Network is trained to find the best policy to follow for this problem. Visualizing the training process shows that the agent mimics the scrolling of an experienced radiologist. Extensive experiments against other state-of-the-art deep learning based methods for L3 localization prove the superiority of our technique, which performs well even with a limited amount of data and annotations.

【4】 Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent

Authors: Gangshan Jing, He Bai, Jemin George, Aranya Chakrabortty, Piyush K. Sharma
Affiliations: Chakrabortty is with North Carolina State University; Bai is with Oklahoma State University
Link: https://arxiv.org/abs/2107.12416
Abstract: Recently introduced distributed zeroth-order optimization (ZOO) algorithms have shown their utility in distributed reinforcement learning (RL). Unfortunately, in the gradient estimation process, almost all of them require random samples with the same dimension as the global variable and/or require evaluation of the global cost function, which may induce high estimation variance for large-scale networks. In this paper, we propose a novel distributed zeroth-order algorithm by leveraging the network structure inherent in the optimization objective, which allows each agent to estimate its local gradient by local cost evaluation independently, without use of any consensus protocol. The proposed algorithm exhibits an asynchronous update scheme, and is designed for stochastic non-convex optimization with a possibly non-convex feasible domain based on the block coordinate descent method. The algorithm is later employed as a distributed model-free RL algorithm for distributed linear quadratic regulator design, where a learning graph is designed to describe the required interaction relationship among agents in distributed learning. We provide an empirical validation of the proposed algorithm to benchmark its performance on convergence rate and variance against a centralized ZOO algorithm.
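
The primitive underneath is a gradient estimate built from cost evaluations alone; here is a minimal two-point zeroth-order estimator (the paper's contribution — local, per-agent estimation exploiting the network structure — is not reproduced here):

```python
import numpy as np

def zo_gradient(cost, x, mu=1e-2, n_samples=20, rng=np.random.default_rng(0)):
    """Two-point zeroth-order gradient estimate from function values only:
    average of (cost(x + mu*u) - cost(x)) / mu * u over random directions u."""
    fx = cost(x)
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        g += (cost(x + mu * u) - fx) / mu * u
    return g / n_samples

# Gradient-free descent on a toy quadratic.
cost = lambda v: float(v @ v)
x = np.ones(5)
for _ in range(100):
    x -= 0.05 * zo_gradient(cost, x)
```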

Medical (3 papers)

【1】 Physics-constrained Deep Learning for Robust Inverse ECG Modeling

Authors: Jianxin Xie, Bing Yao
Affiliations: School of Industrial Engineering and Management, Oklahoma State University
Link: https://arxiv.org/abs/2107.12780
Abstract: The rapid developments in advanced sensing and imaging bring about a data-rich environment, facilitating the effective modeling, monitoring, and control of complex systems. For example, the body-sensor network captures multi-channel information pertinent to the electrical activity of the heart (i.e., electrocardiograms (ECG)), which enables medical scientists to monitor and detect abnormal cardiac conditions. However, the high-dimensional sensing data are generally complexly structured, and realizing the full data potential depends to a great extent on advanced analytical and predictive methods. This paper presents a physics-constrained deep learning (P-DL) framework for high-dimensional inverse ECG modeling. This method integrates the physical laws of the complex system with the advanced deep learning infrastructure for effective prediction of the system dynamics. The proposed P-DL approach is implemented to solve the inverse ECG model and predict the time-varying distribution of electric potentials in the heart from the ECG data measured by the body-surface sensor network. Experimental results show that the proposed P-DL method significantly outperforms existing methods that are commonly used in current practice.
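
The generic shape of a physics-constrained loss — a data-fit term plus the squared residual of the governing equation — can be sketched as below; the actual residual in the paper encodes cardiac electrophysiology, which the placeholder callable here does not:

```python
import torch

def physics_constrained_loss(model, x, y_obs, physics_residual, lam=1.0):
    """Data-fit term plus squared residual of the governing equation.
    `physics_residual(y_pred, x)` is a placeholder; in the paper it would
    encode the cardiac electrophysiology model."""
    y_pred = model(x)
    data_term = torch.mean((y_pred - y_obs) ** 2)
    physics_term = torch.mean(physics_residual(y_pred, x) ** 2)
    return data_term + lam * physics_term
```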

【2】 Improved-Mask R-CNN: Towards an Accurate Generic MSK MRI instance segmentation platform (Data from the Osteoarthritis Initiative)

Authors: Banafshe Felfeliyan, Abhilash Hareendranathan, Gregor Kuntze, Jacob L. Jaremko, Janet L. Ronsky
Affiliations: Schulich School of Engineering, University of Calgary; McCaig Institute for Bone and Joint Health, University of Calgary, Calgary; Department of Radiology & Diagnostic Imaging, University of Alberta
Link: https://arxiv.org/abs/2107.12889
Abstract: Objective assessment of Magnetic Resonance Imaging (MRI) scans of osteoarthritis (OA) can address the limitations of the current OA assessment. Segmentation of bone, cartilage, and joint fluid is necessary for the OA objective assessment. Most of the proposed segmentation methods do not perform instance segmentation and suffer from class imbalance problems. This study deployed Mask R-CNN instance segmentation and improved it (improved-Mask R-CNN (iMaskRCNN)) to obtain a more accurate generalized segmentation for OA-associated tissues. Training and validation of the method were performed using 500 MRI knees from the Osteoarthritis Initiative (OAI) dataset and 97 MRI scans of patients with symptomatic hip OA. Three modifications to Mask R-CNN yielded the iMaskRCNN: adding a 2nd ROIAligned block, adding an extra decoder layer to the mask-header, and connecting them by a skip connection. The results were assessed using Hausdorff distance, Dice score, and coefficients of variation (CoV). The iMaskRCNN led to improved bone and cartilage segmentation compared to Mask R-CNN, as indicated by the increase in Dice score from 95% to 98% for the femur, 95% to 97% for the tibia, 71% to 80% for femoral cartilage, and 81% to 82% for tibial cartilage. For effusion detection, Dice improved with iMaskRCNN (72%) versus Mask R-CNN (71%). The CoV values for effusion detection between Reader1 and Mask R-CNN (0.33), Reader1 and iMaskRCNN (0.34), Reader2 and Mask R-CNN (0.22), and Reader2 and iMaskRCNN (0.29) are close to the CoV between the two readers (0.21), indicating high agreement between the human readers and both Mask R-CNN and iMaskRCNN. Mask R-CNN and iMaskRCNN can reliably and simultaneously extract different-scale articular tissues involved in OA, forming the foundation for automated assessment of OA. The iMaskRCNN results show that the modifications improved the network performance around the edges.

【3】 Linear Prediction Residual for Efficient Diagnosis of Parkinson's Disease from Gait

Authors: Shanmukh Alle, U. Deva Priyakumar
Affiliations: Center for Computational Natural Sciences and Bioinformatics, IIIT Hyderabad
Note: 11 pages, 4 figures, 2 tables, 4 equations, 23 citations, to be published in Medical Image Computing and Computer Assisted Intervention (MICCAI) 2021
Link: https://arxiv.org/abs/2107.12878
Abstract: Parkinson's Disease (PD) is a chronic and progressive neurological disorder that results in rigidity, tremors and postural instability. There is no definite medical test to diagnose PD and diagnosis is mostly a clinical exercise. Although guidelines exist, about 10-30% of patients are wrongly diagnosed with PD. Hence, there is a need for an accurate, unbiased and fast method for diagnosis. In this study, we propose LPGNet, a fast and accurate method to diagnose PD from gait. LPGNet uses Linear Prediction Residuals (LPR) to extract discriminating patterns from gait recordings and then uses a 1D convolutional neural network with depth-wise separable convolutions to perform diagnosis. LPGNet achieves an AUC of 0.91 with a 21 times speedup and about 99% fewer parameters in the model compared to the state of the art. We also undertake an analysis of various cross-validation strategies used in the literature on PD diagnosis from gait and find that most methods are affected by some form of data leakage between various folds, which leads to unnecessarily large models and inflated performance due to overfitting. The analysis clears the path for future work in correctly evaluating their methods.
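
Computing a linear prediction residual takes a few lines with librosa and SciPy; the LPC order below is an arbitrary assumption, and the paper's 1D depthwise-separable CNN classifier that consumes the residual is not shown:

```python
import numpy as np
import librosa
from scipy.signal import lfilter

def lp_residual(signal, order=8):
    """Fit LPC coefficients to the gait signal and keep what the linear
    predictor cannot explain (the order is an arbitrary choice here)."""
    a = librosa.lpc(np.asarray(signal, dtype=float), order=order)  # [1, a1, ..., a_p]
    return lfilter(a, [1.0], signal)                               # inverse filter
```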

推荐(2篇)

【1】 Deep Variational Models for Collaborative Filtering-based Recommender Systems 标题:基于协同过滤的推荐系统的深度变分模型

作者:Jesús Bobadilla,Fernando Ortega,Abraham Gutiérrez,Ángel González-Prieto 机构:Departamento de Sistemas Informáticos, ETSI Sistemas Informáticos, Universidad Politécnica de Madrid, Madrid, Spain., KNODIS Research Group, ETSI Sistemas Informáticos, Departamento de Álgebra, Geometría y Topología, Facultad de Ciencias Matemáticas 备注:14 pages, 8 figures, 3 tables 链接:https://arxiv.org/abs/2107.12677 摘要:深度学习提供了精确的协同过滤模型来提高推荐系统的效果。深度矩阵分解及其相关的协同神经网络是该领域当前最先进的方法;然而,这两类模型都缺乏必要的随机性,无法形成变分自动编码器所表现出的健壮、连续且结构化的潜在空间。另一方面,在协同过滤领域,由于推荐系统的高度稀疏性,通过变分自动编码器进行的数据扩充不能提供准确的结果。我们提出的模型应用变分概念在深层结构的潜在空间中注入随机性,将变分技术引入到神经协同过滤领域。该方法不依赖于用于生成潜在表示的特定模型,因此可以作为插件应用于任何当前和未来的特定模型。所提模型已经用四个有代表性的开放数据集、三种不同的质量度量和最新的基线进行了测试。结果表明,在变分富集超过注入噪声影响的场景下,该方法具有优越性。此外,还提供了一个框架,以保证所开展实验的可复现性。 摘要:Deep learning provides accurate collaborative filtering models to improve recommender system results. Deep matrix factorization and their related collaborative neural networks are the state-of-art in the field; nevertheless, both models lack the necessary stochasticity to create the robust, continuous, and structured latent spaces that variational autoencoders exhibit. On the other hand, data augmentation through variational autoencoder does not provide accurate results in the collaborative filtering field due to the high sparsity of recommender systems. Our proposed models apply the variational concept to inject stochasticity in the latent space of the deep architecture, introducing the variational technique in the neural collaborative filtering field. This method does not depend on the particular model used to generate the latent representation. In this way, this approach can be applied as a plugin to any current and future specific models. The proposed models have been tested using four representative open datasets, three different quality measures, and state-of-art baselines. The results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect. Additionally, a framework is provided to enable the reproducibility of the conducted experiments.
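
摘要的关键思想是在深层结构的潜在空间注入变分随机性。下面是重参数化技巧的最小PyTorch示意(非论文官方实现,类名与维度均为本文假设):

```python
import torch
import torch.nn as nn

class VariationalLatent(nn.Module):
    """在协同过滤模型的潜在表示上注入变分随机性(重参数化技巧)。"""
    def __init__(self, dim_in: int, dim_latent: int):
        super().__init__()
        self.mu = nn.Linear(dim_in, dim_latent)
        self.logvar = nn.Linear(dim_in, dim_latent)

    def forward(self, h):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)        # 采样潜在向量
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL正则项
        return z, kl
```

由于不依赖生成潜在表示的具体模型,这类模块可以作为插件接在任意深度协同过滤骨干之后。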

【2】 Combining Reward and Rank Signals for Slate Recommendation 标题:组合奖励和排名信号进行Slate推荐

作者:Imad Aouali,Sergey Ivanov,Mike Gartrell,David Rohde,Flavian Vasile,Victor Zaytsev,Diego Legrand 机构:Criteo AI Lab, Paris, France 备注:None 链接:https://arxiv.org/abs/2107.12455 摘要:我们考虑slate推荐问题,即推荐系统一次向用户呈现由K个推荐项目组成的集合(slate)。如果用户觉得推荐项目有吸引力,就可能点击,推荐系统会收到一些反馈。推荐系统可获得两条信息:slate是否被点击(奖励),以及若被点击,点击的是哪个项目(排名)。在本文中,我们为非个性化slate推荐建立了若干贝叶斯模型,分别纳入奖励信号(奖励模型)、排名信号(排名模型)或两者(完整模型)。在实验中,我们分析了完整模型带来的性能增益,并表明随着目录中产品数量的增长或slate规模的增大,该模型的误差显著降低。 摘要:We consider the problem of slate recommendation, where the recommender system presents a user with a collection or slate composed of K recommended items at once. If the user finds the recommended items appealing then the user may click and the recommender system receives some feedback. Two pieces of information are available to the recommender system: was the slate clicked? (the reward), and if the slate was clicked, which item was clicked? (rank). In this paper, we formulate several Bayesian models that incorporate the reward signal (Reward model), the rank signal (Rank model), or both (Full model), for non-personalized slate recommendation. In our experiments, we analyze performance gains of the Full model and show that it achieves significantly lower error as the number of products in the catalog grows or as the slate size increases.

聚类(1篇)

【1】 A Simplified Framework for Air Route Clustering Based on ADS-B Data 标题:基于ADS-B数据的航线聚类简化框架

作者:Quan Duong,Tan Tran,Duc-Thinh Pham,An Mai 机构:ICT Department, John von Neumann Institute, Ho Chi Minh, Vietnam, Air Traffic Management Research Institute, School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore, Singapore 备注:None 链接:https://arxiv.org/abs/2107.12869 摘要:随着时间的推移,飞行交通量不断增加,这使得战略交通流管理成为一个具有挑战性的问题,因为对整个交通数据建模需要大量的计算资源。另一方面,广播式自动相关监视(ADS-B)技术被认为是一种很有前景的数据技术,能够安全高效地向飞行机组和地面管制人员提供特定区域内飞机位置和速度的必要信息。为了解决上述问题,本文提出了一种基于ADS-B数据检测机场间典型航线的简化框架。具体地说,基于相似性度量将航班流量划分为主要分组,这有助于减少机场间的航线数量。实际上,我们的框架可用于切实降低空中交通流优化的计算成本并评估运行性能。最后,为了说明所提框架的潜在应用,我们使用三对不同机场的ADS-B交通飞行数据进行了实验。通过结合两个衡量聚类性能的指标,并在目视检查中融入人类判断,检测出的机场间典型航线显示出有前景的结果。 摘要:The volume of flight traffic gets increasing over the time, which makes the strategic traffic flow management become one of the challenging problems since it requires a lot of computational resources to model entire traffic data. On the other hand, Automatic Dependent Surveillance - Broadcast (ADS-B) technology has been considered as a promising data technology to provide both flight crews and ground control staff the necessary information safely and efficiently about the position and velocity of the airplanes in a specific area. In the attempt to tackle this problem, we presented in this paper a simplified framework that can support to detect the typical air routes between airports based on ADS-B data. Specifically, the flight traffic will be classified into major groups based on similarity measures, which helps to reduce the number of flight paths between airports. As a matter of fact, our framework can be taken into account to reduce practically the computational cost for air flow optimization and evaluate the operational performance. Finally, in order to illustrate the potential applications of our proposed framework, an experiment was performed using ADS-B traffic flight data of three different pairs of airports. The detected typical routes between each couple of airports show promising results by virtue of combining two indices for measuring the clustering performance and incorporating human judgment into the visual inspection.
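
摘要描述的是"基于相似性度量对航迹聚类"。下面给出一个最小示意(Python,非论文官方实现):先把每条(纬度,经度)航迹重采样到固定点数,再用层次聚类得到典型航线分组。

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def resample(traj: np.ndarray, n: int = 50) -> np.ndarray:
    """把(纬度,经度)航迹线性重采样到固定点数,便于逐点比较。"""
    t = np.linspace(0, 1, len(traj))
    tq = np.linspace(0, 1, n)
    return np.column_stack([np.interp(tq, t, traj[:, d]) for d in range(traj.shape[1])])

def cluster_routes(trajs, k: int = 3):
    """trajs为航迹列表,每条是(点数, 2)数组;返回每条航迹的簇标签。"""
    X = np.stack([resample(np.asarray(tr)).ravel() for tr in trajs])
    return AgglomerativeClustering(n_clusters=k).fit_predict(X)
```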

推理|分析|理解|解释(3篇)

【1】 Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization 标题:指针值检索:理解神经网络泛化极限的新基准

作者:Chiyuan Zhang,Maithra Raghu,Jon Kleinberg,Samy Bengio 机构:Google Research, Brain Team, Cornell University 链接:https://arxiv.org/abs/2107.12580 摘要:深度学习的成功在很大程度上依赖于神经网络对看不见的数据输出有意义的预测的能力——泛化。然而,尽管它的重要性,仍然有基本的开放性问题,如何神经网络推广。神经网络在多大程度上依赖于记忆——看到高度相似的训练实例——它们在多大程度上能够进行人类智能化的推理——识别数据背后的抽象规则?在本文中,我们介绍了一个新的基准,指针值检索(PVR)任务,探索神经网络泛化的局限性。虽然PVR任务可以由视觉输入和符号输入组成,每种输入都有不同的难度,但它们都有一个简单的基本规则。PVR任务输入的一部分充当指针,给出输入的另一部分的位置,该部分构成值(和输出)。我们证明了这种任务结构为理解泛化提供了一个丰富的测试平台,我们的实证研究表明,基于数据集大小、任务复杂性和模型结构的神经网络性能有很大的变化。位置、值和指针规则的交互作用还允许通过引入分布偏移和增加函数复杂性来开发细致入微的泛化测试。这些既揭示了微妙的失败,也揭示了令人惊讶的成功,表明了在这个基准上许多有希望的探索方向。 摘要:The successes of deep learning critically rely on the ability of neural networks to output meaningful predictions on unseen data -- generalization. Yet despite its criticality, there remain fundamental open questions on how neural networks generalize. How much do neural networks rely on memorization -- seeing highly similar training examples -- and how much are they capable of human-intelligence styled reasoning -- identifying abstract rules underlying the data? In this paper we introduce a novel benchmark, Pointer Value Retrieval (PVR) tasks, that explore the limits of neural network generalization. While PVR tasks can consist of visual as well as symbolic inputs, each with varying levels of difficulty, they all have a simple underlying rule. One part of the PVR task input acts as a pointer, giving the location of a different part of the input, which forms the value (and output). We demonstrate that this task structure provides a rich testbed for understanding generalization, with our empirical study showing large variations in neural network performance based on dataset size, task complexity and model architecture. The interaction of position, values and the pointer rule also allow the development of nuanced tests of generalization, by introducing distribution shift and increasing functional complexity. These reveal both subtle failures and surprising successes, suggesting many promising directions of exploration on this benchmark.
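
PVR任务的构造本身很容易用代码说明。下面是符号版PVR样本生成器的最小示意(Python;具体任务构造以论文为准,此处细节为本文假设):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pvr_batch(batch: int = 4, length: int = 10):
    """符号版PVR样本:首位数字充当指针,标签是它所指位置上的值。"""
    x = rng.integers(0, 10, size=(batch, length))
    pointer = x[:, 0] % (length - 1)          # 把指针限制在其余位置的范围内
    y = x[np.arange(batch), 1 + pointer]      # 指针指向的值作为标签
    return x, y

x, y = make_pvr_batch()
print(x, y)
```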

【2】 Feature Synergy, Redundancy, and Independence in Global Model Explanations using SHAP Vector Decomposition 标题:基于SHAP向量分解的全局模型解释中的特征协同、冗余与独立

作者:Jan Ittner,Lukasz Bolikowski,Konstantin Hemker,Ricardo Kennedy 备注:7 pages, 2 figures 链接:https://arxiv.org/abs/2107.12436 摘要:我们提出了一种新的形式化方法,用于解释监督模型中的成对特征依赖与交互作用。基于SHAP值和SHAP交互值,我们的方法将特征贡献分解为协同、冗余和独立三个分量(SHAP向量的S-R-I分解)。我们给出了这些分量的几何解释,并正式证明了其基本性质。最后,我们通过将它们应用到构造的数据集和模型中,展示了协同、冗余和独立的效用。 摘要:We offer a new formalism for global explanations of pairwise feature dependencies and interactions in supervised models. Building upon SHAP values and SHAP interaction values, our approach decomposes feature contributions into synergistic, redundant and independent components (S-R-I decomposition of SHAP vectors). We propose a geometric interpretation of the components and formally prove its basic properties. Finally, we demonstrate the utility of synergy, redundancy and independence by applying them to a constructed data set and model.
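
S-R-I分解建立在SHAP值与SHAP交互值之上。下面示意如何用shap库得到这两类基础量(分解本身请见原文;数据与模型仅为演示假设):

```python
import shap
import xgboost
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = xgboost.XGBRegressor(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
phi = explainer.shap_values(X)                   # SHAP向量,形状 (n, d)
phi_int = explainer.shap_interaction_values(X)   # 成对SHAP交互值,形状 (n, d, d)
```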

【3】 Wasserstein-Splitting Gaussian Process Regression for Heterogeneous Online Bayesian Inference 标题:面向异构在线贝叶斯推理的Wasserstein分裂高斯过程回归

作者:Michael E. Kepler,Alec Koppel,Amrit Singh Bedi,Daniel J. Stilwell 机构:Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University 链接:https://arxiv.org/abs/2107.12797 摘要:高斯过程(Gaussian processes,GPs)是一种著名的非参数贝叶斯推理技术,但在大样本情况下存在可扩展性问题,在非平稳或空间异构数据上性能会下降。在这项工作中,我们试图通过以下方式克服这些问题:(i)采用GP的变分自由能近似,并与在线期望传播步骤协同工作;(ii)引入局部分裂步骤,当后验分布发生显著变化(以后验分布上的Wasserstein度量来量化)时,实例化一个新的GP。随着时间的推移,这产生了一个可增量更新的稀疏GP集合,能够适应训练数据中的局部性、异构性和非平稳性。 摘要:Gaussian processes (GPs) are a well-known nonparametric Bayesian inference technique, but they suffer from scalability problems for large sample sizes, and their performance can degrade for non-stationary or spatially heterogeneous data. In this work, we seek to overcome these issues through (i) employing variational free energy approximations of GPs operating in tandem with online expectation propagation steps; and (ii) introducing a local splitting step which instantiates a new GP whenever the posterior distribution changes significantly as quantified by the Wasserstein metric over posterior distributions. Over time, then, this yields an ensemble of sparse GPs which may be updated incrementally, and adapts to locality, heterogeneity, and non-stationarity in training data.
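
分裂步骤由后验分布之间的Wasserstein度量触发。对一维高斯后验,2-Wasserstein距离有闭式解,下面给出最小示意(Python;阈值tau为本文假设):

```python
import numpy as np

def w2_gaussian(mu1, var1, mu2, var2):
    """两个一维高斯分布之间的2-Wasserstein距离(闭式解)。"""
    return np.sqrt((mu1 - mu2) ** 2 + (np.sqrt(var1) - np.sqrt(var2)) ** 2)

def should_split(post_old, post_new, tau=0.5):
    """当新旧后验(mu, var)的W2距离超过阈值时,触发局部分裂、实例化新的GP。"""
    return w2_gaussian(*post_old, *post_new) > tau

print(should_split((0.0, 1.0), (2.0, 1.5)))  # True:后验变化显著
```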

检测相关(3篇)

【1】 Clickbait Detection in YouTube Videos 标题:YouTube视频中的点击诱饵检测

作者:Ruchira Gothankar,Fabio Di Troia,Mark Stamp 链接:https://arxiv.org/abs/2107.12791 摘要:YouTube视频通常包含迷人的描述和有趣的缩略图,旨在增加浏览量,从而增加发布视频者的收入。这就鼓励人们发布点击诱饵(clickbait)视频,其内容可能与标题、描述或缩略图大相径庭。实际上,用户是被诱骗点击了这类点击诱饵视频。在这项研究中,我们研究了检测YouTube点击诱饵视频这一具有挑战性的问题,并使用多种最先进的机器学习技术和各种文本特征进行实验。 摘要:YouTube videos often include captivating descriptions and intriguing thumbnails designed to increase the number of views, and thereby increase the revenue for the person who posted the video. This creates an incentive for people to post clickbait videos, in which the content might deviate significantly from the title, description, or thumbnail. In effect, users are tricked into clicking on clickbait videos. In this research, we consider the challenging problem of detecting clickbait YouTube videos. We experiment with multiple state-of-the-art machine learning techniques using a variety of textual features.
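
基于文本特征的点击诱饵分类可以用一个很小的流水线来说明。下面是TF-IDF加逻辑回归的最小示意(Python;样本与标签为本文假设的演示数据,并非论文数据集):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

titles = [
    "You WON'T believe what happened next!!!",   # 1 = 点击诱饵
    "Lecture 3: dynamic programming",            # 0 = 正常标题
]
labels = [1, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(titles, labels)
print(clf.predict(["SHOCKING trick doctors hate"]))
```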

【2】 Optimizing Operating Points for High Performance Lesion Detection and Segmentation Using Lesion Size Reweighting 标题:基于病灶大小重加权的高性能病变检测与分割操作点优化

作者:Brennan Nichyporuk,Justin Szeto,Douglas L. Arnold,Tal Arbel 机构: Centre for Intelligent Machines, McGill University, Montreal, Canada, MILA (Quebec Artificial Intelligence Institute), McGill University, Montreal, Canada, Montreal Neurological Institute, McGill University, Montreal, Canada 备注:Accepted at MIDL 2021 链接:https://arxiv.org/abs/2107.12978 摘要:有许多临床情况需要准确检测和分割患者图像中的所有局灶性病理(如病变、肿瘤)。在小病灶和大病灶混合的情况下,标准的二元交叉熵损失会以漏掉小病灶为代价更好地分割大病灶;而为了准确检测所有病灶去调整操作点,通常又会导致大病灶的过度分割。在这项工作中,我们提出了一种新的重加权策略来消除这一性能差距,在保持分割精度的同时提高小病灶的检测性能。在多发性硬化症患者图像的大规模、多扫描仪、多中心数据集上的实验表明,我们的重加权策略大大优于竞争策略。 摘要:There are many clinical contexts which require accurate detection and segmentation of all focal pathologies (e.g. lesions, tumours) in patient images. In cases where there are a mix of small and large lesions, standard binary cross entropy loss will result in better segmentation of large lesions at the expense of missing small ones. Adjusting the operating point to accurately detect all lesions generally leads to oversegmentation of large lesions. In this work, we propose a novel reweighing strategy to eliminate this performance gap, increasing small pathology detection performance while maintaining segmentation accuracy. We show that our reweighing strategy vastly outperforms competing strategies based on experiments on a large scale, multi-scanner, multi-center dataset of Multiple Sclerosis patient images.
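
论文的具体重加权方案见原文;下面给出"按病灶尺寸反比加权的二元交叉熵"这一思想的简化PyTorch示意(本文假设的简化版:按整幅图前景占比缩放前景权重,而非逐病灶):

```python
import torch
import torch.nn.functional as F

def size_reweighted_bce(logits, target, eps: float = 1.0):
    """target为0/1浮点张量;前景越小,前景像素的损失权重越大。"""
    pos = target.sum()
    scale = target.numel() / (pos + eps)            # 前景占比的倒数
    w = torch.ones_like(target) + (scale - 1.0) * target
    return F.binary_cross_entropy_with_logits(logits, target, weight=w)
```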

【3】 Source-Agnostic Gravitational-Wave Detection with Recurrent Autoencoders 标题:基于递归自动编码器的源不可知引力波探测

作者:Eric A. Moreno,Jean-Roch Vlimant,Maria Spiropulu,Bartlomiej Borzyszkowski,Maurizio Pierini 机构:California Institute of Technology, Pasadena, California, USA; Gdansk University of Technology, Gdansk, Poland; European Organization for Nuclear Research, Geneva, Switzerland 备注:16 pages, 6 figures 链接:https://arxiv.org/abs/2107.12698 摘要:针对激光干涉仪中引力波信号的检测问题,提出了一种基于深度递归自动编码器的异常检测技术。这类算法在噪声数据上训练,可采用无监督策略(即不针对特定类型的信号源)检测信号。我们开发了一个定制的架构来分析来自两个干涉仪的数据,并将得到的性能与其他自动编码器结构和卷积分类器进行了比较。与更传统的有监督技术相比,所提策略的无监督性在准确性方面有所付出;另一方面,它把实验灵敏度推广到预先计算的信号模板集合之外,带来了质的提升。递归自动编码器的性能优于基于其他结构的自动编码器。本文提出的这类递归自动编码器可以补充引力波探测所采用的搜索策略,扩大正在进行的探测活动的覆盖范围。 摘要:We present an application of anomaly detection techniques based on deep recurrent autoencoders to the problem of detecting gravitational wave signals in laser interferometers. Trained on noise data, this class of algorithms could detect signals using an unsupervised strategy, i.e., without targeting a specific kind of source. We develop a custom architecture to analyze the data from two interferometers. We compare the obtained performance to that obtained with other autoencoder architectures and with a convolutional classifier. The unsupervised nature of the proposed strategy comes with a cost in terms of accuracy, when compared to more traditional supervised techniques. On the other hand, there is a qualitative gain in generalizing the experimental sensitivity beyond the ensemble of pre-computed signal templates. The recurrent autoencoder outperforms other autoencoders based on different architectures. The class of recurrent autoencoders presented in this paper could complement the search strategy employed for gravitational wave detection and extend the reach of the ongoing detection campaigns.
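
"在噪声上训练、以重构误差作为异常分数"的思路可以用很短的代码说明。下面是LSTM自动编码器的最小PyTorch示意(非论文官方架构,通道数与隐藏维度为本文假设):

```python
import torch
import torch.nn as nn

class RecurrentAE(nn.Module):
    """在纯噪声数据上训练;重构误差大的时间段视为候选信号。"""
    def __init__(self, n_ch: int = 2, hidden: int = 32):
        super().__init__()
        self.enc = nn.LSTM(n_ch, hidden, batch_first=True)
        self.dec = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_ch)

    def forward(self, x):              # x: (batch, time, 两个干涉仪通道)
        h, _ = self.enc(x)
        z, _ = self.dec(h)
        return self.out(z)

def anomaly_score(model, x):
    """每个样本的平均重构误差,作为异常(候选信号)分数。"""
    return ((model(x) - x) ** 2).mean(dim=(1, 2))
```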

分类|识别(2篇)

【1】 Sparse Bayesian Deep Learning for Dynamic System Identification 标题:稀疏贝叶斯深度学习在动态系统辨识中的应用

作者:Hongpeng Zhou,Chahine Ibrahim,Wei Xing Zheng,Wei Pan 备注:16 pages, 18 figures, 4 tables 链接:https://arxiv.org/abs/2107.12910 摘要:提出了一种用于系统辨识的深度神经网络(DNN)稀疏贝叶斯处理方法。尽管DNN在各个领域都表现出了令人印象深刻的逼近能力,但对于系统辨识问题仍然存在一些挑战。首先,DNN过于复杂,很容易过拟合训练数据;其次,系统辨识中输入回归量的选择并非易事;第三,需要对模型参数和预测进行不确定性量化。所提出的贝叶斯方法通过边缘似然/模型证据逼近和结构化群稀疏性诱导先验的构造,为缓解上述挑战提供了一条原则性的途径。该辨识算法被推导为一个迭代正则化的优化过程,可以像训练典型DNN一样高效地求解。在此基础上,给出了一种基于蒙特卡罗积分的实用计算方法来量化参数和预测的不确定性。在多个线性和非线性系统辨识基准上验证了所提贝叶斯方法的有效性,取得了良好且有竞争力的仿真精度。 摘要:This paper proposes a sparse Bayesian treatment of deep neural networks (DNNs) for system identification. Although DNNs show impressive approximation ability in various fields, several challenges still exist for system identification problems. First, DNNs are known to be too complex that they can easily overfit the training data. Second, the selection of the input regressors for system identification is nontrivial. Third, uncertainty quantification of the model parameters and predictions are necessary. The proposed Bayesian approach offers a principled way to alleviate the above challenges by marginal likelihood/model evidence approximation and structured group sparsity-inducing priors construction. The identification algorithm is derived as an iterative regularized optimization procedure that can be solved as efficiently as training typical DNNs. Furthermore, a practical calculation approach based on the Monte-Carlo integration method is derived to quantify the uncertainty of the parameters and predictions. The effectiveness of the proposed Bayesian approach is demonstrated on several linear and nonlinear systems identification benchmarks with achieving good and competitive simulation accuracy.
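
结构化群稀疏先验在优化中通常表现为对权重分组的范数惩罚。下面是其效果的最小PyTorch示意(本文假设的简化版,仅示意"整列权重同时压向零、从而筛选输入回归量"这一机制):

```python
import torch

def group_sparsity_penalty(weight: torch.Tensor) -> torch.Tensor:
    """weight形状为(输出维, 输入维):按输入维分组取L2范数再求和,
    使对应同一输入回归量的整列权重同时被压向零。"""
    return weight.norm(dim=0).sum()

W = torch.randn(16, 8, requires_grad=True)
loss = group_sparsity_penalty(W)   # 实际训练中叠加到拟合损失上
loss.backward()
```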

【2】 ENHANCE (ENriching Health data by ANnotations of Crowd and Experts): A case study for skin lesion classification 标题:ENHANCE(通过众包和专家的注释丰富健康数据):皮肤病变分类的案例研究

作者:Ralf Raumanns,Gerard Schouten,Max Joosten,Josien P. W. Pluim,Veronika Cheplygina 机构: Fontys University of Applied Science, Eindhoven, The Netherlands, Eindhoven University of Technology, Eindhoven, The Netherlands, IT University of Copenhagen, Denmark 链接:https://arxiv.org/abs/2107.12734 摘要:我们提出了一个开放的多注释数据集ENHANCE,以补充现有的ISIC和PH2皮肤病变分类数据集。此数据集包含来自非专家注释来源的视觉ABC(不对称、边界、颜色)特征注释:本科生、来自Amazon MTurk的众包工作者和经典图像处理算法。在本文中,我们首先分析了注释和病变诊断标签之间的相关性,以及不同注释来源之间的一致性。总体而言,我们发现非专家注释与诊断标签的相关性较弱,不同注释源之间的一致性较低。然后,我们研究了将注释作为附加标签的多任务学习(MTL),并表明非专家注释可以通过MTL改进(集成的)最先进卷积神经网络。我们希望我们的数据集可以用于多注释和/或MTL的进一步研究。Github上提供了所有数据和模型:https://github.com/raumannsr/ENHANCE. 摘要:We present ENHANCE, an open dataset with multiple annotations to complement the existing ISIC and PH2 skin lesion classification datasets. This dataset contains annotations of visual ABC (asymmetry, border, colour) features from non-expert annotation sources: undergraduate students, crowd workers from Amazon MTurk and classic image processing algorithms. In this paper we first analyse the correlations between the annotations and the diagnostic label of the lesion, as well as study the agreement between different annotation sources. Overall we find weak correlations of non-expert annotations with the diagnostic label, and low agreement between different annotation sources. We then study multi-task learning (MTL) with the annotations as additional labels, and show that non-expert annotations can improve (ensembles of) state-of-the-art convolutional neural networks via MTL. We hope that our dataset can be used in further research into multiple annotations and/or MTL. All data and models are available on Github: https://github.com/raumannsr/ENHANCE.

表征(1篇)

【1】 Geometric Deep Learning on Molecular Representations 标题:分子表示的几何深度学习

作者:Kenneth Atz,Francesca Grisoni,Gisbert Schneider 机构:ETH Zurich, Dept. Chemistry and Applied Biosciences, RETHINK, Vladimir-Prelog-Weg , Zurich, Switzerland., Eindhoven University of Technology, Dept. Biomedical Engineering, Groene Loper ,AZ Eindhoven, Netherlands. 链接:https://arxiv.org/abs/2107.12375 摘要:几何深度学习(Geometric deep learning,GDL)是近年来人工智能领域出现的一种新的研究范式,它是基于融合和处理对称信息的神经网络结构。GDL在分子建模应用中有着特殊的前景,其中存在着具有不同对称性质和抽象层次的各种分子表示。本文综述了分子GDL在药物发现、化学合成预测和量子化学中的应用。重点放在所学的分子特征的相关性和它们与已建立的分子描述符的互补性上。本文综述了当前的挑战和机遇,并对GDL在分子科学中的应用前景进行了展望。 摘要:Geometric deep learning (GDL), which is based on neural network architectures that incorporate and process symmetry information, has emerged as a recent paradigm in artificial intelligence. GDL bears particular promise in molecular modeling applications, in which various molecular representations with different symmetry properties and levels of abstraction exist. This review provides a structured and harmonized overview of molecular GDL, highlighting its applications in drug discovery, chemical synthesis prediction, and quantum chemistry. Emphasis is placed on the relevance of the learned molecular features and their complementarity to well-established molecular descriptors. This review provides an overview of current challenges and opportunities, and presents a forecast of the future of GDL for molecular sciences.

3D|3D重建等相关(1篇)

【1】 Language Grounding with 3D Objects 标题:3D对象的语言基础

作者:Jesse Thomason,Mohit Shridhar,Yonatan Bisk,Chris Paxton,Luke Zettlemoyer 机构:University of Southern California, University of Washington, Carnegie Mellon University, NVIDIA 备注:this https URL 链接:https://arxiv.org/abs/2107.12514 摘要:对机器人看似简单的自然语言请求通常没有明确规定,例如"你能给我拿无线鼠标吗?"当查看架子上的鼠标时,从某些角度或位置可能看不到按钮的数量或电线的存在。候选鼠标的平面图像可能无法提供判别"无线"所需的信息。世界和其中的物体不是平面的图像,而是复杂的三维形状。如果人类根据物体的任何基本属性(如颜色、形状或纹理)请求物体,机器人应该进行必要的探索以完成任务。特别是,虽然在明确理解颜色和类别等视觉属性方面做出了大量的努力和进展,但在理解关于形状和轮廓的语言方面取得的进展相对较少。在这项工作中,我们提出了一种新的推理任务,同时针对关于三维物体的视觉与非视觉语言。我们的新基准SNARE(带指代表达标注的ShapeNet)要求模型判断自然语言描述指代的是两个对象中的哪一个。我们介绍了几种基于CLIP的模型来区分物体,并证明了尽管视觉和语言联合建模的最新进展有助于机器人的语言理解,但这些模型在理解物体的三维本质(在操纵中起关键作用的属性)方面仍然较弱。特别是,我们发现在语言基础模型中添加视图估计,可以同时提高SNARE上以及在机器人平台上识别语言所指对象的准确性。 摘要:Seemingly simple natural language requests to a robot are generally underspecified, for example "Can you bring me the wireless mouse?" When viewing mice on the shelf, the number of buttons or presence of a wire may not be visible from certain angles or positions. Flat images of candidate mice may not provide the discriminative information needed for "wireless". The world, and objects in it, are not flat images but complex 3D shapes. If a human requests an object based on any of its basic properties, such as color, shape, or texture, robots should perform the necessary exploration to accomplish the task. In particular, while substantial effort and progress has been made on understanding explicitly visual attributes like color and category, comparatively little progress has been made on understanding language about shapes and contours. In this work, we introduce a novel reasoning task that targets both visual and non-visual language about 3D objects. Our new benchmark, ShapeNet Annotated with Referring Expressions (SNARE), requires a model to choose which of two objects is being referenced by a natural language description. We introduce several CLIP-based models for distinguishing objects and demonstrate that while recent advances in jointly modeling vision and language are useful for robotic language understanding, it is still the case that these models are weaker at understanding the 3D nature of objects -- properties which play a key role in manipulation. In particular, we find that adding view estimation to language grounding models improves accuracy on both SNARE and when identifying objects referred to in language on a robot platform.

优化|敛散性(6篇)

【1】 Learning Numeric Optimal Differentially Private Truncated Additive Mechanisms 标题:学习数值最优的差分隐私截断加性机制

作者:David M. Sommer,Lukas Abfalterer,Sheila Zingg,Esfandiar Mohammadi 备注:Code is available at this https URL 链接:https://arxiv.org/abs/2107.12957 摘要:差分隐私(DP)机制面临着在保护输入的同时提供准确结果的挑战:隐私-效用权衡。一种简单而强大的DP技术是向灵敏度有界的查询输出添加噪声,以模糊精确的查询输出:加性机制。虽然大量工作考虑了无限宽的噪声分布,但某些应用(例如实时操作系统)需要对与真实查询的偏差设置硬性界限,而针对此类机制的工作十分有限。带截断噪声(即范围有界)的加性机制可以提供这样的硬性界限。我们引入了一个基于梯度下降的工具,为加性机制学习具有强效用界的截断噪声,同时针对顺序组合(即在相同数据上揭示多个带噪查询的场景)下的差分隐私进行优化。我们的方法可以学习离散的噪声模式,而不仅仅是某个预定义概率分布的超参数。对于灵敏度有界的机制,我们证明只需考虑对称且自均值单调递减的噪声,并且只要保证一对代表性查询输出的隐私,即可保证所有(仅相差一个元素的)输入对的隐私。我们发现,所生成噪声的效用-隐私权衡曲线与截断高斯分布非常接近,在$l_2$效用损失下甚至复现了它们的形状。在组合次数较少的情况下,我们还改进了(子采样)DP-SGD。此外,我们将Moments Accountant方法扩展到截断分布,从而能够纳入具有不同的、依赖输入的零出现概率的机制输出事件。 摘要:Differentially private (DP) mechanisms face the challenge of providing accurate results while protecting their inputs: the privacy-utility trade-off. A simple but powerful technique for DP adds noise to sensitivity-bounded query outputs to blur the exact query output: additive mechanisms. While a vast body of work considers infinitely wide noise distributions, some applications (e.g., real-time operating systems) require hard bounds on the deviations from the real query, and only limited work on such mechanisms exist. An additive mechanism with truncated noise (i.e., with bounded range) can offer such hard bounds. We introduce a gradient-descent-based tool to learn truncated noise for additive mechanisms with strong utility bounds while simultaneously optimizing for differential privacy under sequential composition, i.e., scenarios where multiple noisy queries on the same data are revealed. Our method can learn discrete noise patterns and not only hyper-parameters of a predefined probability distribution. For sensitivity bounded mechanisms, we show that it is sufficient to consider symmetric noise that falls monotonically from the mean, and that ensuring privacy for a pair of representative query outputs guarantees privacy for all pairs of inputs (that differ in one element). We find that the utility-privacy trade-off curves of our generated noise are remarkably close to truncated Gaussians and even replicate their shape for $l_2$ utility-loss. For a low number of compositions, we also improved DP-SGD (sub-sampling). Moreover, we extend Moments Accountant to truncated distributions, allowing to incorporate mechanism output events with varying input-dependent zero occurrence probability.

【2】 Robust Optimization Framework for Training Shallow Neural Networks Using Reachability Method 标题:用可达性方法训练浅层神经网络的鲁棒优化框架

作者:Yejiang Yang,Weiming Xiang 机构:Department of Electrical Engineering, Southwest Jiaotong University; School of Computer and Cyber Sciences, Augusta University 链接:https://arxiv.org/abs/2107.12801 摘要:基于神经网络的可达性分析,提出了一种训练浅层神经网络的鲁棒优化框架。为了刻画输入数据的噪声,在区间集的描述中对输入训练数据进行了扰动。然后对隐藏层进行基于区间的可达性分析。根据可达性分析结果,在鲁棒最小二乘问题的框架下,提出了一种鲁棒优化训练方法。然后,将所提出的鲁棒最小二乘问题松弛为半定规划问题。结果表明,所提出的鲁棒学习方法可以在一定程度上以损失训练精度为代价,提供更好的抗扰动鲁棒性。最后,以机器人手臂模型学习为例,对该方法进行了评价。 摘要:In this paper, a robust optimization framework is developed to train shallow neural networks based on reachability analysis of neural networks. To characterize noises of input data, the input training data is disturbed in the description of interval sets. Interval-based reachability analysis is then performed for the hidden layer. With the reachability analysis results, a robust optimization training method is developed in the framework of robust least-square problems. Then, the developed robust least-square problem is relaxed to a semidefinite programming problem. It has been shown that the developed robust learning method can provide better robustness against perturbations at the price of loss of training accuracy to some extent. At last, the proposed method is evaluated on a robot arm model learning example.

【3】 On the Role of Optimization in Double Descent: A Least Squares Study 标题:论优化在双重下降中的作用:一项最小二乘研究

作者:Ilja Kuzborskij,Csaba Szepesvári,Omar Rivasplata,Amal Rannen-Triki,Razvan Pascanu 机构:DeepMind, Canada, University of Alberta, Edmonton, University College London 链接:https://arxiv.org/abs/2107.12685 摘要:经验上已经观察到,随着模型尺寸的增加,深度神经网络的性能稳步提高,这与经典的过度拟合和泛化观点相矛盾。最近,有人提出双下降现象来调和这一观察结果与理论,这表明当模型变得足够过参数化时,测试误差有第二次下降,因为模型大小本身充当了一个隐式正则化器。在本文中,我们在这一不断增长的研究方向上做出补充,针对最小二乘情形仔细研究了学习动态随模型规模的变化。我们给出了最小二乘目标梯度下降解的超额风险界。该界依赖于输入特征协方差矩阵的最小非零特征值,其函数形式具有双下降行为。这为文献报道的双下降曲线提供了一个新的视角。我们对超额风险的分析允许将优化和泛化误差的影响解耦。特别地,我们发现在无噪声回归的情况下,双下降仅由优化相关的量来解释,这在关注Moore-Penrose伪逆解的研究中被忽略了。我们相信,我们的推导提供了一个与现有工作相比的另一种观点,至少在所考虑的最小二乘设置中,揭示了这种现象的一个可能原因。我们从经验上探讨我们的预测是否适用于神经网络,特别是中间隐藏层激活的协方差是否表现出与我们的推导所预测的相似行为。 摘要:Empirically it has been observed that the performance of deep neural networks steadily improves as we increase model size, contradicting the classical view on overfitting and generalization. Recently, the double descent phenomena has been proposed to reconcile this observation with theory, suggesting that the test error has a second descent when the model becomes sufficiently overparameterized, as the model size itself acts as an implicit regularizer. In this paper we add to the growing body of work in this space, providing a careful study of learning dynamics as a function of model size for the least squares scenario. We show an excess risk bound for the gradient descent solution of the least squares objective. The bound depends on the smallest non-zero eigenvalue of the covariance matrix of the input features, via a functional form that has the double descent behavior. This gives a new perspective on the double descent curves reported in the literature. Our analysis of the excess risk allows to decouple the effect of optimization and generalization error. In particular, we find that in case of noiseless regression, double descent is explained solely by optimization-related quantities, which was missed in studies focusing on the Moore-Penrose pseudoinverse solution. We believe that our derivation provides an alternative view compared to existing work, shedding some light on a possible cause of this phenomena, at least in the considered least squares setting. We empirically explore if our predictions hold for neural networks, in particular whether the covariance of intermediary hidden activations has a similar behavior as the one predicted by our derivations.

【4】 Convergence of Deep ReLU Networks 标题:深度RELU网络的收敛性

作者:Yuesheng Xu,Haizhang Zhang 链接:https://arxiv.org/abs/2107.12530 摘要:当深度趋于无穷大时,我们研究采用流行的ReLU激活函数的深度神经网络的收敛性。为此,我们引入了ReLU网络的激活域和激活矩阵的概念。用激活域上的激活矩阵乘法代替ReLU激活函数的应用,得到了ReLU网络的显式表达式。然后,我们将ReLU网络的收敛性归结为一类矩阵无穷乘积的收敛性,并研究了这些矩阵无穷乘积收敛的充要条件。由此,我们建立了ReLU网络收敛的必要条件:当ReLU网络的深度增加到无穷大时,权矩阵序列收敛到单位矩阵,偏置向量序列收敛到零。此外,我们还利用隐层的权矩阵和偏置向量,得到了深度ReLU网络逐点收敛的充分条件。这些结果为图像分类中著名的深度残差网络的设计策略提供了数学依据。 摘要:We explore convergence of deep neural networks with the popular ReLU activation function, as the depth of the networks tends to infinity. To this end, we introduce the notion of activation domains and activation matrices of a ReLU network. By replacing applications of the ReLU activation function by multiplications with activation matrices on activation domains, we obtain an explicit expression of the ReLU network. We then identify the convergence of the ReLU networks as convergence of a class of infinite products of matrices. Sufficient and necessary conditions for convergence of these infinite products of matrices are studied. As a result, we establish necessary conditions for ReLU networks to converge that the sequence of weight matrices converges to the identity matrix and the sequence of the bias vectors converges to zero as the depth of ReLU networks increases to infinity. Moreover, we obtain sufficient conditions in terms of the weight matrices and bias vectors at hidden layers for pointwise convergence of deep ReLU networks. These results provide mathematical insights to the design strategy of the well-known deep residual networks in image classification.
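
下面用一小段数学示意摘要所说的"激活矩阵"表示(记号为本文假设,细节以原文为准):

```latex
% 在固定的激活域内,ReLU可写成与对角0/1激活矩阵的乘法:
\sigma(W_\ell x + b_\ell) = D_\ell(x)\,(W_\ell x + b_\ell),\qquad
D_\ell(x) = \operatorname{diag}\bigl(\mathbb{1}\{(W_\ell x + b_\ell)_i > 0\}\bigr).
% 于是深度ReLU网络在激活域上展开为矩阵乘积,网络随深度趋于无穷的
% 收敛性即该无穷矩阵乘积的收敛性;摘要给出的必要条件是
% W_\ell \to I 且 b_\ell \to 0。
```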

【5】 Debiasing In-Sample Policy Performance for Small-Data, Large-Scale Optimization 标题:针对小数据、大规模优化消除样本内策略性能偏差

作者:Vishal Gupta,Michael Huang,Paat Rusmevichientong 机构:Data Science and Operations, USC Marshall School of Business, Los Angeles, CA 链接:https://arxiv.org/abs/2107.12438 摘要:鉴于交叉验证在数据匮乏环境下表现不佳,我们提出了一种新的数据驱动优化中策略样本外性能的估计方法。该方法利用优化问题的灵敏度分析来估计最优目标值关于数据中噪声量的梯度,并利用估计出的梯度对策略的样本内性能进行去偏。与交叉验证技术不同,我们的方法避免了为测试集牺牲数据,在训练时利用所有数据,因此非常适合数据稀缺的环境。对于目标为不确定线性函数、但可行域已知(可能非凸)的优化问题,我们证明了估计量的偏差和方差的界。对于可行域在某种意义上"弱耦合"的更特殊的优化问题,我们证明了更强的结果:具体地,我们给出了估计误差的显式高概率界,该界在整个策略类上一致成立,并依赖于问题的维数和策略类的复杂度。我们的界表明,在温和的条件下,即使可用数据量保持较小且恒定,估计误差也会随着优化问题维数的增长而消失。换言之,我们证明了该估计量在小数据、大规模情形下表现良好。最后,我们通过一个使用真实数据调度紧急医疗响应服务的案例研究,将所提方法与最新方法进行了数值比较。我们的方法提供了对样本外性能更准确的估计,并学到了性能更好的策略。 摘要:Motivated by the poor performance of cross-validation in settings where data are scarce, we propose a novel estimator of the out-of-sample performance of a policy in data-driven optimization. Our approach exploits the optimization problem's sensitivity analysis to estimate the gradient of the optimal objective value with respect to the amount of noise in the data and uses the estimated gradient to debias the policy's in-sample performance. Unlike cross-validation techniques, our approach avoids sacrificing data for a test set, utilizes all data when training and, hence, is well-suited to settings where data are scarce. We prove bounds on the bias and variance of our estimator for optimization problems with uncertain linear objectives but known, potentially non-convex, feasible regions. For more specialized optimization problems where the feasible region is "weakly-coupled" in a certain sense, we prove stronger results. Specifically, we provide explicit high-probability bounds on the error of our estimator that hold uniformly over a policy class and depends on the problem's dimension and policy class's complexity. Our bounds show that under mild conditions, the error of our estimator vanishes as the dimension of the optimization problem grows, even if the amount of available data remains small and constant. Said differently, we prove our estimator performs well in the small-data, large-scale regime. Finally, we numerically compare our proposed method to state-of-the-art approaches through a case-study on dispatching emergency medical response services using real data. Our method provides more accurate estimates of out-of-sample performance and learns better-performing policies.

【6】 Global optimization using random embeddings 标题:基于随机嵌入的全局优化

作者:Coralia Cartis,Estelle Massart,Adilet Otemissov 机构:Mathematical Institute, University of Oxford 备注:41 pages 链接:https://arxiv.org/abs/2107.12102 摘要:提出了一种求解Lipschitz连续目标全局优化问题的随机子空间算法框架,并利用圆锥积分几何的新工具分析了其收敛性。X-REGO以顺序或同时的方式将高维原始问题随机投影为低维子问题,随后可用任何全局甚至局部的优化求解器来求解。我们估计了随机嵌入子问题与原问题共享(近似)相同全局最优解的概率。在对问题(具有严格可行的全局解)和求解器(保证以足够高的概率找到约化问题的近似全局解)的弱假设下,利用这个成功概率证明了X-REGO收敛到原问题的近似全局解。在低有效维数(无约束目标只在一个低维子空间上变化)的特殊情形下,我们提出了一个X-REGO变体,它探索维数递增的随机子空间,直到找到问题的有效维数,从而使X-REGO在有限次(与有效维数成比例的)嵌入后全局收敛。数值结果表明,该变体能有效地找到原问题的有效维数和近似全局极小值。 摘要:We propose a random-subspace algorithmic framework for global optimization of Lipschitz-continuous objectives, and analyse its convergence using novel tools from conic integral geometry. X-REGO randomly projects, in a sequential or simultaneous manner, the high-dimensional original problem into low-dimensional subproblems that can then be solved with any global, or even local, optimization solver. We estimate the probability that the randomly-embedded subproblem shares (approximately) the same global optimum as the original problem. This success probability is then used to show convergence of X-REGO to an approximate global solution of the original problem, under weak assumptions on the problem (having a strictly feasible global solution) and on the solver (guaranteed to find an approximate global solution of the reduced problem with sufficiently high probability). In the particular case of unconstrained objectives with low effective dimension, that only vary over a low-dimensional subspace, we propose an X-REGO variant that explores random subspaces of increasing dimension until finding the effective dimension of the problem, leading to X-REGO globally converging after a finite number of embeddings, proportional to the effective dimension. We show numerically that this variant efficiently finds both the effective dimension and an approximate global minimizer of the original problem.
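
随机嵌入的一步可以写得非常短。下面是"在随机子空间上求解低维子问题"的最小示意(Python,非论文官方实现;嵌入维数与求解器选择为本文假设):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def xrego_step(f, D: int, d: int, x0=None):
    """一次随机嵌入:在子空间 x = x0 + A y 上求解d维子问题。"""
    A = rng.standard_normal((D, d)) / np.sqrt(d)   # 随机高斯嵌入矩阵
    x0 = np.zeros(D) if x0 is None else x0
    res = minimize(lambda y: f(x0 + A @ y), np.zeros(d))
    return x0 + A @ res.x, res.fun

# 示意:D=100维空间上一个只随前两维变化的低有效维数目标
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
x_best, f_best = xrego_step(f, D=100, d=5)
print(f_best)
```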

预测|估计(4篇)

【1】 Comparing Prophet and Deep Learning to ARIMA in Forecasting Wholesale Food Prices 标题:预测食品批发价格的预言者和深度学习与ARIMA的比较

作者:Lorenzo Menculini,Andrea Marini,Massimiliano Proietti,Alberto Garinei,Alessio Bozza,Cecilia Moretti,Marcello Marconi 机构:Idea-re S.r.l., Perugia, Italy, Department of Engineering Sciences, Guglielmo Marconi University, Rome, Italy, Cancelloni Food Service S.p.A., Magione, Italy 备注:15 pages, 6 figures, 10 tables 链接:https://arxiv.org/abs/2107.12770 摘要:正确制定销售价格对企业具有重要意义,因此,对价格时间序列的研究和预测不仅是数据科学的研究课题,也是经济和应用层面的课题。在本文中,我们检验了多种技术来预测一家意大利食品批发商对三种食品采用的销售价格,作为将通常由人工负责的定价任务自动化的一个步骤。我们考虑ARIMA模型,并将其与Prophet(Facebook开发的基于广义加性模型的可扩展预测工具)以及基于长短期记忆(LSTM)和卷积神经网络(CNN)的深度学习模型进行比较。ARIMA模型是计量经济分析中常用的模型,为所研究的问题提供了一个良好的基准。我们的结果表明,对于所研究的问题,ARIMA的性能与LSTM神经网络相近,而CNN与LSTM的组合取得了最佳的整体精度,但需要更多的时间来调优。相反,Prophet使用起来非常快,但精度较低。 摘要:Setting sale prices correctly is of great importance for firms, and the study and forecast of prices time series is therefore a relevant topic not only from a data science perspective but also from an economic and applicative one. In this paper we examine different techniques to forecast the sale prices of three food products applied by an Italian food wholesaler, as a step towards the automation of pricing tasks usually taken care by human workforce. We consider ARIMA models and compare them to Prophet, a scalable forecasting tool developed by Facebook and based on a generalized additive model, and to deep learning models based on Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNNs). ARIMA models are frequently used in econometric analyses, providing a good benchmark for the problem under study. Our results indicate that ARIMA performs similarly to LSTM neural networks for the problem under study, while the combination of CNNs and LSTMs attains the best overall accuracy, but requires more time to be tuned. On the contrary, Prophet is very fast to use, but less accurate.
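
作为被比较的基准方法,ARIMA的用法本身很简单。下面给出statsmodels的最小示意(数据为演示用的伪造序列,阶数(2,1,1)为本文假设的示例):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# 伪造的价格序列,仅作演示(论文使用批发商的真实销售价格)
prices = 50 + np.cumsum(np.random.default_rng(0).normal(scale=0.5, size=200))
fit = ARIMA(prices, order=(2, 1, 1)).fit()
print(fit.forecast(steps=7))   # 未来7步的价格预测
```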

【2】 Vision-Guided Forecasting -- Visual Context for Multi-Horizon Time Series Forecasting 标题:视觉引导预测--多视界时间序列预测的视觉上下文

作者:Eitan Kosman,Dotan Di Castro 机构:Technion - Israel Institute of Technology, Bosch Center for Artificial Intelligence 链接:https://arxiv.org/abs/2107.12674 摘要:近年来,自动驾驶因其改变通勤方式的潜力而获得了巨大关注。人们为估计车辆状态付出了很多努力;而学习预测前方车辆的状态则引入了新的能力,例如预测危险情况。此外,预测还通过学习预测由多个视界表达的更丰富的上下文,带来了新的监督机会。直观地说,来自前置摄像头的视频流是必要的,因为它编码了前方道路的信息;此外,车辆状态的历史轨迹提供了更多上下文。在本文中,我们融合这两种模态来处理车辆状态的多视界预测。我们设计并实验了3种端到端架构,利用3D卷积进行视觉特征提取,用1D卷积从速度和转向角轨迹中提取特征。为了证明方法的有效性,我们在两个公开的真实数据集Comma2k19和Udacity challenge上进行了广泛实验。我们证明能够在多个视界上预测车辆状态,同时在驾驶状态估计这一相关任务上优于当前最新的结果。我们检验了视觉特征的贡献,发现在Udacity和Comma2k19数据集上,使用视觉特征的模型的误差分别仅为不使用这些特征的模型误差的56.6%和66.9%。 摘要:Autonomous driving gained huge traction in recent years, due to its potential to change the way we commute. Much effort has been put into trying to estimate the state of a vehicle. Meanwhile, learning to forecast the state of a vehicle ahead introduces new capabilities, such as predicting dangerous situations. Moreover, forecasting brings new supervision opportunities by learning to predict richer a context, expressed by multiple horizons. Intuitively, a video stream originated from a front-facing camera is necessary because it encodes information about the upcoming road. Besides, historical traces of the vehicle's states give more context. In this paper, we tackle multi-horizon forecasting of vehicle states by fusing the two modalities. We design and experiment with 3 end-to-end architectures that exploit 3D convolutions for visual features extraction and 1D convolutions for features extraction from speed and steering angle traces. To demonstrate the effectiveness of our method, we perform extensive experiments on two publicly available real-world datasets, Comma2k19 and the Udacity challenge. We show that we are able to forecast a vehicle's state to various horizons, while outperforming the current state-of-the-art results on the related task of driving state estimation. We examine the contribution of vision features, and find that a model fed with vision features achieves an error that is 56.6% and 66.9% of the error of a model that doesn't use those features, on the Udacity and Comma2k19 datasets respectively.

【3】 A Data-driven feature selection and machine-learning model benchmark for the prediction of longitudinal dispersion coefficient 标题:用于纵向离散系数预测的数据驱动特征选择和机器学习模型基准

作者:Yifeng Zhao,Pei Zhang,S. A. Galindo-Torres,Stan Z. Li 机构:Department of Environmental Science and Engineering, Zhejiang University, Hangzhou, China; School of Engineering, Westlake University 链接:https://arxiv.org/abs/2107.12970 摘要:纵向离散(LD)是自然河流中标量输运的主导过程。对LD系数(Dl)的准确预测可以为相关模拟带来性能上的飞跃。新兴的机器学习(ML)技术为解决这一问题提供了一种自适应工具。然而,现有的研究大多使用一个未经验证的、由简单理论推导得到的四元特征集,很少有研究关注其可靠性和合理性。此外,由于缺乏可比性,在不同场景下如何正确选择ML模型仍然未知。本研究首先采用特征梯度选择器,直接从多变量数据中提取局部最优特征集;然后,通过将提取出的各局部最优特征集与代表性ML模型进行性能上的数值比较,提出了一个全局最优特征集(河道宽度、流速、河道坡度和横截面积)。河道坡度被确认为预测LD系数的关键参数。此外,我们设计了一个加权评估指标,以便进行全面的模型比较。以简单线性模型为基准,给出了单模型和集成学习模型的基准测试,并讨论了各方法的优缺点。结果表明,支持向量机的性能明显优于其他模型;决策树由于泛化能力差,不适合该问题。值得注意的是,在这个低维问题上,简单模型比复杂模型更具优势,因为它们在回归与泛化之间取得了更好的平衡。 摘要:Longitudinal Dispersion(LD) is the dominant process of scalar transport in natural streams. An accurate prediction on LD coefficient(Dl) can produce a performance leap in related simulation. The emerging machine learning(ML) techniques provide a self-adaptive tool for this problem. However, most of the existing studies utilize an unproved quaternion feature set, obtained through simple theoretical deduction. Few studies have put attention on its reliability and rationality. Besides, due to the lack of comparative comparison, the proper choice of ML models in different scenarios still remains unknown. In this study, the Feature Gradient selector was first adopted to distill the local optimal feature sets directly from multivariable data. Then, a global optimal feature set (the channel width, the flow velocity, the channel slope and the cross sectional area) was proposed through numerical comparison of the distilled local optimums in performance with representative ML models. The channel slope is identified to be the key parameter for the prediction of LDC. Further, we designed a weighted evaluation metric which enables comprehensive model comparison. With the simple linear model as the baseline, a benchmark of single and ensemble learning models was provided. Advantages and disadvantages of the methods involved were also discussed. Results show that the support vector machine has significantly better performance than other models. Decision tree is not suitable for this problem due to poor generalization ability. Notably, simple models show superiority over complicated model on this low-dimensional problem, for their better balance between regression and generalization.

【4】 Initial Foundation for Predicting Individual Earthquake's Location and Magnitude by Using Glass-Box Physics Rule Learner 标题:利用玻璃盒物理规则学习器预测个体地震位置和震级的初步基础

作者:In Ho Cho 机构:CCEE Department, Iowa State University, Ames, IA , USA 链接:https://arxiv.org/abs/2107.12915 摘要:尽管研究人员积累了有关地震成因的知识和数十年的地震数据,但在特定时间和地点预测即将发生的单个地震仍然是一个长期的谜。本研究假设观测数据中隐藏着规则,而一个新颖的玻璃盒(相对于黑盒)物理规则学习器(GPRL)框架可以将其解开。GPRL的两个基本要素,即卷积信息索引和透明链接函数,不依赖任何预先定义的地震相关机制或统计规律,直接从数据中寻找规则的通用表达式。GPRL用10年的数据进行训练,似乎发现了貌似合理的规则,提示其为岩石圈释放能量的伪功率与伪涡度的组合。独立的可行性测试支持了所解开的规则在预测地震震级及其具体位置方面有希望的作用。已识别的规则和GPRL尚处于初级阶段,需要实质性的改进。尽管如此,这项研究仍然暗示存在一条由数据引导的、通向即将发生的单个地震预测的潜在路径。 摘要:Although researchers accumulated knowledge about seismogenesis and decades-long earthquake data, predicting imminent individual earthquakes at a specific time and location remains a long-standing enigma. This study hypothesizes that the observed data conceal the hidden rules which may be unraveled by a novel glass-box (as opposed to black-box) physics rule learner (GPRL) framework. Without any predefined earthquake-related mechanisms or statistical laws, GPRL's two essentials, convolved information index and transparent link function, seek generic expressions of rules directly from data. GPRL's training with 10-years data appears to identify plausible rules, suggesting a combination of the pseudo power and the pseudo vorticity of released energy in the lithosphere. Independent feasibility test supports the promising role of the unraveled rules in predicting earthquakes' magnitudes and their specific locations. The identified rules and GPRL are in their infancy requiring substantial improvement. Still, this study hints at the existence of the data-guided hidden pathway to imminent individual earthquake prediction.

其他神经网络|深度学习|模型|建模(16篇)

【1】 Verifiable Coded Computing: Towards Fast, Secure and Private Distributed Machine Learning 标题:可验证编码计算:迈向快速、安全和私有的分布式机器学习

作者:Tingting Tang,Ramy E. Ali,Hanieh Hashemi,Tynan Gangwani,Salman Avestimehr,Murali Annavaram 机构: University of Southern California 链接:https://arxiv.org/abs/2107.12958 摘要:掉队节点(stragglers)、拜占庭工作节点和数据隐私是分布式云计算的主要瓶颈。先前的一些工作提出了编码计算策略来共同应对这三个挑战,但它们需要大量的工作节点、显著的通信开销或显著的计算复杂度来容忍恶意工作节点。先前方案的大部分开销源于它们将三个问题的编码紧密耦合到单一框架中。在这项工作中,我们提出了可验证编码计算(VCC)框架,将拜占庭节点检测与掉队容错解耦:VCC仅利用编码计算来处理掉队节点和隐私,然后使用可验证计算这一正交方法来应对拜占庭节点。此外,VCC可动态调整其编码方案,在抗掉队能力与拜占庭防护之间进行折衷,反之亦然。我们在计算密集型的分布式logistic回归应用上评估了VCC。实验表明,VCC将传统分布式logistic回归的非编码实现加速了3.2倍-6.9倍,并将测试精度提高了多达12.6%。 摘要:Stragglers, Byzantine workers, and data privacy are the main bottlenecks in distributed cloud computing. Several prior works proposed coded computing strategies to jointly address all three challenges. They require either a large number of workers, a significant communication cost or a significant computational complexity to tolerate malicious workers. Much of the overhead in prior schemes comes from the fact that they tightly couple coding for all three problems into a single framework. In this work, we propose Verifiable Coded Computing (VCC) framework that decouples Byzantine node detection challenge from the straggler tolerance. VCC leverages coded computing just for handling stragglers and privacy, and then uses an orthogonal approach of verifiable computing to tackle Byzantine nodes. Furthermore, VCC dynamically adapts its coding scheme to tradeoff straggler tolerance with Byzantine protection and vice-versa. We evaluate VCC on compute intensive distributed logistic regression application. Our experiments show that VCC speeds up the conventional uncoded implementation of distributed logistic regression by $3.2\times-6.9\times$, and also improves the test accuracy by up to $12.6\%$.

【2】 Experiments on Properties of Hidden Structures of Sparse Neural Networks 标题:稀疏神经网络隐层结构性质的实验研究

作者:Julian Stier,Harshil Darji,Michael Granitzer 链接:https://arxiv.org/abs/2107.12917 摘要:神经网络结构的稀疏性可以降低能耗、减少内存使用、在合适的硬件上加快计算速度,并有助于自动机器学习。如果稀疏性催生出某种结构,它就能解释学习过程中自动获得的特征。我们通过深入的实验展示了如何通过先验初始化、剪枝以及学习过程本身获得稀疏性,并回答了关于神经网络结构与其性能之间关系的问题。这包括首次将网络理论中的先验知识引入递归神经网络的工作,以及在神经结构搜索过程中进行结构性能预测。在实验中,我们展示了幅值类盲剪枝在80%压缩并重新训练后在MNIST上达到97.5%,比不压缩时高0.5个百分点;幅值类均匀剪枝则明显不如它;以及结合性能预测增强的遗传搜索在CIFAR10上达到82.4%。此外,对学习Reber语法的递归网络的性能预测表明,仅给定结构信息时$R^2$最高可达0.81。 摘要:Sparsity in the structure of Neural Networks can lead to less energy consumption, less memory usage, faster computation times on convenient hardware, and automated machine learning. If sparsity gives rise to certain kinds of structure, it can explain automatically obtained features during learning. We provide insights into experiments in which we show how sparsity can be achieved through prior initialization, pruning, and during learning, and answer questions on the relationship between the structure of Neural Networks and their performance. This includes the first work of inducing priors from network theory into Recurrent Neural Networks and an architectural performance prediction during a Neural Architecture Search. Within our experiments, we show how magnitude class blinded pruning achieves 97.5% on MNIST with 80% compression and re-training, which is 0.5 points more than without compression, that magnitude class uniform pruning is significantly inferior to it and how a genetic search enhanced with performance prediction achieves 82.4% on CIFAR10. Further, performance prediction for Recurrent Networks learning the Reber grammar shows an $R^2$ of up to 0.81 given only structural information.
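
摘要提到的幅值剪枝可以用几行代码说明。下面是"按绝对值大小剪掉一定比例权重"的最小PyTorch示意(非论文官方实现):

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float = 0.8):
    """幅值剪枝:把绝对值最小的sparsity比例的权重置零,返回剪枝后权重和掩码。"""
    k = max(1, int(weight.numel() * sparsity))
    thresh = weight.abs().flatten().kthvalue(k).values
    mask = (weight.abs() > thresh).float()
    return weight * mask, mask

W = torch.randn(64, 64)
W_pruned, mask = magnitude_prune(W, sparsity=0.8)
print(1.0 - mask.mean().item())   # ≈0.8:被置零的比例
```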

【3】 Model Free Barrier Functions via Implicit Evading Maneuvers 标题:通过隐式规避机动实现无模型屏障函数

作者:Eric Squires,Rohit Konda,Samuel Coogan,Magnus Egerstedt 机构:Georgia Tech Research Institute, UC Santa Barbara, Georgia Institute of Technology, UC Irvine 链接:https://arxiv.org/abs/2107.12871 摘要:本文证明,在某些情况下,由于使用屏障函数而产生的安全超控可能带来不必要的限制。特别地,我们研究了固定翼飞机避碰的情形,并表明在使用屏障函数时,存在两架固定翼飞机比完全不用屏障函数时更接近碰撞的情况。此外,我们构造了这样的情形:即使飞行器从任意远的距离出发,屏障函数也会将系统标记为不安全。换言之,屏障函数可确保安全,但对性能造成不必要的代价。因此,我们引入了无模型屏障函数,采用数据驱动的方法来构造屏障函数。在两架固定翼飞机的避碰仿真中,我们证明了无模型屏障函数的有效性。 摘要:This paper demonstrates that in some cases the safety override arising from the use of a barrier function can be needlessly restrictive. In particular, we examine the case of fixed wing collision avoidance and show that when using a barrier function, there are cases where two fixed wing aircraft can come closer to colliding than if there were no barrier function at all. In addition, we construct cases where the barrier function labels the system as unsafe even when the vehicles start arbitrarily far apart. In other words, the barrier function ensures safety but with unnecessary costs to performance. We therefore introduce model free barrier functions which take a data driven approach to creating a barrier function. We demonstrate the effectiveness of model free barrier functions in a collision avoidance simulation of two fixed-wing aircraft.

【4】 Neural Network Branch-and-Bound for Neural Network Verification 标题:神经网络验证中的神经网络分枝定界法

作者:Florian Jaeckle,Jingyue Lu,M. Pawan Kumar 机构:Department of Engineering, University of Oxford; Department of Statistics, University of Oxford 备注:arXiv admin note: substantial text overlap with arXiv:1912.01329 链接:https://arxiv.org/abs/2107.12855 摘要:许多可用的形式化验证方法都是统一分枝定界(BaB)公式的实例。我们提出了一个新的机器学习框架,可以用来设计有效的分支策略以及计算更好的下界。具体来说,我们学习了两个图神经网络(GNN),它们都直接将我们要验证的网络作为图输入,并通过GNN层执行向前向后的传递。我们使用一个GNN来模拟强分支启发式行为,另一个GNN来计算凸松弛的可行对偶解,从而提供一个有效的下界。我们提供了一个新的验证数据集,它比文献中使用的数据集更具挑战性,从而为验证算法改进的测试提供了一个有效的替代方案。虽然只使用其中一个GNN可以减少验证时间,但在结合这两种GNN方法时,我们可以获得最佳性能。与几种最先进的验证方法相比,我们的组合框架在各种卷积网络上的验证所需的分支数和时间都减少了50%。此外,我们还证明了我们的GNN模型可以很好地推广到更大的不可见网络上的硬属性。 摘要:Many available formal verification methods have been shown to be instances of a unified Branch-and-Bound (BaB) formulation. We propose a novel machine learning framework that can be used for designing an effective branching strategy as well as for computing better lower bounds. Specifically, we learn two graph neural networks (GNN) that both directly treat the network we want to verify as a graph input and perform forward-backward passes through the GNN layers. We use one GNN to simulate the strong branching heuristic behaviour and another to compute a feasible dual solution of the convex relaxation, thereby providing a valid lower bound. We provide a new verification dataset that is more challenging than those used in the literature, thereby providing an effective alternative for testing algorithmic improvements for verification. Whilst using just one of the GNNs leads to a reduction in verification time, we get optimal performance when combining the two GNN approaches. Our combined framework achieves a 50% reduction in both the number of branches and the time required for verification on various convolutional networks when compared to several state-of-the-art verification methods. In addition, we show that our GNN models generalize well to harder properties on larger unseen networks.

【5】 Learning Local Recurrent Models for Human Mesh Recovery 标题:用于人体网格恢复的学习局部递归模型

作者:Runze Li,Srikrishna Karanam,Ren Li,Terrence Chen,Bir Bhanu,Ziyan Wu 机构:United Imaging Intelligence, Cambridge MA, USA, University of California Riverside, Riverside CA, USA 备注:10 pages, 6 figures, 2 tables 链接:https://arxiv.org/abs/2107.12847 摘要:我们考虑如下问题:给定一段具有自然运动动力学的人物视频,估计帧级的完整人体网格。虽然该领域的许多进展来自基于单幅图像的网格估计,但由于视频在缓解深度模糊和遮挡等问题上的作用,最近从视频中推断网格动力学的研究有所增多。然而,现有工作的一个关键局限是假设所有观察到的运动动力学都可以用单一的动力学/递归模型来建模。虽然这在动力学相对简单的情况下可能效果不错,但对野外视频进行推断会带来许多挑战。特别地,典型的情况是,人的不同身体部位在视频中经历不同的动力学,例如,腿的运动方式可能与手的动力学不同(例如跳舞的人)。为了解决这些问题,我们提出了一种新的视频网格恢复方法,该方法按照标准骨架模型将人体网格划分为若干局部部位。然后,我们用单独的递归模型对每个局部部位的动力学进行建模,每个模型根据已知的人体运动学结构进行适当的调节。这就产生了一个由结构信息引导的局部递归学习架构,可利用现有标注以端到端方式训练。我们在Human3.6M、MPI-INF-3DHP和3DPW等标准视频网格恢复基准数据集上进行了各种实验,证明了局部动力学建模设计的有效性,并在标准评估指标上取得了最新的结果。 摘要:We consider the problem of estimating frame-level full human body meshes given a video of a person with natural motion dynamics. While much progress in this field has been in single image-based mesh estimation, there has been a recent uptick in efforts to infer mesh dynamics from video given its role in alleviating issues such as depth ambiguity and occlusions. However, a key limitation of existing work is the assumption that all the observed motion dynamics can be modeled using one dynamical/recurrent model. While this may work well in cases with relatively simplistic dynamics, inference with in-the-wild videos presents many challenges. In particular, it is typically the case that different body parts of a person undergo different dynamics in the video, e.g., legs may move in a way that may be dynamically different from hands (e.g., a person dancing). To address these issues, we present a new method for video mesh recovery that divides the human mesh into several local parts following the standard skeletal model. We then model the dynamics of each local part with separate recurrent models, with each model conditioned appropriately based on the known kinematic structure of the human body. This results in a structure-informed local recurrent learning architecture that can be trained in an end-to-end fashion with available annotations. We conduct a variety of experiments on standard video mesh recovery benchmark datasets such as Human3.6M, MPI-INF-3DHP, and 3DPW, demonstrating the efficacy of our design of modeling local dynamics as well as establishing state-of-the-art results based on standard evaluation metrics.

【6】 Open-Ended Learning Leads to Generally Capable Agents 标题:开放式学习造就具有普遍能力的代理

作者:Open-Ended Learning Team,Adam Stooke,Anuj Mahajan,Catarina Barros,Charlie Deck,Jakob Bauer,Jakub Sygnowski,Maja Trebacz,Max Jaderberg,Michael Mathieu,Nat McAleese,Nathalie Bradley-Schmieg,Nathaniel Wong,Nicolas Porcel,Roberta Raileanu,Steph Hughes-Fitt,Valentin Dalibard,Wojciech Marian Czarnecki 机构:DeepMind, London, UK 链接:https://arxiv.org/abs/2107.12808 摘要:在这项工作中,我们创建的代理不再只擅长单一任务,其行为能够更广泛地泛化到一个巨大而丰富的挑战空间。我们在一个环境域中定义了一个任务宇宙,并展示了训练出在这一广阔空间及其之外普遍有能力的代理的能力。该环境天生是多代理的,横跨竞争、合作和独立游戏的连续统,这些游戏位于程序化生成的物理三维世界中。由此产生的空间在对代理构成的挑战方面异常多样,因此,即使衡量代理的学习进度本身也是一个开放的研究问题。我们提出了一个在连续几代代理之间迭代改进的概念,而不是寻求单一目标的最大化,这使我们能够在任务因可实现回报而不可比的情况下量化进度。通过构造一个开放式学习过程,动态地改变训练任务的分布和训练目标,使代理永不停止学习,我们实现了对新行为的持续学习。由此产生的代理能够在我们每一个人类可解的评估关卡中获得奖励,其行为可泛化到任务宇宙中许多保留的测试点。零样本泛化的例子包括在捉迷藏(Hide and Seek)、夺旗(Capture the Flag)和追逐(Tag)中的良好表现。通过分析和手工编写的探测任务,我们刻画了代理的行为,并发现了有趣的涌现启发式行为,如试错实验、简单的工具使用、选项切换和合作。最后,我们证明了该代理的通用能力可以通过低成本的微调解锁更大规模的行为迁移。 摘要:In this work we create agents that can perform well beyond a single, individual task, that exhibit much wider generalisation of behaviour to a massive, rich space of challenges. We define a universe of tasks within an environment domain and demonstrate the ability to train agents that are generally capable across this vast space and beyond. The environment is natively multi-agent, spanning the continuum of competitive, cooperative, and independent games, which are situated within procedurally generated physical 3D worlds. The resulting space is exceptionally diverse in terms of the challenges posed to agents, and as such, even measuring the learning progress of an agent is an open research problem. We propose an iterative notion of improvement between successive generations of agents, rather than seeking to maximise a singular objective, allowing us to quantify progress despite tasks being incomparable in terms of achievable rewards. We show that through constructing an open-ended learning process, which dynamically changes the training task distributions and training objectives such that the agent never stops learning, we achieve consistent learning of new behaviours. The resulting agent is able to score reward in every one of our humanly solvable evaluation levels, with behaviour generalising to many held-out points in the universe of tasks. Examples of this zero-shot generalisation include good performance on Hide and Seek, Capture the Flag, and Tag. Through analysis and hand-authored probe tasks we characterise the behaviour of our agent, and find interesting emergent heuristic behaviours such as trial-and-error experimentation, simple tool use, option switching, and cooperation. Finally, we demonstrate that the general capabilities of this agent could unlock larger scale transfer of behaviour through cheap finetuning.

【7】 Continual Learning with Neuron Activation Importance 标题:持续学习与神经元激活的重要性

作者:Sohee Kim,Seungkyu Lee 机构:Kyunghee University, Department of Computer Engineering, Yongin, Republic of Korea 链接:https://arxiv.org/abs/2107.12657 摘要:持续学习是一个概念,在线学习与多个连续的任务。持续学习的一个关键障碍是,网络应该学习新任务,保持对旧任务的知识,而不访问旧任务的任何数据。本文提出了一种基于神经元激活重要性的正则化方法,用于不考虑任务顺序的稳定连续学习。我们在现有的基准数据集上进行了综合实验,不仅评估了该方法的稳定性和可塑性,提高了分类精度,而且评估了性能随任务顺序变化的鲁棒性。 摘要:Continual learning is a concept of online learning with multiple sequential tasks. One of the critical barriers of continual learning is that a network should learn a new task keeping the knowledge of old tasks without access to any data of the old tasks. In this paper, we propose a neuron activation importance-based regularization method for stable continual learning regardless of the order of tasks. We conduct comprehensive experiments on existing benchmark data sets to evaluate not just the stability and plasticity of our method with improved classification accuracy also the robustness of the performance along the changes of task order.
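
基于重要性的正则化通常写成对参数偏移的加权惩罚。下面是一个EWC风格的最小PyTorch示意(重要性权重在论文中由神经元激活给出,此处以给定的importance字典为假设):

```python
import torch

def importance_penalty(model, old_params, importance, lam: float = 1.0):
    """惩罚重要参数偏离旧任务解:importance[name]越大,对应参数越难被改写。"""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (importance[name] * (p - old_params[name]) ** 2).sum()
    return lam * loss   # 实际训练中叠加到新任务的损失上
```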

【8】 Co-Transport for Class-Incremental Learning 标题:用于类增量学习的协同迁移

Authors: Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan. Affiliations: State Key Laboratory for Novel Software Technology, Nanjing University. Note: Accepted to ACM Multimedia 2021. Link: https://arxiv.org/abs/2107.12654
Abstract: Traditional learning systems are trained in a closed world for a fixed number of classes and need pre-collected datasets in advance. However, new classes often emerge in real-world applications and should be learned incrementally. For example, in electronic commerce new types of products appear daily, and in a social-media community new topics emerge frequently. Under such circumstances, incremental models should learn several new classes at a time without forgetting. We find a strong correlation between old and new classes in incremental learning, which can be applied to relate and facilitate different learning stages mutually. As a result, we propose CO-transport for class Incremental Learning (COIL), which learns to relate across incremental tasks via the class-wise semantic relationship. In detail, co-transport has two aspects: prospective transport tries to augment the old classifier with optimally transported knowledge for fast model adaptation, while retrospective transport aims to transport new class classifiers backward as old ones to overcome forgetting. With these transports, COIL efficiently adapts to new tasks and stably resists forgetting. Experiments on benchmark and real-world multimedia datasets validate the effectiveness of the proposed method.
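
A sketch of the prospective-transport idea: initialise new-class classifiers as similarity-weighted combinations of old ones. This toy version substitutes a softmax over prototype similarities for the paper's optimal-transport plan; all shapes and names are hypothetical.

```python
import numpy as np

def transport_plan(old_protos, new_protos, temp=1.0):
    """Soft assignment of each new class to old classes by prototype similarity."""
    sim = (new_protos @ old_protos.T) / temp          # [n_new, n_old]
    sim -= sim.max(axis=1, keepdims=True)             # numerical stability
    plan = np.exp(sim)
    return plan / plan.sum(axis=1, keepdims=True)     # rows sum to 1

def prospective_transport(old_classifiers, plan):
    """New-class classifiers as convex combinations of old ones."""
    return plan @ old_classifiers                     # [n_new, feat_dim]

rng = np.random.default_rng(0)
old_cls = rng.normal(size=(10, 64))     # 10 old classes, 64-d features
old_proto = rng.normal(size=(10, 64))   # e.g. class-mean features
new_proto = rng.normal(size=(5, 64))    # 5 newly arriving classes
new_cls = prospective_transport(old_cls, transport_plan(old_proto, new_proto))
print(new_cls.shape)                    # (5, 64)
```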

【9】 Probing neural networks with t-SNE, class-specific projections and a guided tour

Authors: Christopher R. Hoyt, Art B. Owen. Affiliations: Stanford University. Link: https://arxiv.org/abs/2107.12547
Abstract: We use graphical methods to probe neural nets that classify images. Plots of t-SNE outputs at successive layers in a network reveal an increasingly organized arrangement of the data points. They can also reveal how a network can diminish or even forget about within-class structure as the data proceeds through layers. We use class-specific analogues of principal components to visualize how succeeding layers separate the classes. These allow us to sort images from a given class from most typical to least typical (in the data), and they also serve as very useful projection coordinates for data visualization. We find them especially useful when defining versions of the guided tour for animated data visualization.
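
The probing workflow itself is simple to reproduce; the sketch below runs t-SNE on the activations of each successive layer and plots them side by side. The synthetic "activations" here merely mimic layers that separate classes more and more; any real per-layer feature extractor can stand in.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def probe_layers(layer_outputs, labels):
    """layer_outputs: list of [n_samples, n_features] activation matrices."""
    fig, axes = plt.subplots(1, len(layer_outputs),
                             figsize=(4 * len(layer_outputs), 4))
    for ax, acts in zip(np.atleast_1d(axes), layer_outputs):
        z = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(acts)
        ax.scatter(z[:, 0], z[:, 1], c=labels, s=4, cmap="tab10")
    plt.show()

rng = np.random.default_rng(1)
labels = rng.integers(0, 10, size=500)
class_dirs = rng.normal(size=(10, 64))
# stand-in activations: later "layers" are more class-separated
fake_layers = [rng.normal(size=(500, 64)) + k * np.eye(10)[labels] @ class_dirs
               for k in range(3)]
probe_layers(fake_layers, labels)
```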

【10】 Physics-Enforced Modeling for Insertion Loss of Transmission Lines by Deep Neural Networks

Authors: Liang Chen, Lesley Tan. Affiliations: Department of Electrical and Computer Engineering, University of California, Riverside, CA, USA; Phillips Academy, Andover, MA, USA. Link: https://arxiv.org/abs/2107.12527
Abstract: In this paper, we investigate data-driven parameterized modeling of the insertion loss of transmission lines with respect to design parameters. We first show that direct application of neural networks can lead to non-physical models with negative insertion loss. To mitigate this problem, we propose two deep learning solutions. One solution adds a regulation term representing the passivity condition to the final loss function, penalizing non-physical (negative) insertion-loss predictions. In the second method, a third-order polynomial expression that ensures positiveness is defined first to approximate the insertion loss, and the DeepONet neural network structure, recently proposed for function and system modeling, is employed to predict the coefficients of the polynomial expression. Experimental results on an open-sourced SI/PI database of a PCB design show that both methods ensure the positiveness of the insertion loss. Furthermore, both methods achieve similar prediction results, while the polynomial-based DeepONet method trains faster than the plain DeepONet-based method.
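
A hedged sketch of the first solution, the passivity regulation term: alongside the data-fitting loss, penalise any prediction of a negative insertion-loss magnitude. The network, data, and penalty weight below are placeholders, not the authors' setup.

```python
import torch
import torch.nn as nn

class ILNet(nn.Module):
    """Toy regressor from design parameters to insertion-loss magnitude."""
    def __init__(self, n_params, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_params, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

def physics_enforced_loss(pred, target, lam=10.0):
    mse = nn.functional.mse_loss(pred, target)
    passivity = torch.relu(-pred).mean()    # > 0 only for negative predictions
    return mse + lam * passivity

model = ILNet(n_params=5)
x = torch.rand(128, 5)              # toy design parameters
y = torch.rand(128, 1) * 3.0        # toy insertion-loss magnitudes, >= 0
loss = physics_enforced_loss(model(x), y)
loss.backward()
```

The second solution sidesteps the penalty altogether: because the constrained third-order polynomial is positive by construction, the network only needs to predict its coefficients.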

【11】 Restricted Boltzmann Machine and Deep Belief Network: Tutorial and Survey

Authors: Benyamin Ghojogh, Ali Ghodsi, Fakhri Karray, Mark Crowley. Affiliations: Department of Electrical and Computer Engineering, Machine Learning Laboratory, University of Waterloo, Waterloo, ON, Canada; Department of Statistics and Actuarial Science & David R. Cheriton School of Computer Science. Note: To appear as part of an upcoming textbook on dimensionality reduction and manifold learning. Link: https://arxiv.org/abs/2107.12521
Abstract: This is a tutorial and survey paper on the Boltzmann Machine (BM), Restricted Boltzmann Machine (RBM), and Deep Belief Network (DBN). We start with the required background on probabilistic graphical models, Markov random fields, Gibbs sampling, statistical physics, the Ising model, and the Hopfield network. Then, we introduce the structures of BM and RBM. The conditional distributions of visible and hidden variables, Gibbs sampling in RBM for generating variables, training BM and RBM by maximum likelihood estimation, and contrastive divergence are explained. Then, we discuss different possible discrete and continuous distributions for the variables. We introduce the conditional RBM and how it is trained. Finally, we explain the deep belief network as a stack of RBM models. This paper on Boltzmann machines can be useful in various fields including data science, statistics, neural computation, and statistical physics.
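
As a small companion to the tutorial's material, here is a minimal Bernoulli RBM trained with one step of contrastive divergence (CD-1), following the standard formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_vis, n_hid, lr=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.normal(size=(n_vis, n_hid))
        self.b = np.zeros(n_vis)            # visible bias
        self.c = np.zeros(n_hid)            # hidden bias
        self.lr = lr

    def sample_h(self, v):
        p = sigmoid(v @ self.W + self.c)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        p = sigmoid(h @ self.W.T + self.b)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def cd1(self, v0):
        ph0, h0 = self.sample_h(v0)         # positive phase
        _, v1 = self.sample_v(h0)           # one Gibbs step
        ph1, _ = self.sample_h(v1)          # negative phase
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)

rbm = RBM(n_vis=784, n_hid=128)
batch = (np.random.default_rng(1).random((32, 784)) < 0.2).astype(float)
rbm.cd1(batch)
```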

【12】 Accelerated Gradient Descent Learning over Multiple Access Fading Channels

Authors: Raz Paul, Yuval Friedman, Kobi Cohen. Affiliations: School of Electrical and Computer Engineering, Ben-Gurion University of the Negev. Note: 30 pages, 12 figures. Link: https://arxiv.org/abs/2107.12452
Abstract: We consider a distributed learning problem in a wireless network consisting of N distributed edge devices and a parameter server (PS). The objective function is a sum of the edge devices' local loss functions, and the devices aim to train a shared model by communicating with the PS over multiple access channels (MAC). This problem has attracted growing interest in distributed sensing systems and, more recently, in federated learning, where it is known as over-the-air computation. In this paper, we develop a novel Accelerated Gradient-descent Multiple Access (AGMA) algorithm that uses momentum-based gradient signals over the noisy fading MAC to improve the convergence rate compared to existing methods. Furthermore, AGMA does not require power control or beamforming to cancel the fading effect, which simplifies the implementation complexity. We analyze AGMA theoretically and establish finite-sample bounds on the error for both convex and strongly convex loss functions with Lipschitz gradient. For the strongly convex case, we show that AGMA approaches the best-known linear convergence rate as the network grows. For the convex case, we show that AGMA significantly improves the sub-linear convergence rate compared to existing methods. Finally, we present simulation results using real datasets that demonstrate the better performance of AGMA.
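
A toy end-to-end simulation of the setting (a sketch, not the authors' exact protocol): each device keeps a momentum of its local gradient, the MAC superimposes fading-weighted signals plus noise, and the server applies the received sum directly, with no power control or beamforming.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, T = 20, 10, 200                  # devices, dimension, rounds
A = rng.normal(size=(N, d))            # one least-squares row per device
b = rng.normal(size=N)
theta = np.zeros(d)                    # shared model at the PS
m_local = np.zeros((N, d))             # per-device momentum
lr, beta, noise_std = 0.05, 0.9, 0.01

for t in range(T):
    grads = A * (A @ theta - b)[:, None]         # local gradients
    m_local = beta * m_local + (1 - beta) * grads
    h = np.abs(rng.normal(1.0, 0.2, size=N))     # fading gain magnitudes
    rx = (h[:, None] * m_local).sum(axis=0)      # superposition on the MAC
    rx += noise_std * rng.normal(size=d)         # additive channel noise
    theta -= lr * rx / N                         # global update at the PS

print("final loss:", 0.5 * np.mean((A @ theta - b) ** 2))
```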

【13】 Cross-architecture Tuning of Silicon and SiGe-based Quantum Devices Using Machine Learning

Authors: B. Severin, D. T. Lennon, L. C. Camenzind, F. Vigneau, F. Fedele, D. Jirovec, A. Ballabio, D. Chrastina, G. Isella, M. de Kruijf, M. J. Carballido, S. Svab, A. V. Kuhlmann, F. R. Braakman, S. Geyer, F. N. M. Froning, H. Moon, M. A. Osborne, D. Sejdinovic, G. Katsaros, D. M. Zumbühl, G. A. D. Briggs, N. Ares. Affiliations: Department of Materials, University of Oxford, Parks Road, Oxford, UK; Department of Physics, University of Basel, Basel, Switzerland; Institute of Science and Technology Austria, Am Campus, Klosterneuburg, Austria. Link: https://arxiv.org/abs/2107.12975
Abstract: The potential of Si- and SiGe-based devices for the scaling of quantum circuits is tainted by device variability. Each device needs to be tuned to its operating conditions. We take a key step towards tackling this variability with an algorithm that, without modification, is capable of tuning a 4-gate Si FinFET, a 5-gate GeSi nanowire, and a 7-gate SiGe heterostructure double quantum dot device from scratch. We achieve tuning times of 30, 10, and 92 minutes, respectively. The algorithm also provides insight into the parameter-space landscape for each of these devices. These results show that overarching solutions for the tuning of quantum devices are enabled by machine learning.

【14】 Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel

Authors: Dominic Richards, Ilja Kuzborskij. Affiliations: University of Oxford; DeepMind. Link: https://arxiv.org/abs/2107.12723
Abstract: We revisit the on-average algorithmic stability of Gradient Descent (GD) for training overparameterised shallow neural networks, and prove new generalisation and excess-risk bounds without the Neural Tangent Kernel (NTK) or Polyak-{\L}ojasiewicz (PL) assumptions. In particular, we show oracle-type bounds which reveal that the generalisation and excess risk of GD are controlled by an interpolating network with the shortest GD path from initialisation (in a sense, an interpolating network with the smallest relative norm). While this was known for kernelised interpolants, our proof applies directly to networks trained by GD without intermediate kernelisation. At the same time, by relaxing the oracle inequalities developed here we recover existing NTK-based risk bounds in a straightforward way, which demonstrates that our analysis is tighter. Finally, unlike most NTK-based analyses, we focus on regression with label noise and show that GD with early stopping is consistent.

【15】 Learning to Estimate RIS-Aided mmWave Channels

Authors: Jiguang He, Henk Wymeersch, Marco Di Renzo, Markku Juntti. Affiliations: University of Oulu; Department of Electrical Engineering, Chalmers University of Technology; Université Paris-Saclay. Note: 5 pages, 7 figures, submitted to IEEE WCL for review. Link: https://arxiv.org/abs/2107.12631
Abstract: Inspired by the remarkable learning and prediction performance of deep neural networks (DNNs), we apply one special type of DNN framework, known as the model-driven deep unfolding neural network, to reconfigurable intelligent surface (RIS)-aided millimeter wave (mmWave) single-input multiple-output (SIMO) systems. We focus on uplink cascaded channel estimation, where known and fixed base-station combining and RIS phase-control matrices are considered for collecting observations. To boost the estimation performance and reduce the training overhead, the inherent sparsity of mmWave channels is leveraged in the deep unfolding method. It is verified that the proposed deep unfolding network architecture can outperform the least-squares (LS) method with relatively lower training overhead and online computational complexity.
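
A LISTA-style sketch of deep unfolding: each layer is one learned iteration of a sparsity-promoting gradient step with soft-thresholding. The real-valued matrix A stands in for the known observation model built from the fixed combining and RIS phase matrices; all sizes are illustrative.

```python
import torch
import torch.nn as nn

class UnfoldedEstimator(nn.Module):
    def __init__(self, A, n_layers=8):
        super().__init__()
        self.register_buffer("A", A)                 # [m, n] measurement matrix
        self.step = nn.Parameter(torch.full((n_layers,), 0.1))
        self.thresh = nn.Parameter(torch.full((n_layers,), 0.01))

    def forward(self, y):                            # y: [batch, m]
        x = torch.zeros(y.shape[0], self.A.shape[1], device=y.device)
        for t in range(len(self.step)):
            r = y - x @ self.A.T                     # residual
            x = x + self.step[t] * (r @ self.A)      # gradient step
            x = torch.sign(x) * torch.relu(x.abs() - self.thresh[t])  # shrink
        return x

m, n = 32, 128
A = torch.randn(m, n) / m ** 0.5
net = UnfoldedEstimator(A)
x_true = torch.zeros(16, n)
x_true[:, :5] = torch.randn(16, 5)                   # sparse channel support
y = x_true @ A.T + 0.01 * torch.randn(16, m)
loss = nn.functional.mse_loss(net(y), x_true)        # supervised training signal
loss.backward()
```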

【16】 Constraining dark matter annihilation with cosmic ray antiprotons using neural networks

Authors: Felix Kahlhoefer, Michael Korsmeier, Michael Krämer, Silvia Manconi, Kathrin Nippel. Affiliations: Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, Germany; The Oskar Klein Centre for Cosmoparticle Physics, Department of Physics, Stockholm University, Alba Nova, Stockholm, Sweden. Link: https://arxiv.org/abs/2107.12395
Abstract: The interpretation of data from indirect-detection experiments searching for dark matter annihilations requires computationally expensive simulations of cosmic-ray propagation. In this work we present a new method based on Recurrent Neural Networks that significantly accelerates simulations of secondary and dark-matter Galactic cosmic-ray antiprotons while achieving excellent accuracy. This approach allows for efficient profiling or marginalisation over the nuisance parameters of a cosmic-ray propagation model in order to perform parameter scans for a wide range of dark matter models. We identify importance sampling as particularly suitable for ensuring that the network is only evaluated in well-trained parameter regions. We present resulting constraints, using the most recent AMS-02 antiproton data, on several models of Weakly Interacting Massive Particles. The fully trained networks are released as DarkRayNet together with this work and achieve a speed-up of the runtime by at least two orders of magnitude compared to conventional approaches.
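
One plausible shape for such an emulator (our sketch, not the released DarkRayNet code): a GRU scans an energy grid and is conditioned on the propagation and dark-matter parameters through its initial hidden state, emitting a predicted flux value per energy bin.

```python
import torch
import torch.nn as nn

class SpectrumEmulator(nn.Module):
    def __init__(self, n_params, n_bins, hidden=64):
        super().__init__()
        self.embed = nn.Linear(n_params, hidden)
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
        self.register_buffer("log_E", torch.linspace(-1, 3, n_bins))  # toy grid

    def forward(self, params):                        # params: [batch, n_params]
        h0 = torch.tanh(self.embed(params)).unsqueeze(0)   # [1, batch, hidden]
        seq = self.log_E.expand(params.shape[0], -1).unsqueeze(-1)
        out, _ = self.gru(seq, h0)                    # scan over energy bins
        return self.head(out).squeeze(-1)             # predicted log-flux per bin

model = SpectrumEmulator(n_params=8, n_bins=40)
flux = model(torch.rand(16, 8))
print(flux.shape)                                     # torch.Size([16, 40])
```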

Other (14 papers)

【1】 The social dilemma in AI development and why we have to solve it

Authors: Inga Strümke, Marija Slavkovik, Vince Madai. Link: https://arxiv.org/abs/2107.12977
Abstract: While the demand for ethical artificial intelligence (AI) systems increases, the number of unethical uses of AI accelerates, even though there is no shortage of ethical guidelines. We argue that a main underlying cause for this is that AI developers face a social dilemma in AI development ethics, preventing the widespread adoption of ethical best practices. We define the social dilemma for AI development and describe why the current crisis in AI development ethics cannot be solved without relieving AI developers of their social dilemma. We argue that AI development must be professionalised to overcome the social dilemma, and discuss how medicine can be used as a template in this process.

【2】 Channel-Wise Early Stopping without a Validation Set via NNK Polytope Interpolation

Authors: David Bonet, Antonio Ortega, Javier Ruiz-Hidalgo, Sarath Shekkizhar. Affiliations: Universitat Politecnica de Catalunya, Barcelona, Spain; University of Southern California, Los Angeles, USA. Note: Submitted to APSIPA 2021. Link: https://arxiv.org/abs/2107.12972
Abstract: State-of-the-art neural network architectures continue to scale in size and deliver impressive generalization results, although this comes at the expense of limited interpretability. In particular, a key challenge is to determine when to stop training a model, as this has a significant impact on generalization. Convolutional neural networks (ConvNets) comprise high-dimensional feature spaces formed by the aggregation of multiple channels, where analyzing intermediate data representations and the model's evolution can be challenging owing to the curse of dimensionality. We present channel-wise DeepNNK (CW-DeepNNK), a novel channel-wise generalization estimate based on non-negative kernel regression (NNK) graphs, with which we perform local polytope interpolation on low-dimensional channels. This method leads to instance-based interpretability of both the learned data representations and the relationship between channels. Motivated by our observations, we use CW-DeepNNK to propose a novel early stopping criterion that (i) does not require a validation set, (ii) is based on a task performance metric, and (iii) allows stopping to be reached at different points for each channel. Our experiments demonstrate that the proposed method has advantages over the standard criterion based on validation-set performance.
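
A hedged stand-in for the channel-wise criterion: below, NNK polytope interpolation is replaced by a plain leave-one-out k-nearest-neighbour vote, keeping the idea of a per-channel score computed on training data (no validation set) and tracked across epochs; all names are ours.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def channel_loo_accuracy(acts, labels, k=5):
    """acts: [n_samples, d] flattened activations of ONE channel.
    labels: non-negative integer class labels (needed for np.bincount)."""
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(acts)
    _, idx = nbrs.kneighbors(acts)
    votes = labels[idx[:, 1:]]                 # drop the self-neighbour
    pred = np.array([np.bincount(v).argmax() for v in votes])
    return (pred == labels).mean()

def channel_should_stop(history, patience=3):
    """Freeze/stop a channel once its score stops improving."""
    return (len(history) > patience
            and max(history[-patience:]) <= max(history[:-patience]))
```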

【3】 Individual Survival Curves with Conditional Normalizing Flows

Authors: Guillaume Ausset, Tom Ciffreo, Francois Portier, Stephan Clémençon, Timothée Papin. Affiliations: Télécom Paris, Saclay, France; BNP Paribas, Paris, France. Note: IEEE DSAA '21. Link: https://arxiv.org/abs/2107.12825
Abstract: Survival analysis, or time-to-event modelling, is a classical statistical problem that has garnered a lot of interest for its practical use in epidemiology, demographics and actuarial sciences. Recent advances on the subject from the machine learning point of view have been concerned with precise per-individual predictions instead of population studies, driven by the rise of individualized medicine. We introduce here a conditional-normalizing-flow-based estimate of the time-to-event density as a way to model highly flexible and individualized conditional survival distributions. We use a novel hierarchical formulation of normalizing flows to enable efficient fitting of flexible conditional distributions without overfitting, and show how the normalizing-flow formulation can be efficiently adapted to the censored setting. We experimentally validate the proposed approach on a synthetic dataset as well as four open medical datasets, and on an example of a common financial problem.
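
The censored objective can be written compactly: uncensored events contribute the log-density log f(t|x), while right-censored ones contribute the log-survival log S(t|x) = log(1 - F(t|x)). In the sketch below, `flow` is any conditional density model; its `log_prob` and `cdf` methods are assumed for illustration, not a specific library's API.

```python
import torch

def survival_nll(flow, t, x, censored):
    """censored: bool tensor, True where the event time is right-censored."""
    log_f = flow.log_prob(t, context=x)                      # log f(t | x)
    surv = torch.clamp(1.0 - flow.cdf(t, context=x), min=1e-8)
    log_s = torch.log(surv)                                  # log S(t | x)
    return -torch.where(censored, log_s, log_f).mean()
```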

【4】 Bayesian Optimisation for Sequential Experimental Design with Applications in Additive Manufacturing

Authors: Mimi Zhang, Andrew Parnell, Dermot Brabazon, Alessio Benavoli. Affiliations: School of Computer Science and Statistics, Trinity College Dublin, Ireland; Hamilton Institute, Ireland; School of Mechanical & Manufacturing Engineering, Dublin City University, Ireland; I-Form Advanced Manufacturing Research Centre, Science Foundation Ireland. Link: https://arxiv.org/abs/2107.12809
Abstract: Bayesian optimization (BO) is an approach to globally optimizing black-box objective functions that are expensive to evaluate. BO-powered experimental design has found wide application in materials science, chemistry, experimental physics, drug development, etc. This work aims to bring attention to the benefits of applying BO in designing experiments and to provide a BO manual, covering both methodology and software, for the convenience of anyone who wants to apply or learn BO. In particular, we briefly explain the BO technique, review all the applications of BO in additive manufacturing, compare and exemplify the features of different open BO libraries, and unlock new potential applications of BO to other types of data (e.g., preferential output). This article is aimed at readers with some understanding of Bayesian methods, but not necessarily with knowledge of additive manufacturing; the software performance overview and implementation instructions are instrumental for any experimental-design practitioner. Moreover, our review in the field of additive manufacturing highlights the current knowledge and technological trends of BO.
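
A minimal BO loop of the kind such a manual covers: a GP surrogate plus expected improvement on a toy 1-D objective (assumptions: noiseless evaluations, minimisation, scikit-learn and SciPy available).

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(Xc, gp, y_best):
    mu, sd = gp.predict(Xc, return_std=True)
    sd = np.maximum(sd, 1e-9)
    z = (y_best - mu) / sd
    return (y_best - mu) * norm.cdf(z) + sd * norm.pdf(z)

def objective(x):                         # stand-in for an expensive experiment
    return np.sin(3 * x) + 0.5 * x ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(3, 1))       # initial design
y = objective(X).ravel()
cand = np.linspace(-2, 2, 400).reshape(-1, 1)
for _ in range(10):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    x_next = cand[np.argmax(expected_improvement(cand, gp, y.min()))]
    X = np.vstack([X, [x_next]])          # run the "experiment" at x_next
    y = np.append(y, objective(x_next))
print("best design:", X[y.argmin()].item(), "value:", y.min())
```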

【5】 Improving ClusterGAN Using Self-Augmented Information Maximization of Disentangling Latent Spaces

Authors: Tanmoy Dam, Sreenatha G. Anavatti, Hussein A. Abbass. Affiliations: School of Engineering and Information Technology, University of New South Wales Canberra, Australia. Note: Under review at IEEE TNNLS. Link: https://arxiv.org/abs/2107.12706
Abstract: The Latent Space Clustering in Generative Adversarial Networks (ClusterGAN) method has been successful with high-dimensional data. However, the method assumes uniformly distributed priors during the generation of modes, which is a restrictive assumption for real-world data and causes loss of diversity in the generated modes. In this paper, we propose self-augmentation information maximization improved ClusterGAN (SIMI-ClusterGAN) to learn the distinctive priors from the data. The proposed SIMI-ClusterGAN consists of four deep neural networks: a self-augmentation prior network, a generator, a discriminator, and a clustering inference autoencoder. The proposed method has been validated using seven benchmark data sets and has shown improved performance over state-of-the-art methods. To demonstrate the superiority of SIMI-ClusterGAN on imbalanced datasets, we discuss two imbalanced conditions on MNIST: a one-class-imbalanced and a three-class-imbalanced case. The results highlight the advantages of SIMI-ClusterGAN.

【6】 COPS: Controlled Pruning Before Training Starts

Authors: Paul Wimmer, Jens Mehnert, Alexandru Condurache. Affiliations: Robert Bosch GmbH & Lübeck University, Germany. Note: Accepted by the International Joint Conference on Neural Networks (IJCNN) 2021. Link: https://arxiv.org/abs/2107.12673
Abstract: State-of-the-art deep neural network (DNN) pruning techniques, applied one-shot before training starts, evaluate sparse architectures with the help of a single criterion, called a pruning score. Pruning weights based on a solitary score works well for some architectures and pruning rates but may also fail for others. As a common baseline for pruning scores, we introduce the notion of a generalized synaptic score (GSS). In this work we do not concentrate on a single pruning criterion, but provide a framework for combining arbitrary GSSs to create more powerful pruning strategies. These COmbined Pruning Scores (COPS) are obtained by solving a constrained optimization problem. Optimizing for more than one score prevents the sparse network from overly specializing on an individual task, and thus COntrols Pruning before training Starts. The combinatorial optimization problem given by COPS is relaxed to a linear program (LP). This LP is solved analytically and determines a solution for COPS. Furthermore, an algorithm to compute it numerically for two scores is proposed and evaluated. Solving COPS in this way has lower complexity than the best general LP solver. In our experiments we compared pruning with COPS against state-of-the-art methods for different network architectures and image classification tasks, and obtained improved results.

【7】 Greedy Gradient Ensemble for Robust Visual Question Answering

Authors: Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian. Affiliations: Key Lab of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Kingsoft Cloud, Beijing, China; Peng Cheng Laboratory, Shenzhen, China; Cloud BU, Huawei Technologies, Shenzhen, China. Note: Accepted by ICCV 2021. Code: this https URL. Link: https://arxiv.org/abs/2107.12651
Abstract: Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information. As a result, they suffer from performance drops on out-of-distribution data and provide inadequate visual explanations. Based on an experimental analysis of existing robust VQA methods, we stress that the language bias in VQA comes from two aspects, i.e., distribution bias and shortcut bias. We further propose a new de-biasing framework, Greedy Gradient Ensemble (GGE), which combines multiple biased models for unbiased base-model learning. With the greedy strategy, GGE forces the biased models to over-fit the biased data distribution first, thus making the base model pay more attention to examples that are hard to solve with biased models. The experiments demonstrate that our method makes better use of visual information and achieves state-of-the-art performance on the diagnostic dataset VQA-CP without using extra annotations.

【8】 Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting

Authors: Changan Wang, Qingyu Song, Boshen Zhang, Yabiao Wang, Ying Tai, Xuyi Hu, Chengjie Wang, Jilin Li, Jiayi Ma, Yang Wu. Affiliations: Tencent Youtu Lab; Applied Research Center (ARC), Tencent PCG; Department of Electronic & Electrical Engineering, University College London, United Kingdom; Electronic Information School, Wuhan University, Wuhan, China. Note: To appear in ICCV 2021. Link: https://arxiv.org/abs/2107.12619
Abstract: Recently, the problem of inaccurate learning targets in crowd counting has drawn increasing attention. Inspired by a few pioneering works, we solve this problem by predicting the indices of pre-defined interval bins of counts instead of the count values themselves. However, an inappropriate interval setting might make the count-error contributions from different intervals extremely imbalanced, leading to inferior counting performance. Therefore, we propose a novel count-interval partition criterion called Uniform Error Partition (UEP), which always keeps the expected counting-error contributions equal for all intervals to minimize the prediction risk. Then, to mitigate the discretization errors inevitably introduced in the count quantization process, we propose another criterion called Mean Count Proxies (MCP). The MCP criterion selects the best count proxy for each interval to represent its count value during inference, making the overall expected discretization error of an image nearly negligible. As far as we are aware, this work is the first to delve into such a classification task, and it ends up with a promising solution for count interval partition. Following the above two theoretically demonstrated criteria, we propose a simple yet effective model termed Uniform Error Partition Network (UEPNet), which achieves state-of-the-art performance on several challenging datasets. The code will be available at: https://github.com/TencentYoutuResearch/CrowdCounting-UEPNet.
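
A simplified 1-D illustration of the two criteria: here UEP's equal expected-error contribution is approximated by equal empirical mass (quantile edges), a simplification of the paper's risk, and MCP represents each interval by the mean count of the samples falling in it.

```python
import numpy as np

def uep_edges(counts, n_bins):
    """Equal-mass interval edges (quantiles) as a stand-in for equal
    expected-error contributions."""
    edges = np.quantile(counts, np.linspace(0, 1, n_bins + 1))
    return np.unique(edges)                        # drop degenerate edges

def mcp_proxies(counts, edges):
    """Mean count of the samples falling in each interval."""
    bins = np.clip(np.digitize(counts, edges[1:-1]), 0, len(edges) - 2)
    return np.array([counts[bins == b].mean() for b in range(len(edges) - 1)])

counts = np.random.default_rng(0).lognormal(mean=2.0, sigma=1.0, size=10_000)
edges = uep_edges(counts, n_bins=8)
proxies = mcp_proxies(counts, edges)
print(np.round(edges, 1))      # interval boundaries
print(np.round(proxies, 1))    # per-interval count proxies used at inference
```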

【9】 Cross-speaker Style Transfer with Prosody Bottleneck in Neural Speech Synthesis

Authors: Shifeng Pan, Lei He. Affiliations: Microsoft, China. Note: In Proceedings of INTERSPEECH 2021. Link: https://arxiv.org/abs/2107.12562
Abstract: Cross-speaker style transfer is crucial to applications of multi-style and expressive speech synthesis at scale. It does not require the target speakers to be experts in expressing all styles, nor does it require collecting corresponding recordings for model training. However, the performance of existing style transfer methods is still far behind the needs of real applications. The root causes are mainly twofold. Firstly, the style embedding extracted from a single reference speech can hardly provide fine-grained and appropriate prosody information for arbitrary text to synthesize. Secondly, in these models the content/text, prosody, and speaker timbre are usually highly entangled; it is therefore not realistic to expect a satisfying result when freely combining these components, such as transferring speaking style between speakers. In this paper, we propose a cross-speaker style transfer text-to-speech (TTS) model with an explicit prosody bottleneck. The prosody bottleneck builds up the kernels accounting for speaking style robustly, and disentangles the prosody from content and speaker timbre, therefore guaranteeing high-quality cross-speaker style transfer. Evaluation results show that the proposed method even achieves on-par performance with the source speaker's speaker-dependent (SD) model in objective measurements of prosody, and significantly outperforms the cycle-consistency and GMVAE-based baselines in both objective and subjective evaluations.

【10】 Circular-Symmetric Correlation Layer based on FFT

Authors: Bahar Azari, Deniz Erdoğmuş. Affiliations: Department of Electrical & Computer Engineering, Northeastern University, Boston, MA, USA. Link: https://arxiv.org/abs/2107.12480
Abstract: Despite the vast success of standard planar convolutional neural networks, they are not the most efficient choice for analyzing signals that lie on an arbitrarily curved manifold, such as a cylinder. The problem arises when one performs a planar projection of these signals, which inevitably causes them to be distorted or broken where there is valuable information. We propose a Circular-symmetric Correlation Layer (CCL) based on the formalism of roto-translation-equivariant correlation on the continuous group $S^1 \times \mathbb{R}$, and implement it efficiently using the well-known Fast Fourier Transform (FFT) algorithm. We showcase a performance analysis of a general network equipped with CCL on various recognition and classification tasks and datasets. The PyTorch package implementation of CCL is provided online.
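
The identity the layer builds on is plain circular correlation computed via the FFT, corr = IFFT(FFT(x) · conj(FFT(w))); the NumPy check below verifies it against the naive O(n²) definition.

```python
import numpy as np

def circular_correlation(x, w):
    n = len(x)
    W = np.fft.fft(w, n)                  # zero-pad filter to signal length
    return np.real(np.fft.ifft(np.fft.fft(x) * np.conj(W)))

def naive_circular_correlation(x, w):
    n = len(x)
    w = np.pad(w, (0, n - len(w)))
    return np.array([np.dot(x, np.roll(w, s)) for s in range(n)])

rng = np.random.default_rng(0)
x, w = rng.normal(size=32), rng.normal(size=5)
assert np.allclose(circular_correlation(x, w), naive_circular_correlation(x, w))
```

On a cylinder, the same trick would apply along the circular ($S^1$) axis only, with ordinary correlation along the unbounded axis.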

【11】 Towards Low-Latency Energy-Efficient Deep SNNs via Attention-Guided Compression

Authors: Souvik Kundu, Gourav Datta, Massoud Pedram, Peter A. Beerel. Affiliations: University of Southern California. Note: 10 pages, 8 figures, 5 tables. Link: https://arxiv.org/abs/2107.12445
Abstract: Deep spiking neural networks (SNNs) have emerged as a potential alternative to traditional deep learning frameworks, due to their promise of increased compute efficiency on event-driven neuromorphic hardware. However, to perform well on complex vision applications, most SNN training frameworks yield large inference latency, which translates to increased spike activity and reduced energy efficiency. Hence, minimizing average spike activity while preserving accuracy in deep SNNs remains a significant challenge and opportunity. This paper presents a non-iterative SNN training technique that achieves ultra-high compression with reduced spiking activity while maintaining high inference accuracy. In particular, our framework first uses the attention maps of an uncompressed meta-model to yield compressed ANNs. This step can be tuned to support both irregular and structured channel pruning to leverage computational benefits over a broad range of platforms. The framework then performs sparse-learning-based supervised SNN training using direct inputs. During training, it jointly optimizes the SNN weight, threshold, and leak parameters to drastically minimize the number of time steps required while retaining compression. To evaluate the merits of our approach, we performed experiments with variants of VGG and ResNet on both CIFAR-10 and CIFAR-100, and with VGG16 on Tiny-ImageNet. The SNN models generated through the proposed technique yield SOTA compression ratios of up to 33.4x with no significant drop in accuracy compared to their unpruned baseline counterparts. Compared to existing SNN pruning methods, we achieve up to 8.3x higher compression with improved accuracy.

【12】 Relational Boosted Regression Trees

Authors: Sonia Cromp, Alireza Samadian, Kirk Pruhs. Affiliations: University of Pittsburgh. Link: https://arxiv.org/abs/2107.12373
Abstract: Many tasks use data housed in relational databases to train boosted regression tree models. In this paper, we give a relational adaptation of the greedy algorithm for training boosted regression trees. For the subproblem of calculating the sum of squared residuals of the dataset, which dominates the runtime of the boosting algorithm, we provide a $(1 + \epsilon)$-approximation using the tensor sketch technique. Employing this approximation within the relational boosted regression trees algorithm leads to learning similar model parameters, but with asymptotically better runtime.

【13】 Statistical Guarantees for Fairness Aware Plug-In Algorithms

Authors: Drona Khurana, Srinivasan Ravichandran, Sparsh Jain, Narayanan Unny Edakunni. Note: Accepted at the workshop on Socially Responsible Machine Learning, ICML 2021. Link: https://arxiv.org/abs/2107.12783
Abstract: A plug-in algorithm to estimate Bayes-optimal classifiers for fairness-aware binary classification was proposed in (Menon & Williamson, 2018). However, the statistical efficacy of their approach has not been established. We prove that the plug-in algorithm is statistically consistent. We also derive finite-sample guarantees associated with learning the Bayes-optimal classifiers via the plug-in algorithm. Finally, we propose a protocol that modifies the plug-in approach, so as to simultaneously guarantee fairness and differential privacy with respect to a binary feature deemed sensitive.
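
A sketch of the plug-in recipe for one fairness notion (demographic parity): estimate eta(x) = P(Y=1|X=x) with any probabilistic classifier, then apply group-dependent thresholds. The grid search below is our stand-in for the thresholds the cited paper derives from the Bayes-optimal form; the synthetic scores are illustrative only.

```python
import numpy as np

def fit_group_thresholds(eta, group, target_rate=None, grid=None):
    """Pick a threshold per group so positive rates match a common target."""
    grid = np.linspace(0, 1, 201) if grid is None else grid
    if target_rate is None:
        target_rate = (eta >= 0.5).mean()          # overall positive rate
    thresholds = {}
    for g in np.unique(group):
        rates = np.array([(eta[group == g] >= t).mean() for t in grid])
        thresholds[g] = grid[np.argmin(np.abs(rates - target_rate))]
    return thresholds

def predict_fair(eta, group, thresholds):
    t = np.array([thresholds[g] for g in group])
    return (eta >= t).astype(int)

rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=2000)
eta = np.clip(rng.beta(2, 2, size=2000) + 0.15 * group, 0, 1)  # biased scores
th = fit_group_thresholds(eta, group)
yhat = predict_fair(eta, group, th)
for g in (0, 1):
    print(g, round(yhat[group == g].mean(), 3))    # near-equal positive rates
```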

【14】 Proof: Accelerating Approximate Aggregation Queries with Expensive Predicates

Authors: Daniel Kang, John Guibas, Peter Bailis, Tatsunori Hashimoto, Yi Sun, Matei Zaharia. Affiliations: Stanford University; University of Chicago. Link: https://arxiv.org/abs/2107.12525
Abstract: Given a dataset $\mathcal{D}$, we are interested in computing the mean of a subset of $\mathcal{D}$ which matches a predicate. The proposed algorithm leverages stratified sampling and proxy models to efficiently compute this statistic given a sampling budget $N$. In this document, we theoretically analyze the algorithm and show that the MSE of the estimate decays at rate $O(N_1^{-1} + N_2^{-1} + N_1^{1/2}N_2^{-3/2})$, where $N = K \cdot N_1 + N_2$ for some integer constant $K$, and $K \cdot N_1$ and $N_2$ represent the number of samples used in Stage 1 and Stage 2 of the algorithm, respectively. Hence, if a constant fraction of the total sample budget $N$ is allocated to each stage, we achieve a mean squared error of $O(N^{-1})$, which matches the rate of the optimal stratified sampling algorithm given a priori knowledge of the predicate positive rate and standard deviation per stratum.
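
A toy two-stage stratified estimate of such a filtered mean (a sketch of the setting, not the paper's exact allocation): Stage 1 takes pilot samples per proxy-defined stratum to estimate predicate-positive rates and standard deviations, and Stage 2 spends the remaining budget with a Neyman-style allocation.

```python
import numpy as np

rng = np.random.default_rng(0)
K, N1, N2 = 4, 200, 2000                  # strata, stage-1 per stratum, stage-2
values = rng.normal(5.0, 2.0, size=100_000)
matches = rng.random(100_000) < np.clip(values / 10, 0, 1)  # "expensive" predicate
strata = np.minimum((values // 3).astype(int).clip(0), K - 1)  # proxy-based strata

pilot_sd, pilot_rate = np.ones(K), np.zeros(K)
for k in range(K):                         # Stage 1: pilot sampling
    idx = rng.choice(np.where(strata == k)[0], size=N1)
    m = matches[idx]
    pilot_rate[k] = m.mean()
    if m.sum() > 1:
        pilot_sd[k] = values[idx][m].std()

alloc = pilot_sd * np.sqrt(pilot_rate)     # Neyman-style weights
alloc = np.maximum((alloc / alloc.sum() * N2).astype(int), 1)
weights = np.bincount(strata, minlength=K) / len(values)    # stratum sizes

est_num = est_den = 0.0
for k in range(K):                         # Stage 2: final sampling
    idx = rng.choice(np.where(strata == k)[0], size=alloc[k])
    m = matches[idx]
    mu = values[idx][m].mean() if m.any() else 0.0
    est_num += weights[k] * pilot_rate[k] * mu
    est_den += weights[k] * pilot_rate[k]

print("estimate:", est_num / est_den, "truth:", values[matches].mean())
```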
