
Machine Learning arXiv Daily Digest [8.23]

By 公众号-arXiv每日学术速递
Published 2021-08-24 16:41:37

Update! The H5 page now supports collapsible abstracts for a better reading experience. Visit arxivdaily.com, which covers CS, physics, mathematics, economics, statistics, finance, biology, and electrical engineering, with search, bookmarking, and other features!

cs.LG: 64 papers today.

Graph (graph learning | graph neural networks | graph optimization, etc.) (3 papers)

【1】 Network-wide link travel time and station waiting time estimation using automatic fare collection data: A computational graph approach
Link: https://arxiv.org/abs/2108.09292

Authors: Jinlei Zhang, Feng Chen, Lixing Yang, Wei Ma, Guangyin Jin, Ziyou Gao
Affiliations: Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Haidian District, Beijing, China
Abstract: Urban rail transit (URT) systems play a dominant role in many megacities such as Beijing and Hong Kong. Given their importance and complexity, public agencies have a pressing need to better understand URT system performance. This paper focuses on an essential yet difficult problem: estimating network-wide link travel times and station waiting times from automatic fare collection (AFC) data, which helps characterize the system-wide real-time operational state. Emerging data-driven techniques, such as computational graph (CG) models from the machine learning field, provide a new solution to this problem. In this study, we first formulate a data-driven estimation optimization framework to estimate link travel times and station waiting times, and then cast the estimation optimization model into a CG framework to solve the optimization problem and obtain the estimates. The methodology is verified on a synthetic URT network and applied to a real-world URT network using synthetic and real-world AFC data, respectively. Results show the robustness and effectiveness of the CG-based framework. To the best of our knowledge, this is the first application of CGs to URT, and the study provides critical insights for better understanding the operational state of URT systems.
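
The paper's core move, fitting latent link travel times and station waiting times so that their sums reproduce observed AFC journey times, maps naturally onto automatic differentiation. Below is a minimal sketch of that formulation in PyTorch; the toy network, incidence matrices, and observed journey times are made-up assumptions for illustration, not the authors' data or exact model.

```python
import torch

# Toy URT network: 4 links, 3 stations, 5 observed passenger journeys (AFC data).
# route[i, j] = 1 if journey i traverses link j; board[i, k] = 1 if journey i boards at station k.
route = torch.tensor([[1., 1., 0., 0.],
                      [0., 1., 1., 0.],
                      [0., 0., 1., 1.],
                      [1., 1., 1., 0.],
                      [0., 1., 1., 1.]])
board = torch.tensor([[1., 0., 0.],
                      [0., 1., 0.],
                      [0., 0., 1.],
                      [1., 0., 0.],
                      [0., 1., 0.]])
observed = torch.tensor([9., 10., 11., 13., 15.])  # tap-in to tap-out times (minutes)

# Latent quantities to estimate: per-link travel time and per-station waiting time.
link_tt = torch.full((4,), 3.0, requires_grad=True)
wait_tt = torch.full((3,), 2.0, requires_grad=True)

opt = torch.optim.Adam([link_tt, wait_tt], lr=0.05)
for step in range(2000):
    opt.zero_grad()
    predicted = route @ link_tt + board @ wait_tt   # journey-time decomposition
    loss = ((predicted - observed) ** 2).mean()     # least-squares estimation objective
    loss.backward()                                 # gradients flow through the computational graph
    opt.step()
    with torch.no_grad():                           # travel/waiting times must stay non-negative
        link_tt.clamp_(min=0.0)
        wait_tt.clamp_(min=0.0)

print(link_tt.detach(), wait_tt.detach())
```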

【2】 TabGNN: Multiplex Graph Neural Network for Tabular Data Prediction
Link: https://arxiv.org/abs/2108.09127

Authors: Xiawei Guo, Yuhan Quan, Huan Zhao, Quanming Yao, Yong Li, Weiwei Tu
Affiliations: 4Paradigm Inc., China; Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University, Beijing, China
Abstract: Tabular data prediction (TDP) is one of the most popular industrial applications, and various methods have been designed to improve prediction performance. However, existing works mainly focus on feature interactions and ignore sample relations; for example, users with the same education level might have a similar ability to repay debt. In this work, by explicitly and systematically modeling sample relations, we propose TabGNN, a novel framework based on recently popular graph neural networks (GNNs). Specifically, we first construct a multiplex graph to model the multifaceted sample relations, and then design a multiplex graph neural network to learn an enhanced representation for each sample. To integrate TabGNN with the tabular solution in our company, we concatenate the learned embeddings with the original ones, which are then fed to the prediction models inside the solution. Experiments on eleven TDP datasets from various domains, including both classification and regression tasks, show that TabGNN consistently improves performance over AutoFE, the tabular solution in 4Paradigm.
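
TabGNN's two key moves, building one graph per sample relation and concatenating the GNN embedding with the original features before the prediction model, can be sketched with plain message passing. Below is a hedged toy version (not the authors' code); the two relations, the mean-aggregation layer, and the MLP head are illustrative assumptions.

```python
import torch
import torch.nn as nn

n, d = 6, 8                        # samples (rows of the table) and raw feature size
x = torch.randn(n, d)              # encoded tabular features

# Multiplex graph: one adjacency per sample relation (e.g., same education level, same city).
adj_edu  = (torch.rand(n, n) > 0.6).float()
adj_city = (torch.rand(n, n) > 0.6).float()

class RelationLayer(nn.Module):
    """Mean-aggregates neighbour features over one relation, then projects."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.proj = nn.Linear(d_in, d_out)
    def forward(self, x, adj):
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.proj(adj @ x / deg))

layers = nn.ModuleList([RelationLayer(d, 16), RelationLayer(d, 16)])
h_edu, h_city = layers[0](x, adj_edu), layers[1](x, adj_city)

# Fuse per-relation embeddings, then concatenate with the original features
# before the downstream prediction model, mirroring the integration step.
h = torch.cat([x, h_edu, h_city], dim=1)           # (n, d + 16 + 16)
head = nn.Sequential(nn.Linear(d + 32, 16), nn.ReLU(), nn.Linear(16, 1))
print(head(h).shape)                                # torch.Size([6, 1])
```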

【3】 Twitter User Representation using Weakly Supervised Graph Embedding
Link: https://arxiv.org/abs/2108.08988

Authors: Tunazzina Islam, Dan Goldwasser
Affiliations: Department of Computer Science, Purdue University, West Lafayette, Indiana
Note: Accepted at the 16th International AAAI Conference on Web and Social Media (ICWSM-2022), direct accept from the May 2021 submission; 12 pages
Abstract: Social media platforms provide convenient means for users to participate in multiple online activities around diverse content and to create fast, widespread interactions. However, this rapidly growing access has also increased the diversity of information, and characterizing user types to understand the lifestyle decisions people share on social media is challenging. In this paper, we propose a weakly supervised graph embedding based framework for understanding user types. We evaluate user embeddings learned with weak supervision over well-being-related tweets from Twitter, focusing on 'Yoga' and 'Keto diet'. Experiments on real-world datasets demonstrate that the proposed framework outperforms the baselines for detecting user types. Finally, we illustrate data analysis on different types of users (e.g., practitioner vs. promotional) from our dataset. While we focus on lifestyle-related tweets (i.e., yoga, keto), our method for constructing user representations readily generalizes to other domains.

Transformer (1 paper)

【1】 Is it Time to Replace CNNs with Transformers for Medical Images?
Link: https://arxiv.org/abs/2108.09038

Authors: Christos Matsoukas, Johan Fredin Haslum, Magnus Söderberg, Kevin Smith
Affiliations: KTH Royal Institute of Technology, Stockholm, Sweden; Science for Life Laboratory, Stockholm, Sweden; AstraZeneca, Gothenburg, Sweden
Note: Originally published at the ICCV 2021 Workshop on Computer Vision for Automated Medical Diagnosis (CVAMD)
Abstract: Convolutional Neural Networks (CNNs) have reigned for a decade as the de facto approach to automated medical image diagnosis. Recently, vision transformers (ViTs) have appeared as a competitive alternative to CNNs, yielding similar levels of performance while possessing several interesting properties that could prove beneficial for medical imaging tasks. In this work, we explore whether it is time to move to transformer-based models or if we should keep working with CNNs - can we trivially switch to transformers? If so, what are the advantages and drawbacks of switching to ViTs for medical image diagnosis? We consider these questions in a series of experiments on three mainstream medical image datasets. Our findings show that, while CNNs perform better when trained from scratch, off-the-shelf vision transformers using default hyperparameters are on par with CNNs when pretrained on ImageNet, and outperform their CNN counterparts when pretrained using self-supervision.
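
The paper's main recipe, taking an off-the-shelf ImageNet-pretrained ViT with default hyperparameters and fine-tuning it on a medical dataset, is straightforward to reproduce in outline. A sketch using the timm library follows; the model name, synthetic stand-in data, and training settings are illustrative assumptions, not the authors' exact configuration.

```python
import timm
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Off-the-shelf ViT, ImageNet-pretrained, with a fresh head for a binary diagnosis task.
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=2)

# Placeholder stand-in for a medical image dataset (224x224 RGB crops + labels).
images, labels = torch.randn(32, 3, 224, 224), torch.randint(0, 2, (32,))
loader = DataLoader(TensorDataset(images, labels), batch_size=8, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for x, y in loader:               # one epoch of standard fine-tuning
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```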

GANs | Adversarial | Attacks | Generation (8 papers)

【1】 An Adaptable Deep Learning-Based Intrusion Detection System to Zero-Day Attacks
Link: https://arxiv.org/abs/2108.09199

Authors: Mahdi Soltani, Behzad Ousat, Mahdi Jafari Siavoshani, Amir Hossein Jahangir
Affiliations: Department of Computer Engineering, Sharif University of Technology
Abstract: The intrusion detection system (IDS) is an essential element of security monitoring in computer networks. An IDS distinguishes malicious traffic from benign traffic and determines the attack types targeting the assets of the organization. The main challenge for an IDS is facing new (i.e., zero-day) attacks and separating them from benign traffic and existing types of attacks. Despite the power of deep learning-based IDSes in auto-extracting high-level features and their independence from the time-consuming and costly signature extraction process, this challenge still exists in the new generation of IDSes. In this paper, we propose a framework for deep learning-based IDSes that addresses new attacks. It is the first approach in the security scope to use deep novelty-based classifiers alongside traditional clustering based on a specialized layer of deep structures. Additionally, we introduce DOC++, an updated version of DOC, as a deep novelty-based classifier. We also employ the Deep Intrusion Detection (DID) framework for the preprocessing phase, which improves the ability of deep learning algorithms to detect content-based attacks. We compare four different algorithms (DOC, DOC++, OpenMax, and AutoSVM) as the novelty classifier of the framework, and use both the CIC-IDS2017 and CSE-CIC-IDS2018 datasets for evaluation. Our results show that DOC++ is the best implementation of the open set recognition module. Moreover, the completeness and homogeneity of the clustering and post-training phases show that this model is good enough for the supervised labeling and updating phases.
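
Open-set classification in the spirit of DOC, which the framework builds on, replaces the softmax with per-class sigmoids and rejects a sample as novel (a potential zero-day attack) when no known class is confident enough. A hedged sketch of that rejection rule is below; the network, threshold, and feature sizes are illustrative assumptions, not the paper's DOC++ implementation.

```python
import torch
import torch.nn as nn

n_known = 5                                    # known attack/benign traffic classes
net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, n_known))

def classify_open_set(flow_features, threshold=0.5):
    """1-vs-rest sigmoid scores; reject as novel if no known class is confident."""
    scores = torch.sigmoid(net(flow_features))    # (batch, n_known), independent per class
    conf, pred = scores.max(dim=1)
    pred[conf < threshold] = -1                   # -1 = novelty, i.e. suspected zero-day
    return pred

batch = torch.randn(10, 64)                       # placeholder flow feature vectors
print(classify_open_set(batch))
```

Rejected samples would then flow into the clustering and post-training phases the abstract describes, where novel traffic is labeled and the model updated.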

【2】 SplitGuard: Detecting and Mitigating Training-Hijacking Attacks in Split Learning 标题:SplitGuard:检测和减轻分裂学习中的训练劫持攻击 链接:https://arxiv.org/abs/2108.09052

作者:Ege Erdogan,Alptekin Kupcu,A. Ercument Cicek 机构: of Computer EngineeringKoc¸ UniversityIstanbul, of Computer EngineeringBilkent UniversityAnkara 备注:under review 摘要:分布式深度学习框架(如分割学习)最近被提出,以使一组参与者能够在不共享原始数据的情况下协作训练深度神经网络。拆分学习通过在客户端和服务器之间划分神经网络来实现这一目标,这样客户端计算初始层集,服务器计算其余层。然而,这种方法为恶意服务器尝试窃取客户端的私有数据引入了唯一的攻击向量:服务器可以引导客户端模型来学习其选择的任务。通过已经提出的一个具体示例,此类训练劫持攻击对分割学习客户端的数据隐私造成了重大风险。在本文中,我们提出了SplitGuard,这是一种方法,通过这种方法,split learning客户端可以检测到它是否被训练劫持攻击作为目标。我们通过实验评估了它的有效性,并详细讨论了与它的使用相关的各个方面。我们得出结论,SplitGuard可以有效地检测训练劫持攻击,同时最大限度地减少对手恢复的信息量。 摘要:Distributed deep learning frameworks, such as split learning, have recently been proposed to enable a group of participants to collaboratively train a deep neural network without sharing their raw data. Split learning in particular achieves this goal by dividing a neural network between a client and a server so that the client computes the initial set of layers, and the server computes the rest. However, this method introduces a unique attack vector for a malicious server attempting to steal the client's private data: the server can direct the client model towards learning a task of its choice. With a concrete example already proposed, such training-hijacking attacks present a significant risk for the data privacy of split learning clients. In this paper, we propose SplitGuard, a method by which a split learning client can detect whether it is being targeted by a training-hijacking attack or not. We experimentally evaluate its effectiveness, and discuss in detail various points related to its use. We conclude that SplitGuard can effectively detect training-hijacking attacks while minimizing the amount of information recovered by the adversaries.
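
Split learning itself, the protocol SplitGuard defends, is easy to sketch: the client runs the first layers, sends the cut-layer activations, and receives gradients back from the server. The toy client/server split below illustrates that protocol under invented sizes; it is not the SplitGuard detection logic.

```python
import torch
import torch.nn as nn

client_net = nn.Sequential(nn.Linear(32, 64), nn.ReLU())          # layers held by the client
server_net = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))

opt_c = torch.optim.SGD(client_net.parameters(), lr=0.1)
opt_s = torch.optim.SGD(server_net.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))           # private client data

# Client forward pass up to the cut layer; only activations cross the network.
smashed = client_net(x)
sent = smashed.detach().requires_grad_(True)                      # what the server receives

# Server completes the forward/backward pass and returns the cut-layer gradient.
loss = criterion(server_net(sent), y)
opt_s.zero_grad()
loss.backward()
opt_s.step()

# Client applies the returned gradient to finish its own backward pass.
# A training-hijacking server would return gradients serving its own task instead,
# which is exactly the behavior SplitGuard tries to detect.
opt_c.zero_grad()
smashed.backward(sent.grad)
opt_c.step()
```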

【3】 AdvDrop: Adversarial Attack to DNNs by Dropping Information
Link: https://arxiv.org/abs/2108.09034

Authors: Ranjie Duan, Yuefeng Chen, Dantong Niu, Yun Yang, A. K. Qin, Yuan He
Affiliations: Swinburne University of Technology, Australia; Alibaba Group, China; University of California, Berkeley, USA
Note: Accepted to ICCV 2021
Abstract: Humans can easily recognize visual objects with lost information: even when most details are lost and only the contour is reserved, e.g., in cartoons. However, in terms of the visual perception of Deep Neural Networks (DNNs), the ability to recognize abstract objects (visual objects with lost information) is still a challenge. In this work, we investigate this issue from an adversarial viewpoint: will the performance of DNNs decrease even for images that lose only a little information? To this end, we propose a novel adversarial attack, named AdvDrop, which crafts adversarial examples by dropping existing information from images. Previously, most adversarial attacks explicitly add extra disturbing information to clean images. In contrast to previous work, our approach explores the adversarial robustness of DNN models from a novel perspective, by dropping imperceptible details to craft adversarial examples. We demonstrate the effectiveness of AdvDrop through extensive experiments, and show that this new type of adversarial example is more difficult to defend against with current defense systems.

【4】 UnSplit: Data-Oblivious Model Inversion, Model Stealing, and Label Inference Attacks Against Split Learning
Link: https://arxiv.org/abs/2108.09033

Authors: Ege Erdogan, Alptekin Kupcu, A. Ercument Cicek
Affiliations: Department of Computer Engineering, Koç University; Department of Computer Engineering, Bilkent University; Computational Biology Department, Carnegie Mellon University
Note: Under review
Abstract: Training deep neural networks requires large-scale data, which often forces users to work in a distributed or outsourced setting, accompanied by privacy concerns. The split learning framework aims to address this concern by splitting up the model between the client and the server. The idea is that since the server does not have access to the client's part of the model, the scheme supposedly provides privacy. We show that this is not true via two novel attacks. (1) We show that an honest-but-curious split learning server, equipped only with knowledge of the client's neural network architecture, can recover the input samples and obtain a functionally similar model to the client model, without the client being able to detect the attack. (2) Furthermore, we show that if split learning is used naively to protect the training labels, the honest-but-curious server can infer the labels with perfect accuracy. We test our attacks using three benchmark datasets and investigate various properties of the overall system that affect the attacks' effectiveness. Our results show that the plaintext split learning paradigm can pose serious security risks and provide no more than a false sense of security.
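
The first attack can be pictured as plain optimization: knowing only the client architecture, the server fits a clone of the client model together with a dummy input so that the clone's output matches the activations it actually received. A simplified single-example sketch is below; the architecture, joint (rather than alternating) optimization, and settings are assumptions for illustration.

```python
import torch
import torch.nn as nn

def make_client():                       # the architecture is public in this threat model
    return nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))

true_client = make_client()              # the victim's private weights
x_private = torch.randn(1, 32)           # the victim's private input
observed = true_client(x_private).detach()   # smashed data the server actually sees

clone = make_client()                    # server's surrogate of the client model
x_guess = torch.zeros(1, 32, requires_grad=True)
opt = torch.optim.Adam([x_guess, *clone.parameters()], lr=0.01)

# Joint recovery (the paper alternates between the two objectives): match the
# observed activations with both a recovered input and a functionally similar clone.
for _ in range(5000):
    opt.zero_grad()
    loss = ((clone(x_guess) - observed) ** 2).mean()
    loss.backward()
    opt.step()

print(torch.nn.functional.mse_loss(x_guess, x_private).item())  # reconstruction error
```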

【5】 Discriminative Domain-Invariant Adversarial Network for Deep Domain Generalization
Link: https://arxiv.org/abs/2108.08995

Authors: Mohammad Mahfujur Rahman, Clinton Fookes, Sridha Sridharan
Affiliations: School of Electrical Engineering and Robotics, Queensland University of Technology, Queensland, Australia
Note: Submitted to Computer Vision and Image Understanding (CVIU)
Abstract: Domain generalization approaches aim to learn a domain-invariant prediction model for unknown target domains from multiple training source domains with different distributions. Significant efforts have recently been committed to broad domain generalization, which is a challenging and topical problem in the machine learning and computer vision communities. Most previous domain generalization approaches assume that the conditional distribution remains the same across the source domains and learn a domain-invariant model by minimizing the marginal distributions. However, the assumption of a stable conditional distribution across the training source domains does not really hold in practice. The hyperplane learned from the source domains will easily misclassify samples scattered at the boundary of clusters or far from their corresponding class centres. To address the above two drawbacks, we propose a discriminative domain-invariant adversarial network (DDIAN) for domain generalization. The discriminativeness of the features is guaranteed through a discriminative feature module, and domain-invariant features are guaranteed through the global-domain and local-subdomain alignment modules. Extensive experiments on several benchmarks show that DDIAN achieves better prediction on unseen target data during training compared to state-of-the-art domain generalization approaches.

【6】 ASAT: Adaptively Scaled Adversarial Training in Time Series
Link: https://arxiv.org/abs/2108.08976

Authors: Zhiyuan Zhang, Wei Li, Ruihan Bao, Keiko Harimoto, Yunfang Wu, Xu Sun
Affiliations: Peking University, China; Mizuho Securities Co., Ltd., Japan
Note: Accepted to appear in the Workshop on Machine Learning in Finance (KDD-MLF) 2021
Abstract: Adversarial training is a method for enhancing neural networks to improve robustness against adversarial examples. Besides the security concerns of potential adversarial examples, adversarial training can also improve the performance of neural networks, train robust neural networks, and provide interpretability for neural networks. In this work, we take the first step to introduce adversarial training into time series analysis, taking the finance field as an example. Rethinking existing research on adversarial training, we propose adaptively scaled adversarial training (ASAT) for time series analysis, treating data at different time slots with time-dependent importance weights. Experimental results show that the proposed ASAT can improve both the accuracy and the adversarial robustness of neural networks. Besides enhancing neural networks, we also propose a dimension-wise adversarial sensitivity indicator to probe the sensitivities and importance of input dimensions. With the proposed indicator, we can explain the decision bases of black-box neural networks.
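
ASAT's central idea, scaling adversarial perturbations with time-dependent importance weights over the input window, can be illustrated with an FGSM-style inner step. The decaying weight schedule and tiny model below are invented for illustration and are not the paper's exact scaling rule.

```python
import torch
import torch.nn as nn

T, d = 20, 4                                    # time steps and features per step
model = nn.Sequential(nn.Flatten(), nn.Linear(T * d, 2))
criterion = nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Time-dependent importance weights: recent slots get a larger adversarial budget.
time_weights = torch.linspace(0.2, 1.0, T).view(1, T, 1)
eps = 0.05

x, y = torch.randn(64, T, d), torch.randint(0, 2, (64,))

for _ in range(100):
    # Inner step: FGSM perturbation, scaled per time slot.
    x_adv = x.clone().requires_grad_(True)
    criterion(model(x_adv), y).backward()
    delta = eps * time_weights * x_adv.grad.sign()

    # Outer step: train on the adaptively scaled adversarial examples.
    opt.zero_grad()
    loss = criterion(model((x + delta).detach()), y)
    loss.backward()
    opt.step()
```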

【7】 Application of Adversarial Examples to Physical ECG Signals
Link: https://arxiv.org/abs/2108.08972

Authors: Taiga Ono, Takeshi Sugawara, Jun Sakuma, Tatsuya Mori
Affiliations: Waseda University; The University of Electro-Communications; University of Tsukuba; RIKEN AIP
Abstract: This work aims to assess the realism and feasibility of adversarial attacks against cardiac diagnosis systems powered by machine learning algorithms. To this end, we introduce adversarial beats, adversarial perturbations tailored specifically against electrocardiogram (ECG) beat-by-beat classification systems. We first formulate an algorithm to generate adversarial examples for an ECG classification neural network model and study its attack success rate. Next, to evaluate its feasibility in a physical environment, we mount a hardware attack by designing a malicious signal generator that injects adversarial beats into ECG sensor readings. To the best of our knowledge, our work is the first to evaluate the proficiency of adversarial examples for ECGs in a physical setup. Our real-world experiments demonstrate that adversarial beats successfully manipulated the diagnosis results 3-5 times out of 40 attempts over the course of 2 minutes. Finally, we discuss the overall feasibility and impact of the attack by clearly defining the motives and constraints of expected attackers along with our experimental results.

【8】 Mitigating Greenhouse Gas Emissions Through Generative Adversarial Networks Based Wildfire Prediction
Link: https://arxiv.org/abs/2108.08952

Authors: Sifat Chowdhury, Kai Zhu, Yu Zhang
Affiliations: University of California
Abstract: Over the past decade, the number of wildfires has increased significantly around the world, especially in the State of California. The high concentration of greenhouse gases (GHG) emitted by wildfires aggravates global warming, which further increases the risk of more fires. Therefore, accurate prediction of wildfire occurrence greatly helps in preventing large-scale and long-lasting wildfires and reducing the consequent GHG emissions. Various methods have been explored for wildfire risk prediction. However, the complex correlations among many natural and human factors and wildfire ignition make the prediction task very challenging. In this paper, we develop a deep learning based data augmentation approach for wildfire risk prediction. We build a dataset consisting of diverse features responsible for fire ignition and utilize a conditional tabular generative adversarial network to explore the underlying patterns between the target risk levels and all involved features. For fair and comprehensive comparisons, we compare our proposed scheme with five baseline methods, and it outperforms most of them. To corroborate its robustness, we have also tested the performance of our method on another dataset, which likewise showed better efficiency. By adopting the proposed method, we can take preventive wildfire-mitigation strategies to reduce global GHG emissions.
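
The augmentation step uses a conditional tabular GAN to learn the joint pattern between fire-risk levels and ignition features. A hedged sketch with the open-source ctgan package is below; the column names and data are placeholders, and the paper's actual feature set and training setup may differ.

```python
import numpy as np
import pandas as pd
from ctgan import CTGAN

# Placeholder ignition-feature table; the real dataset has many more factors.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "temperature": rng.normal(25, 8, 1000),
    "humidity": rng.uniform(5, 95, 1000),
    "wind_speed": rng.gamma(2.0, 3.0, 1000),
    "land_use": rng.choice(["forest", "grass", "urban"], 1000),
    "risk_level": rng.choice(["low", "medium", "high"], 1000),
})

# Fit the conditional tabular GAN; discrete columns must be declared explicitly.
gan = CTGAN(epochs=50)
gan.fit(df, discrete_columns=["land_use", "risk_level"])

# Sample synthetic rows to augment scarce high-risk examples before training a predictor.
synthetic = gan.sample(500)
print(synthetic.head())
```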

Semi/Weakly/Un/Supervised | Uncertainty | Active Learning (5 papers)

【1】 Unsupervised Domain-adaptive Hash for Networks
Link: https://arxiv.org/abs/2108.09136

Authors: Tao He, Lianli Gao, Jingkuan Song, Yuan-Fang Li
Affiliations: Department of Data Science and AI, Monash University, Clayton, Victoria; Center for Future Media, University of Electronic Science and Technology of China, Chengdu, Sichuan
Abstract: Abundant real-world data can be naturally represented by large-scale networks, which demands efficient and effective learning algorithms. At the same time, labels may only be available for some networks, which demands that these algorithms be able to adapt to unlabeled networks. Domain-adaptive hash learning has enjoyed considerable success in the computer vision community in many practical tasks due to its lower cost in both retrieval time and storage footprint. However, it has not been applied to multiple-domain networks. In this work, we bridge this gap by developing an unsupervised domain-adaptive hash learning method for networks, dubbed UDAH. Specifically, we develop four task-specific yet correlated components: (1) network structure preservation via a hard groupwise contrastive loss, (2) relaxation-free supervised hashing, (3) cross-domain intersected discriminators, and (4) semantic center alignment. We conduct a wide range of experiments to evaluate the effectiveness and efficiency of our method on a range of tasks including link prediction, node classification, and neighbor recommendation. Our evaluation results demonstrate that our model achieves better performance than the state-of-the-art conventional discrete embedding methods over all the tasks.

【2】 Semi-supervised Network Embedding with Differentiable Deep Quantisation
Link: https://arxiv.org/abs/2108.09128

Authors: Tao He, Lianli Gao, Jingkuan Song, Yuan-Fang Li
Abstract: Learning accurate low-dimensional embeddings for a network is a crucial task, as it facilitates many downstream network analytics tasks. For large networks, the trained embeddings often require a significant amount of space to store, making storage and processing a challenge. Building on our previous work on semi-supervised network embedding, we develop d-SNEQ, a differentiable DNN-based quantisation method for network embedding. d-SNEQ incorporates a rank loss to equip the learned quantisation codes with rich high-order information and is able to substantially compress the size of trained embeddings, thus reducing storage footprint and accelerating retrieval speed. We also propose a new evaluation metric, path prediction, to fairly and more directly evaluate model performance on the preservation of high-order information. Our evaluation on four real-world networks of diverse characteristics shows that d-SNEQ outperforms a number of state-of-the-art embedding methods in link prediction, path prediction, node classification, and node recommendation while being far more space- and time-efficient.

【3】 A fuzzy-rough uncertainty measure to discover bias encoded explicitly or implicitly in features of structured pattern classification datasets
Link: https://arxiv.org/abs/2108.09098

Authors: Gonzalo Nápoles, Lisa Koutsoviti Koumeri
Affiliations: Department of Cognitive Science & Artificial Intelligence, Tilburg University, The Netherlands; Business Informatics Research Group, Hasselt University, Belgium
Abstract: The need to measure bias encoded in tabular data that is used to solve pattern recognition problems is widely recognized by academia, legislators, and enterprises alike. In previous work, we proposed a bias quantification measure, called fuzzy-rough uncertainty, which relies on fuzzy-rough set theory. The intuition dictates that protected features should not change the fuzzy-rough boundary regions of a decision class significantly. The extent to which this happens is a proxy for bias, expressed as uncertainty in a decision-making context. Our measure's main advantage is that it does not depend on any machine learning prediction model, only on a distance function. In this paper, we extend our study by exploring the existence of bias encoded implicitly in non-protected features, as defined by the correlation between protected and unprotected attributes. This analysis leads to four scenarios that domain experts should evaluate before deciding how to tackle bias. In addition, we conduct a sensitivity analysis to determine the fuzzy operators and distance function that best capture change in the boundary regions.
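
The measure compares fuzzy-rough boundary regions computed with and without a protected feature. A compact numpy sketch of that comparison is below, using min/max approximations with a distance-based similarity relation; the toy data and this particular choice of operators are assumptions, since the paper studies several operator and distance combinations.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((50, 4))                      # instances described by normalized features
y = rng.integers(0, 2, 50)                   # decision classes

def similarity(X):
    """Fuzzy similarity relation: 1 - mean absolute feature distance."""
    diff = np.abs(X[:, None, :] - X[None, :, :]).mean(axis=2)
    return 1.0 - diff

def boundary_region(X, y):
    sim = similarity(X)
    same = (y[:, None] == y[None, :])
    # Lower approximation: how separated each instance is from all other-class instances.
    lower = np.where(~same, 1.0 - sim, 1.0).min(axis=1)
    # Upper approximation: how close each instance is to any same-class instance.
    upper = np.where(same, sim, 0.0).max(axis=1)
    return upper - lower                     # per-instance boundary membership

# Uncertainty with all features vs. with the protected feature (say, column 0) removed:
full = boundary_region(X, y).mean()
without_protected = boundary_region(X[:, 1:], y).mean()
print(full, abs(full - without_protected))   # the change is the bias proxy
```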

【4】 Semi-supervised learning for medical image classification using imbalanced training data
Link: https://arxiv.org/abs/2108.08956

Authors: Tri Huynh, Aiden Nibali, Zhen He
Affiliations: Department of Computer Science and Information Technology, La Trobe University, Melbourne, Australia
Note: 28 pages, 7 figures
Abstract: Medical image classification is often challenging for two reasons: a lack of labelled examples due to expensive and time-consuming annotation protocols, and imbalanced class labels due to the relative scarcity of disease-positive individuals in the wider population. Semi-supervised learning (SSL) methods exist for dealing with a lack of labels, but they generally do not address the problem of class imbalance. In this study we propose Adaptive Blended Consistency Loss (ABCL), a drop-in replacement for the consistency loss in perturbation-based SSL methods. ABCL counteracts data skew by adaptively mixing the target class distribution of the consistency loss in accordance with class frequency. Our experiments with ABCL reveal improvements in unweighted average recall on two different imbalanced medical image classification datasets when compared with existing consistency losses that are not designed to counteract class imbalance.
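
ABCL replaces the usual consistency target with one whose class distribution is adaptively mixed according to class frequency, so that rare (e.g., disease-positive) classes are not washed out. A schematic version follows; the blending rule below is an assumption made for illustration, not the paper's exact formula.

```python
import torch
import torch.nn.functional as F

n_classes = 3
class_counts = torch.tensor([900.0, 80.0, 20.0])              # imbalanced training set
inv_freq = (1.0 / class_counts) / (1.0 / class_counts).sum()  # rare classes get more mass

def blended_consistency_loss(logits_weak, logits_strong, alpha=0.3):
    """Consistency between two augmented views, with the target distribution
    blended toward the inverse-frequency prior to counteract class imbalance."""
    target = F.softmax(logits_weak.detach(), dim=1)
    target = (1 - alpha) * target + alpha * inv_freq          # adaptive blending step
    log_pred = F.log_softmax(logits_strong, dim=1)
    return F.kl_div(log_pred, target, reduction="batchmean")

weak, strong = torch.randn(16, n_classes), torch.randn(16, n_classes)
print(blended_consistency_loss(weak, strong))
```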

【5】 Local Latin Hypercube Refinement for Multi-objective Design Uncertainty Optimization
Link: https://arxiv.org/abs/2108.08890

Authors: Can Bogoclu, Dirk Roos, Tamara Nestorović
Affiliations: Ruhr-Universität Bochum, Institute of Computational Engineering, Mechanics of Adaptive Systems, Bochum, Germany; Niederrhein University of Applied Sciences, Institute of Modelling
Note: The code repository can be found at this https URL
Abstract: Optimizing the reliability and robustness of a design is important but often unaffordable due to high sample requirements. Surrogate models based on statistical and machine learning methods are used to increase sample efficiency. However, for higher-dimensional or multi-modal systems, surrogate models may also require a large number of samples to achieve good results. We propose a sequential sampling strategy for the surrogate-based solution of multi-objective reliability-based robust design optimization problems. The proposed local Latin hypercube refinement (LoLHR) strategy is model-agnostic and can be combined with any surrogate model, because there is no free lunch but possibly a budget one. The proposed method is compared to stationary sampling as well as other strategies from the literature. Gaussian process and support vector regression are both used as surrogate models. Empirical evidence is presented, showing that LoLHR achieves on average better results compared to other surrogate-based strategies on the tested examples.
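
The refinement idea, spending new Latin hypercube samples locally around promising candidates instead of uniformly across the design space, can be sketched with scipy's QMC tools. Below is a toy version on a 2-D test function; the shrink factor and selection rule are assumptions, not the published algorithm.

```python
import numpy as np
from scipy.stats import qmc

def objective(x):                          # toy stand-in for an expensive simulation
    return np.sin(5 * x[:, 0]) + (x[:, 1] - 0.5) ** 2

sampler = qmc.LatinHypercube(d=2, seed=0)
X = sampler.random(n=20)                   # initial space-filling design
f = objective(X)

for _ in range(5):
    # Pick the current best point and refine with a local Latin hypercube around it.
    best = X[f.argmin()]
    half_width = 0.1
    lo = np.clip(best - half_width, 0, 1)
    hi = np.clip(best + half_width, 0, 1)
    local = qmc.scale(qmc.LatinHypercube(d=2).random(n=10), lo, hi)
    X = np.vstack([X, local])              # grow the design; a surrogate would be refit here
    f = np.concatenate([f, objective(local)])

print(X.shape, f.min())
```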

Transfer | Zero/Few/One-Shot | Adaptation (2 papers)

【1】 Combination of Transfer Learning, Recursive Learning and Ensemble Learning for Multi-Day Ahead COVID-19 Cases Prediction in India using Gated Recurrent Unit Networks
Link: https://arxiv.org/abs/2108.09131

Authors: Debasrita Chakraborty, Debayan Goswami, Susmita Ghosh, Ashish Ghosh, Jonathan H. Chan
Affiliations: Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India; Department of Computer Science and Engineering, Jadavpur University, Kolkata, India; King Mongkut's University of Technology Thonburi, Bangkok, Thailand
Note: 8 pages, 7 figures
Abstract: The current COVID-19 pandemic has put a huge challenge on the Indian health infrastructure. With more and more people getting affected during the second wave, hospitals were over-burdened, running out of supplies and oxygen. In this scenario, predicting the number of COVID-19 cases beforehand might have helped in the better utilization of limited resources and supplies. This manuscript deals with predicting new COVID-19 cases, new deaths, and total active cases multiple days in advance. The proposed method uses gated recurrent unit networks as the main predicting model. A study is conducted by building four models that are pre-trained on data from four different countries (United States of America, Brazil, Spain, and Bangladesh) and fine-tuned or retrained on India's data. Since the four chosen countries experienced different types of infection curves, the pre-training provides transfer learning to the models, incorporating diverse situations. Each of the four models then gives multi-day-ahead predictions for the Indian test data using a recursive learning method. The final prediction comes from an ensemble of the predictions of different model combinations. The combination of two countries, Spain and Brazil, achieves the best performance among all combinations, as well as compared to other traditional regression models.
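
The pipeline combines three standard ingredients: pretrain a GRU forecaster on another country's curve, fine-tune it on the target country's data, and roll it forward recursively for multi-day predictions (several such fine-tuned models are then ensembled). A compact sketch of those steps follows, with synthetic series standing in for the real case counts.

```python
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
    def forward(self, x):                        # x: (batch, window, 1)
        out, _ = self.gru(x)
        return self.head(out[:, -1])             # next-day prediction

def train(model, series, window=14, epochs=200, lr=1e-3):
    xs = torch.stack([series[i:i+window] for i in range(len(series)-window)]).unsqueeze(-1)
    ys = torch.stack([series[i+window] for i in range(len(series)-window)]).unsqueeze(-1)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.mse_loss(model(xs), ys).backward()
        opt.step()

source = torch.sin(torch.linspace(0, 12, 300)) + 1.0   # stands in for another country's curve
target = torch.sin(torch.linspace(1, 13, 120)) + 1.0   # stands in for India's (shorter) series

model = GRUForecaster()
train(model, source)                  # pre-training (transfer learning)
train(model, target, epochs=50)       # fine-tuning on the target country

# Recursive multi-day-ahead prediction: feed each prediction back in as input.
history = target[-14:].reshape(1, 14, 1)
forecast = []
for _ in range(7):
    nxt = model(history)
    forecast.append(nxt.item())
    history = torch.cat([history[:, 1:], nxt.reshape(1, 1, 1)], dim=1)
print(forecast)
```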

【2】 Few Shot Activity Recognition Using Variational Inference
Link: https://arxiv.org/abs/2108.08990

Authors: Neeraj Kumar, Siddhansh Narang
Affiliations: Indian Institute of Technology, Delhi, India; RAeS (Royal Aeronautical Society), UK
Note: Accepted at the IJCAI 2021 3rd International Workshop on Deep Learning for Human Activity Recognition. arXiv admin note: text overlap with arXiv:1611.09630, arXiv:1909.07945 by other authors
Abstract: There has been remarkable progress in recent years in learning models that can recognize novel classes with only a few labeled examples. Few-shot learning (FSL) for action recognition is the challenging task of recognizing novel action categories that are represented by only a few instances in the training data. We propose a novel variational inference based architectural framework (HF-AR) for few-shot activity recognition. Our framework leverages volume-preserving Householder flows to learn a flexible posterior distribution over the novel classes, which results in better performance compared to state-of-the-art few-shot approaches for human activity recognition. Our architecture consists of a base model and an adapter model. The base model is trained on seen classes and computes an embedding that represents the spatial and temporal insights extracted from the input video, e.g., a combination of Resnet-152 and an LSTM-based encoder-decoder model. The adapter model applies a series of Householder transformations to compute a flexible posterior distribution that lends higher accuracy in the few-shot approach. Extensive experiments on three well-known datasets, UCF101, HMDB51, and Something-Something-V2, demonstrate similar or better performance on 1-shot and 5-shot classification compared to state-of-the-art few-shot approaches that use only the RGB frame sequence as input. To the best of our knowledge, we are the first to explore variational inference along with Householder transformations to capture the full-rank covariance matrix of the posterior distribution for few-shot learning in activity recognition.
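
The adapter's building block is the Householder transformation H = I - 2vv^T/||v||^2, a volume-preserving reflection; chaining several of them enriches a diagonal Gaussian posterior with full-rank covariance structure. A minimal sketch follows; the sizes and base distribution are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HouseholderFlow(nn.Module):
    """Chain of Householder reflections H = I - 2 v v^T / ||v||^2. Each step is
    volume-preserving, so the log-determinant of its Jacobian is zero."""
    def __init__(self, dim, n_flows=4):
        super().__init__()
        self.vs = nn.Parameter(torch.randn(n_flows, dim))
    def forward(self, z):
        for v in self.vs:
            v = v / v.norm()
            z = z - 2.0 * (z @ v).unsqueeze(-1) * v   # reflect z across the hyperplane orthogonal to v
        return z

dim = 16
mu, log_sigma = torch.zeros(dim), torch.zeros(dim)     # diagonal Gaussian base posterior
flow = HouseholderFlow(dim)

eps = torch.randn(128, dim)
z0 = mu + log_sigma.exp() * eps                        # reparameterized base sample
z = flow(z0)                                           # flexible posterior sample
print(z.shape, torch.allclose(z.norm(dim=1), z0.norm(dim=1)))  # reflections preserve norms
```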

Reinforcement Learning (3 papers)

【1】 Reinforcement Learning to Optimize Lifetime Value in Cold-Start Recommendation
Link: https://arxiv.org/abs/2108.09141

Authors: Luo Ji, Qin Qi, Bingqing Han, Hongxia Yang
Affiliations: DAMO Academy, Alibaba Group, Hangzhou, China; Center for Data Science, AAIS, Peking University, Beijing, China
Note: Accepted by CIKM 2021
Abstract: Recommender systems play a crucial role in modern E-commerce platforms. Due to the lack of historical interactions between users and items, cold-start recommendation is a challenging problem. In order to alleviate the cold-start issue, most existing methods introduce content and contextual information as auxiliary information. Nevertheless, these methods assume the recommended items behave steadily over time, while in a typical E-commerce scenario, items generally have very different performance throughout their life period. In such a situation, it would be beneficial to consider the long-term return from the item perspective, which is usually ignored in conventional methods. Reinforcement learning (RL) naturally fits such a long-term optimization problem, in which the recommender could identify high-potential items, proactively allocate more user impressions to boost their growth, and therefore improve the multi-period cumulative gains. Inspired by this idea, we model the process as a Partially Observable and Controllable Markov Decision Process (POC-MDP) and propose an actor-critic RL framework (RL-LTV) to incorporate item lifetime values (LTV) into the recommendation. In RL-LTV, the critic studies historical trajectories of items and predicts the future LTV of fresh items, while the actor suggests a score-based policy that maximizes the future LTV expectation. Scores suggested by the actor are then combined with classical ranking scores in a dual-rank framework, so the recommendation is balanced with the LTV consideration. Our method outperforms the strong live baseline with relative improvements of 8.67% and 18.03% in IPV and GMV of cold-start items, on one of the largest E-commerce platforms.

【2】 Plug and Play, Model-Based Reinforcement Learning
Link: https://arxiv.org/abs/2108.08960

Authors: Majid Abdolshah, Hung Le, Thommen Karimpanal George, Sunil Gupta, Santu Rana, Svetha Venkatesh
Affiliations: Applied Artificial Intelligence Institute (A²I²)
Abstract: Sample-efficient generalisation of reinforcement learning approaches has always been a challenge, especially for complex scenes with many components. In this work, we introduce Plug and Play Markov Decision Processes, an object-based representation that allows zero-shot integration of new objects from known object classes. This is achieved by representing the global transition dynamics as a union of local transition functions, each with respect to one active object in the scene. Transition dynamics from an object class can be pre-learnt and thus would be ready to use in a new environment. Each active object is also endowed with its reward function. Since there is no central reward function, addition or removal of objects can be handled efficiently by only updating the reward functions of the objects involved. A new transfer learning mechanism is also proposed to adapt the reward function in such cases. Experiments show that our representation can achieve sample-efficiency in a variety of set-ups.
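
The decomposition at the heart of the idea, a global transition that is a union of per-object local transitions plus per-object rewards, can be expressed as a tiny object registry. The sketch below is a schematic of that composition with made-up object classes and dynamics, not the authors' environment.

```python
from dataclasses import dataclass

@dataclass
class ObjectState:
    position: float

class ObjectClass:
    """Local dynamics and reward for one object class; pre-learnable and reusable."""
    def __init__(self, speed: float, goal: float):
        self.speed, self.goal = speed, goal
    def transition(self, state: ObjectState, action: int) -> ObjectState:
        direction = 1.0 if action == 1 else -1.0
        return ObjectState(state.position + direction * self.speed)
    def reward(self, state: ObjectState) -> float:
        return -abs(state.position - self.goal)

class PlugAndPlayEnv:
    """Global transition = union of per-object local transitions; total reward is the
    sum of per-object rewards, so adding/removing an object touches only its entry."""
    def __init__(self):
        self.objects = {}                 # name -> (class dynamics, current state)
    def add(self, name, cls, state):
        self.objects[name] = (cls, state)
    def remove(self, name):
        del self.objects[name]
    def step(self, action):
        total_reward = 0.0
        for name, (cls, state) in self.objects.items():
            new_state = cls.transition(state, action)
            self.objects[name] = (cls, new_state)
            total_reward += cls.reward(new_state)
        return total_reward

env = PlugAndPlayEnv()
env.add("cart", ObjectClass(speed=0.5, goal=3.0), ObjectState(0.0))
env.add("drone", ObjectClass(speed=1.0, goal=-2.0), ObjectState(1.0))  # zero-shot addition
print(env.step(action=1))
```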

【3】 Explainable Deep Reinforcement Learning Using Introspection in a Non-episodic Task
Link: https://arxiv.org/abs/2108.08911

Authors: Angel Ayala, Francisco Cruz, Bruno Fernandes, Richard Dazeley
Affiliations: Escola Politécnica de Pernambuco, Universidade de Pernambuco, Recife, Brasil; School of Information Technology, Deakin University, Geelong, Australia; Escuela de Ingeniería, Universidad Central de Chile, Santiago, Chile
Abstract: Explainable reinforcement learning allows artificial agents to explain their behavior in a human-like manner, aiming at non-expert end-users. An efficient alternative for creating explanations is to use an introspection-based method that transforms Q-values into probabilities of success, used as the basis to explain the agent's decision-making process. This approach has been effectively used in episodic and discrete scenarios; however, computing the probability of success in non-episodic and more complex environments has not yet been addressed. In this work, we adapt the introspection method for use in a non-episodic task and try it in a continuous Atari game scenario solved with the Rainbow algorithm. Our initial results show that the probability of success can be computed directly from the Q-values for all possible actions.
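
The paper's own introspection transform is not reproduced here; the sketch below only illustrates the interface, mapping Q-values to a per-action "probability of success" that an explanation can be built from. The min-max normalization against assumed reward bounds is a stand-in chosen for illustration.

```python
import numpy as np

def success_probability(q_values, r_min=-1.0, r_max=1.0):
    """Map Q-values to [0, 1] as a stand-in probability of success per action.
    This normalization against the reward bounds is an illustrative choice;
    the paper derives its own introspection-based transform."""
    q = np.asarray(q_values, dtype=float)
    return np.clip((q - r_min) / (r_max - r_min), 0.0, 1.0)

q_values = [0.8, -0.2, 0.4]                 # Q(s, a) for the three available actions
probs = success_probability(q_values)
best = int(np.argmax(probs))
print(f"Chose action {best}: estimated success probability {probs[best]:.0%}")
# The user-facing explanation: "I chose this action because it succeeds ~90% of the time."
```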

Hierarchical Learning (1 paper)

【1】 Mobility-Aware Cluster Federated Learning in Hierarchical Wireless Networks
Link: https://arxiv.org/abs/2108.09103

Authors: Chenyuan Feng, Howard H. Yang, Deshun Hu, Zhiwei Zhao, Tony Q. S. Quek, Geyong Min
Affiliations: Harbin Institute of Technology
Abstract: Implementing federated learning (FL) algorithms in wireless networks has garnered a wide range of attention. However, few works have considered the impact of user mobility on learning performance. To fill this research gap, we first develop a theoretical model to characterize the hierarchical federated learning (HFL) algorithm in wireless networks, where mobile users may roam across multiple edge access points, leading to incomplete or inconsistent FL training. Second, we provide a convergence analysis of HFL with user mobility. Our analysis proves that the learning performance of HFL deteriorates drastically with highly mobile users, and this decline is exacerbated by a small number of participants and large divergences among users' local data distributions. To circumvent these issues, we propose a mobility-aware cluster federated learning (MACFL) algorithm that redesigns the access mechanism, local update rule, and model aggregation scheme. Finally, we provide experiments to evaluate the learning performance of HFL and our MACFL. The results show that MACFL enhances learning performance, especially in three cases: users with non-independent and identically distributed data, users with high mobility, and scenarios with a small number of users.
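
Hierarchical federated learning aggregates client models at edge access points first and then at the cloud; MACFL redesigns those rules for mobile users. The plain (non-mobility-aware) two-level FedAvg below is a baseline sketch with invented cluster sizes, useful for seeing what the paper modifies.

```python
import copy
import torch
import torch.nn as nn

def make_model():
    return nn.Linear(10, 2)

def average_state_dicts(states, weights):
    """Sample-count-weighted average of model parameters."""
    total = sum(weights)
    return {k: sum(w * s[k] for w, s in zip(weights, states)) / total for k in states[0]}

global_model = make_model()

# Two edge access points, each serving a few (possibly roaming) users.
edge_clusters = [[5, 3], [4, 4, 2]]            # local dataset sizes per user

edge_states, edge_sizes = [], []
for cluster in edge_clusters:
    user_states = []
    for n_samples in cluster:
        local = copy.deepcopy(global_model)    # user downloads the current model
        opt = torch.optim.SGD(local.parameters(), lr=0.1)
        x, y = torch.randn(n_samples, 10), torch.randint(0, 2, (n_samples,))
        for _ in range(5):                     # a few local SGD steps
            opt.zero_grad()
            nn.functional.cross_entropy(local(x), y).backward()
            opt.step()
        user_states.append(local.state_dict())
    # Edge aggregation: weighted average within the access point.
    edge_states.append(average_state_dicts(user_states, cluster))
    edge_sizes.append(sum(cluster))

# Cloud aggregation: weighted average across edges.
global_model.load_state_dict(average_state_dicts(edge_states, edge_sizes))
```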

Medical (2 papers)

【1】 Detection of Illicit Drug Trafficking Events on Instagram: A Deep Multimodal Multilabel Learning Approach
Link: https://arxiv.org/abs/2108.08920

Authors: Chuanbo Hu, Minglei Yin, Bin Liu, Xin Li, Yanfang Ye
Affiliations: West Virginia University, Morgantown, WV, USA; Case Western Reserve University, Ohio, USA; University of Notre Dame, Indiana, USA
Note: 9 pages, 5 figures
Abstract: Social media such as Instagram and Twitter have become important platforms for marketing and selling illicit drugs. Detection of online illicit drug trafficking has become critical to combat the online trade of illicit drugs. However, the legal status often varies spatially and temporally; even for the same drug, federal and state legislation can have different regulations about its legality. Meanwhile, more drug trafficking events are disguised as a novel form of advertising comments, leading to information heterogeneity. Accordingly, accurate detection of illicit drug trafficking events (IDTEs) from social media has become even more challenging. In this work, we conduct the first systematic study on fine-grained detection of IDTEs on Instagram. We propose a deep multimodal multilabel learning (DMML) approach to detect IDTEs and demonstrate its effectiveness on a newly constructed dataset called multimodal IDTE (MM-IDTE). Specifically, our model takes text and image data as input and combines multimodal information to predict multiple labels of illicit drugs. Inspired by the success of BERT, we have developed a self-supervised multimodal bidirectional transformer by jointly fine-tuning pretrained text and image encoders. We have constructed a large-scale dataset, MM-IDTE, with manually annotated multiple drug labels to support fine-grained detection of illicit drugs. Extensive experimental results on the MM-IDTE dataset show that the proposed DMML methodology can accurately detect IDTEs even in the presence of special characters and style changes attempting to evade detection.
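
The DMML recipe, encode the post's text and image separately, fuse, and predict multiple drug labels with a binary cross-entropy head, can be outlined quickly. The encoders below are small stand-ins; the paper instead jointly fine-tunes pretrained BERT-style text and image encoders.

```python
import torch
import torch.nn as nn

n_labels = 6                                   # e.g., different illicit drug types

class DMMLSketch(nn.Module):
    def __init__(self, text_dim=128, img_dim=256, hidden=128):
        super().__init__()
        # Stand-ins for pretrained text (BERT-like) and image encoders.
        self.text_enc = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.img_enc = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, n_labels)
    def forward(self, text_feats, img_feats):
        fused = torch.cat([self.text_enc(text_feats), self.img_enc(img_feats)], dim=1)
        return self.head(fused)                # one logit per drug label

model = DMMLSketch()
text_feats, img_feats = torch.randn(8, 128), torch.randn(8, 256)
labels = torch.randint(0, 2, (8, n_labels)).float()   # multilabel targets

logits = model(text_feats, img_feats)
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)  # multilabel loss
preds = (torch.sigmoid(logits) > 0.5).int()    # a post can advertise several drugs at once
print(loss.item(), preds.shape)
```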

【2】 Segmentation of Lungs COVID Infected Regions by Attention Mechanism and Synthetic Data
Link: https://arxiv.org/abs/2108.08895

Authors: Parham Yazdekhasty, Ali Zindari, Zahra Nabizadeh-ShahreBabak, Pejman Khadivi, Nader Karimi, Shadrokh Samavi
Affiliations: Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran; Computer Science Department, Seattle University, Seattle, USA; Department of Electrical and Computer Engineering, McMaster University, Hamilton, Canada
Note: 8 pages, 5 figures
Abstract: The coronavirus has caused hundreds of thousands of deaths. Fatalities could decrease if every patient received suitable treatment from the healthcare system. Machine learning, especially computer vision methods based on deep learning, can help healthcare professionals diagnose and treat COVID-19 infected cases more efficiently. Hence, infected patients can get better service from the healthcare system, decreasing the number of deaths caused by the coronavirus. This research proposes a method for segmenting infected lung regions in CT images. For this purpose, a convolutional neural network with an attention mechanism is used to detect infected areas with complex patterns. Attention blocks improve segmentation accuracy by focusing on informative parts of the image. Furthermore, a generative adversarial network generates synthetic images for data augmentation and expansion of the small available datasets. Experimental results show the superiority of the proposed method compared to some existing procedures.
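
The attention blocks re-weight feature maps so the network focuses on informative (infected) regions. A minimal additive attention gate of the kind commonly paired with U-Net-style segmentation networks is sketched below; the channel sizes are arbitrary, and the paper's exact block design may differ.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Additive attention gate: a gating signal from a deeper layer decides which
    spatial positions of the skip-connection features to pass through."""
    def __init__(self, f_skip, f_gate, f_inter):
        super().__init__()
        self.w_skip = nn.Conv2d(f_skip, f_inter, kernel_size=1)
        self.w_gate = nn.Conv2d(f_gate, f_inter, kernel_size=1)
        self.psi = nn.Conv2d(f_inter, 1, kernel_size=1)
    def forward(self, skip, gate):
        attn = torch.sigmoid(self.psi(torch.relu(self.w_skip(skip) + self.w_gate(gate))))
        return skip * attn                     # (B, f_skip, H, W), re-weighted spatially

skip = torch.randn(2, 64, 32, 32)              # encoder features from a CT slice
gate = torch.randn(2, 128, 32, 32)             # upsampled decoder gating signal
gated = AttentionGate(64, 128, 32)(skip, gate)
print(gated.shape)                             # torch.Size([2, 64, 32, 32])
```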

Recommendation (3 papers)

【1】 A Recommender System for Scientific Datasets and Analysis Pipelines
Link: https://arxiv.org/abs/2108.09275

Authors: Mandana Mazaheri, Gregory Kiar, Tristan Glatard
Affiliations: Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada; Center for the Developing Brain, Child Mind Institute, New York, NY, USA
Abstract: Scientific datasets and analysis pipelines are increasingly being shared publicly in the interest of open science. However, mechanisms are lacking to reliably identify which pipelines and datasets can appropriately be used together. Given the increasing number of high-quality public datasets and pipelines, this lack of clear compatibility threatens the findability and reusability of these resources. We investigate the feasibility of a collaborative filtering system that recommends pipelines and datasets based on provenance records from previous executions. We evaluate our system using datasets and pipelines extracted from the Canadian Open Neuroscience Platform, a national initiative for open neuroscience. The recommendations provided by our system (AUC $=0.83$) are significantly better than chance and outperform recommendations made by domain experts using their previous knowledge as well as pipeline and dataset descriptions (AUC $=0.63$). In particular, domain experts often neglect low-level technical aspects of a pipeline-dataset interaction, such as the level of pre-processing, which are captured by a provenance-based system. We conclude that provenance-based pipeline and dataset recommenders are feasible and beneficial to the sharing and usage of open-science resources. Future work will focus on the collection of more comprehensive provenance traces and on deploying the system in production.
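
Collaborative filtering over provenance records reduces to factorizing a pipelines-by-datasets matrix of past successful executions and ranking unseen pairs. A small gradient-descent matrix factorization sketch in numpy follows; the matrix, sizes, and hyperparameters are invented for illustration, and the paper's system may use a different collaborative filtering variant.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pipelines, n_datasets, k = 8, 12, 4

# Provenance matrix: R[i, j] = 1 if pipeline i ran successfully on dataset j before.
R = (rng.random((n_pipelines, n_datasets)) > 0.8).astype(float)

P = 0.1 * rng.standard_normal((n_pipelines, k))   # latent pipeline factors
D = 0.1 * rng.standard_normal((n_datasets, k))    # latent dataset factors

lr, reg = 0.05, 0.01
for _ in range(1000):                             # gradient descent on regularized squared error
    E = R - P @ D.T
    P += lr * (E @ D - reg * P)
    D += lr * (E.T @ P - reg * D)

# Recommend compatible pipelines for a given dataset, skipping known executions.
scores = P @ D.T
dataset_j = 3
ranked = np.argsort(-scores[:, dataset_j])
print([int(i) for i in ranked if R[i, dataset_j] == 0][:3])   # top-3 unseen pipelines
```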

【2】 PASTO: Strategic Parameter Optimization in Recommendation Systems - Probabilistic is Better than Deterministic
Link: https://arxiv.org/abs/2108.09076

Authors: Weicong Ding, Hanlin Tang, Jingshuo Feng, Lei Yuan, Sen Yang, Guangxu Yang, Jie Zheng, Jing Wang, Qiang Su, Dong Zheng, Xuezhong Qiu, Yongqi Liu, Yuxuan Chen, Yang Liu, Chao Song, Dongying Kong, Kai Ren, Peng Jiang, Qiao Lian, Ji Liu
Affiliations: Kuaishou Technology; University of Rochester; University of Washington
Abstract: Real-world recommendation systems often consist of two phases. In the first phase, multiple predictive models produce the probability of different immediate user actions. In the second phase, these predictions are aggregated according to a set of 'strategic parameters' to meet a diverse set of business goals, such as longer user engagement, higher revenue potential, or more community/network interactions. In addition to building accurate predictive models, it is also crucial to optimize this set of 'strategic parameters' so that primary goals are optimized while secondary guardrails are not hurt. In this setting with multiple and constrained goals, this paper discovers that a probabilistic strategic parameter regime can achieve better value than the standard regime of finding a single deterministic parameter. The new probabilistic regime learns the best distribution over strategic parameter choices and samples one strategic parameter from the distribution when each user visits the platform. To pursue the optimal probabilistic solution, we formulate the problem as a stochastic compositional optimization problem, in which the unbiased stochastic gradient is unavailable. Our approach is applied in a popular social network platform with hundreds of millions of daily users and achieves a +0.22% lift of user engagement in a recommendation task and a +1.7% lift in revenue in an advertising optimization scenario compared to using the best deterministic parameter strategy.
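
The probabilistic regime, learn a distribution over candidate strategic-parameter settings and sample one per user visit, can be caricatured with a score-function (REINFORCE) update over a softmax distribution. This is a deliberately simplified stand-in; the paper instead solves a constrained stochastic compositional optimization problem, and the candidates and reward below are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate strategic-parameter settings (e.g., weights trading engagement vs. revenue).
candidates = np.array([[0.2, 0.8], [0.5, 0.5], [0.8, 0.2]])
logits = np.zeros(len(candidates))               # parameters of the sampling distribution

def reward(params, user):
    """Stand-in for the (noisy) business value observed after serving one user."""
    engagement, revenue = user ** 2, np.sqrt(1 - user)
    return params[0] * engagement + params[1] * revenue + 0.1 * rng.standard_normal()

lr = 0.1
for _ in range(5000):                            # one strategic parameter drawn per user visit
    probs = np.exp(logits - logits.max()); probs /= probs.sum()
    i = rng.choice(len(candidates), p=probs)
    user = rng.random()                          # a user context arriving at the platform
    r = reward(candidates[i], user)
    grad = -probs; grad[i] += 1.0                # score-function (REINFORCE) gradient
    logits += lr * r * grad

print(np.exp(logits) / np.exp(logits).sum())     # learned distribution over settings
```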

【3】 Personalized next-best action recommendation with multi-party interaction learning for automated decision-making
Link: https://arxiv.org/abs/2108.08846

Authors: Longbing Cao, Chengzhang Zhu
Affiliations: Data Science Lab, University of Technology Sydney, Australia (these authors contributed equally to this work)
Note: 17 pages, 7 figures, 4 tables
Abstract: Automated next-best action recommendation for each customer in a sequential, dynamic, and interactive context is widely needed in natural, social, and business decision-making. Personalized next-best action recommendation must involve past, current, and future customer demographics and circumstances (states) and behaviors, long-range sequential interactions between customers and decision-makers, multi-sequence interactions between states, behaviors, and actions, and their reactions to each other's actions. No existing modeling theories and tools, including Markovian decision processes, user and behavior modeling, deep sequential modeling, and personalized sequential recommendation, can quantify such complex decision-making on a personal level. We take a data-driven approach to learn next-best actions for personalized decision-making by a reinforced coupled recurrent neural network (CRN). CRN represents multiple coupled dynamic sequences of a customer's historical and current states, responses to decision-makers' actions, and decision rewards for actions, and learns long-term multi-sequence interactions between the parties (customer and decision-maker). Next-best actions are then recommended for each customer at a time point to change their state toward an optimal decision-making objective. Our study demonstrates the potential of personalized deep learning of multi-sequence interactions and automated dynamic intervention for personalized decision-making in complex systems.

Clustering (1 paper)

【1】 Lessons from the Clustering Analysis of a Search Space: A Centroid-based Approach to Initializing NAS 标题:搜索空间聚类分析的教训:基于质心的NAS初始化方法 链接:https://arxiv.org/abs/2108.09126

Authors: Kalifou Rene Traore, Andrés Camero, Xiao Xiang Zhu
Affiliations: German Aerospace Center (DLR), Remote Sensing Technology Institute (IMF), Germany; Technical University of Munich, Data Science in Earth Observation, Germany
Note: Accepted to the Workshop on 'Data Science Meets Optimisation' at IJCAI 2021
Abstract: Lots of effort in neural architecture search (NAS) research has been dedicated to algorithmic development, aiming at designing more efficient and less costly methods. Nonetheless, the investigation of the initialization of these techniques remains scarce, and currently most NAS methodologies rely on stochastic initialization procedures, because acquiring information prior to search is costly. However, the recent availability of NAS benchmarks has enabled prototyping with low computational resources. In this study, we propose to accelerate a NAS algorithm using a data-driven initialization technique, leveraging the availability of NAS benchmarks. Particularly, we propose a two-step methodology. First, a calibrated clustering analysis of the search space is performed. Second, the centroids are extracted and used to initialize a NAS algorithm. We tested our proposal using Aging Evolution, an evolutionary algorithm, on NAS-Bench-101. The results show that, compared to a random initialization, faster convergence and a better final solution are achieved.
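A minimal sketch of the two-step recipe, assuming architectures can be encoded as fixed-length vectors (the random binary encodings below stand in for NAS-Bench-101 cells): cluster a sample of the search space, then seed the initial population of, e.g., Aging Evolution with the architectures nearest to the centroids:

```python
import numpy as np
from sklearn.cluster import KMeans

def centroid_init(encodings, population_size, seed=0):
    """Step 1: cluster the sampled search space. Step 2: snap each centroid
    to its nearest real architecture and use these as the initial population."""
    km = KMeans(n_clusters=population_size, n_init=10, random_state=seed).fit(encodings)
    init = []
    for c in km.cluster_centers_:
        nearest = np.argmin(np.linalg.norm(encodings - c, axis=1))
        init.append(encodings[nearest])   # a valid architecture, not an abstract centroid
    return np.stack(init)

# Hypothetical stand-in for NAS-Bench-101 architecture encodings.
encodings = np.random.default_rng(0).integers(0, 2, size=(5000, 26)).astype(float)
print(centroid_init(encodings, population_size=10).shape)   # (10, 26)
```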

Super-Resolution | Denoising | Deblurring | Dehazing (1 paper)

【1】 Achieving on-Mobile Real-Time Super-Resolution with Neural Architecture and Pruning Search (Link: https://arxiv.org/abs/2108.08910)

Authors: Zheng Zhan, Yifan Gong, Pu Zhao, Geng Yuan, Wei Niu, Yushu Wu, Tianyun Zhang, Malith Jayaweera, David Kaeli, Bin Ren, Xue Lin, Yanzhi Wang
Affiliations: Northeastern University; College of William & Mary; Cleveland State University
Abstract: Though recent years have witnessed remarkable progress in single image super-resolution (SISR) tasks with the prosperous development of deep neural networks (DNNs), deep learning methods are confronted with computation and memory consumption issues in practice, especially for resource-limited platforms such as mobile devices. To overcome this challenge and facilitate the real-time deployment of SISR tasks on mobile, we combine neural architecture search with pruning search and propose an automatic search framework that derives sparse super-resolution (SR) models with high image quality while satisfying the real-time inference requirement. To decrease the search cost, we leverage the weight sharing strategy by introducing a supernet and decouple the search problem into three stages, including supernet construction, compiler-aware architecture and pruning search, and compiler-aware pruning ratio search. With the proposed framework, we are the first to achieve real-time SR inference (with only tens of milliseconds per frame) for implementing 720p resolution with competitive image quality (in terms of PSNR and SSIM) on mobile platforms (Samsung Galaxy S20).

Autonomous Driving | Vehicles | Lane Detection, etc. (1 paper)

【1】 DL-Traff: Survey and Benchmark of Deep Learning Models for Urban Traffic Prediction (Link: https://arxiv.org/abs/2108.09091)

Authors: Renhe Jiang, Du Yin, Zhaonan Wang, Yizhuo Wang, Jiewen Deng, Hangchen Liu, Zekun Cai, Jinliang Deng, Xuan Song, Ryosuke Shibasaki
Affiliations: The University of Tokyo, Japan; Southern University of Science and Technology, China; University of Technology Sydney, Australia
Note: This paper has been accepted by the CIKM 2021 Resource Track
Abstract: Nowadays, with the rapid development of IoT (Internet of Things) and CPS (Cyber-Physical Systems) technologies, big spatiotemporal data are being generated from mobile phones, car navigation systems, and traffic sensors. By leveraging state-of-the-art deep learning technologies on such data, urban traffic prediction has drawn a lot of attention in the AI and Intelligent Transportation System communities. The problem can be uniformly modeled with a 3D tensor (T, N, C), where T denotes the total time steps, N denotes the size of the spatial domain (i.e., mesh-grids or graph-nodes), and C denotes the channels of information. According to the specific modeling strategy, the state-of-the-art deep learning models can be divided into three categories: grid-based, graph-based, and multivariate time-series models. In this study, we first synthetically review the deep traffic models as well as the widely used datasets, then build a standard benchmark to comprehensively evaluate their performances with the same settings and metrics. Our study, named DL-Traff, is implemented with the two most popular deep learning frameworks, i.e., TensorFlow and PyTorch, and is already publicly available as two GitHub repositories: https://github.com/deepkashiwa20/DL-Traff-Grid and https://github.com/deepkashiwa20/DL-Traff-Graph. With DL-Traff, we hope to deliver a useful resource to researchers who are interested in spatiotemporal data analysis.

Federated Learning | Privacy Protection | Encryption (3 papers)

【1】 Accelerating Federated Learning with a Global Biased Optimiser (Link: https://arxiv.org/abs/2108.09134)

Authors: Jed Mills, Jia Hu, Geyong Min, Rui Jin, Siwei Zheng, Jin Wang
Affiliations: Department of Computer Science, University of Exeter, UK
Abstract: Federated Learning (FL) is a recent development in the field of machine learning that collaboratively trains models without the training data leaving client devices, in order to preserve data privacy. In realistic settings, the total training set is distributed over clients in a highly non-Independent and Identically Distributed (non-IID) fashion, which has been shown extensively to harm FL convergence speed and final model performance. We propose a novel, generalised approach for applying adaptive optimisation techniques to FL with the Federated Global Biased Optimiser (FedGBO) algorithm. FedGBO accelerates FL by applying a set of global biased optimiser values during the local training phase of FL, which helps to reduce 'client-drift' from non-IID data, whilst also benefiting from adaptive momentum/learning-rate methods. We show that the FedGBO update with a generic optimiser can be viewed as a centralised update with biased gradients and optimiser update, and use this theoretical framework to prove the convergence of FedGBO using momentum-Stochastic Gradient Descent. We also perform extensive experiments using 4 realistic benchmark FL datasets and 3 popular adaptive optimisers to compare the performance of different adaptive-FL approaches, demonstrating that FedGBO has highly competitive performance considering its low communication and computation costs, and providing highly practical insights for the use of adaptive optimisation in FL.
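A rough sketch of the idea under a plain SGD-with-momentum instantiation (an assumption, not the paper's exact update rule): every client runs its local steps with the same frozen copy of the global momentum buffer, which damps client-drift, and the server then averages both the models and the freshly accumulated momenta:

```python
import copy
import torch

def fedgbo_round(global_model, global_momentum, client_losses, lr=0.1, beta=0.9, local_steps=5):
    """One round of FedGBO-style training; client_losses[i](model) is assumed
    to return client i's local loss on a fresh mini-batch."""
    states, momenta = [], []
    for loss_fn in client_losses:
        model = copy.deepcopy(global_model)
        m = {k: v.clone() for k, v in global_momentum.items()}   # frozen global optimiser values
        for _ in range(local_steps):
            grads = torch.autograd.grad(loss_fn(model), list(model.parameters()))
            for (name, p), g in zip(model.named_parameters(), grads):
                m[name] = beta * m[name] + g
                p.data -= lr * m[name]
        states.append(model.state_dict())
        momenta.append(m)
    # Server: average client models and momenta to form the next global state.
    avg = lambda dicts: {k: torch.stack([d[k] for d in dicts]).mean(0) for k in dicts[0]}
    global_model.load_state_dict(avg(states))
    return avg(momenta)   # becomes global_momentum for the next round
```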

【2】 FedSkel: Efficient Federated Learning on Heterogeneous Systems with Skeleton Gradients Update (Link: https://arxiv.org/abs/2108.09081)

Authors: Junyu Luo, Jianlei Yang, Xucheng Ye, Xin Guo, Weisheng Zhao
Affiliations: SCSE, Beihang University; Heterogeneous Computing Center, Kuaishou Technology; SME, Beihang University
Note: CIKM 2021
Abstract: Federated learning aims to protect users' privacy while performing data analysis from different participants. However, it is challenging to guarantee training efficiency on heterogeneous systems due to varying computational capabilities and communication bottlenecks. In this work, we propose FedSkel to enable computation-efficient and communication-efficient federated learning on edge devices by only updating the model's essential parts, named skeleton networks. FedSkel is evaluated on real edge devices with imbalanced datasets. Experimental results show that it can achieve up to 5.52× speedups for CONV layers' back-propagation, 1.82× speedups for the whole training process, and reduce communication cost by 64.8%, with negligible accuracy loss.

【3】 Cross-Silo Federated Learning for Multi-Tier Networks with Vertical and Horizontal Data Partitioning (Link: https://arxiv.org/abs/2108.08930)

Authors: Anirban Das, Shiqiang Wang, Stacy Patterson
Affiliations: Rensselaer Polytechnic Institute
Note: 25 pages, 11 figures, under review. arXiv admin note: text overlap with arXiv:2102.03620
Abstract: We consider federated learning in tiered communication networks. Our network model consists of a set of silos, each holding a vertical partition of the data. Each silo contains a hub and a set of clients, with the silo's vertical data shard partitioned horizontally across its clients. We propose Tiered Decentralized Coordinate Descent (TDCD), a communication-efficient decentralized training algorithm for such two-tiered networks. To reduce communication overhead, the clients in each silo perform multiple local gradient steps before sharing updates with their hub. Each hub adjusts its coordinates by averaging its workers' updates, and then hubs exchange intermediate updates with one another. We present a theoretical analysis of our algorithm and show the dependence of the convergence rate on the number of vertical partitions, the number of local updates, and the number of clients in each hub. We further validate our approach empirically via simulation-based experiments using a variety of datasets and objectives.
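A toy instantiation on linear least squares makes the two-tier pattern concrete; the blocking scheme and step sizes below are illustrative assumptions rather than the paper's full algorithm:

```python
import numpy as np

def tdcd_linear(X_blocks, y, silo_clients, lr=0.1, rounds=50, local_steps=5):
    """Sketch of TDCD on least squares. Silo k owns coordinate block theta[k]
    (vertical shard X_blocks[k]); rows within a silo are split across clients
    (horizontal shards, silo_clients[k] = list of row-index arrays)."""
    theta = [np.zeros(Xk.shape[1]) for Xk in X_blocks]
    partial = [Xk @ th for Xk, th in zip(X_blocks, theta)]        # exchanged between hubs
    for _ in range(rounds):
        for k in range(len(X_blocks)):
            residual_target = y - (sum(partial) - partial[k])     # stale info from other silos
            updates = []
            for rows in silo_clients[k]:                          # each client: local steps
                th = theta[k].copy()
                Xc, yc = X_blocks[k][rows], residual_target[rows]
                for _ in range(local_steps):
                    th -= lr * Xc.T @ (Xc @ th - yc) / len(rows)
                updates.append(th)
            theta[k] = np.mean(updates, axis=0)                   # hub averages its clients
        partial = [Xk @ th for Xk, th in zip(X_blocks, theta)]    # hubs exchange updates
    return theta
```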

Inference | Analysis | Understanding | Explanation (1 paper)

【1】 Towards Understanding the Generative Capability of Adversarially Robust Classifiers (Link: https://arxiv.org/abs/2108.09093)

Authors: Yao Zhu, Jiacheng Ma, Jiacheng Sun, Zewei Chen, Rongxin Jiang, Zhenguo Li
Affiliations: Zhejiang University; Huawei Noah's Ark Lab
Note: Accepted by ICCV 2021, Oral
Abstract: Recently, some works found an interesting phenomenon that adversarially robust classifiers can generate good images comparable to generative models. We investigate this phenomenon from an energy perspective and provide a novel explanation. We reformulate adversarial example generation, adversarial training, and image generation in terms of an energy function. We find that adversarial training contributes to obtaining an energy function that is flat and has low energy around the real data, which is the key to generative capability. Based on our new understanding, we further propose a better adversarial training method, Joint Energy Adversarial Training (JEAT), which can generate high-quality images and achieve new state-of-the-art robustness under a wide range of attacks. The Inception Score of the images (CIFAR-10) generated by JEAT is 8.80, much better than that of original robust classifiers (7.50). In particular, we achieve new state-of-the-art robustness on CIFAR-10 (from 57.20% to 62.04%) and CIFAR-100 (from 30.03% to 30.18%) without extra training data.
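The energy view can be illustrated with a JEM-style sketch: treat E(x) = -logsumexp_y f(x)[y] as the classifier's energy and generate images by noisy gradient descent on it. This is a rough illustration of the mechanism behind the generative capability, not the JEAT training recipe itself:

```python
import torch

def energy(logits):
    # E(x) = -logsumexp_y f(x)[y]; low energy corresponds to "real-looking" inputs.
    return -torch.logsumexp(logits, dim=1)

def generate_by_energy_descent(model, x_init, steps=40, step_size=1.0, noise=0.01):
    """Move inputs toward the low-energy region learned by a robust classifier."""
    x = x_init.clone().requires_grad_(True)
    for _ in range(steps):
        e = energy(model(x)).sum()
        g, = torch.autograd.grad(e, x)
        with torch.no_grad():
            x -= step_size * g                  # descend the energy surface
            x += noise * torch.randn_like(x)    # Langevin-style exploration noise
            x.clamp_(0.0, 1.0)                  # keep a valid image range
    return x.detach()
```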

Detection (1 paper)

【1】 CloudShield: Real-time Anomaly Detection in the Cloud (Link: https://arxiv.org/abs/2108.08977)

Authors: Zecheng He, Ruby B. Lee
Affiliations: Princeton University
Abstract: In cloud computing, it is desirable if suspicious activities can be detected by automatic anomaly detection systems. Although anomaly detection has been investigated in the past, it remains unsolved in cloud computing. The challenges are: characterizing the normal behavior of a cloud server, distinguishing between benign and malicious anomalies (attacks), and preventing alert fatigue due to false alarms. We propose CloudShield, a practical and generalizable real-time anomaly and attack detection system for cloud computing. CloudShield uses a general, pretrained deep learning model with different cloud workloads, to predict the normal behavior and provide real-time and continuous detection by examining the model reconstruction error distributions. Once an anomaly is detected, to reduce alert fatigue, CloudShield automatically distinguishes between benign programs, known attacks, and zero-day attacks, by examining the prediction error distributions. We evaluate the proposed CloudShield on representative cloud benchmarks. Our evaluation shows that CloudShield, using model pretraining, can apply to a wide scope of cloud workloads. Especially, we observe that CloudShield can detect the recently proposed speculative execution attacks, e.g., Spectre and Meltdown attacks, in milliseconds. Furthermore, we show that CloudShield accurately differentiates and prioritizes known attacks and potential zero-day attacks from benign programs. Thus, it significantly reduces false alarms by up to 99.0%.
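The detection core, profiling the prediction-error distribution on benign traces and flagging windows whose error lands in the tail, can be sketched as follows. The identity-plus-noise stand-in for a pretrained model and the 99th-percentile threshold are assumptions for illustration:

```python
import numpy as np

def fit_error_profile(predict, X_normal, q=99.0):
    """Calibrate a tail threshold on the reconstruction errors of normal behaviour."""
    errors = np.linalg.norm(predict(X_normal) - X_normal, axis=1)
    return np.percentile(errors, q)

def detect(predict, X, threshold):
    errors = np.linalg.norm(predict(X) - X, axis=1)
    return errors > threshold                      # True = anomalous window

# Toy stand-in for a pretrained deep predictor of normal cloud behaviour.
predict = lambda X: X + 0.01 * np.random.default_rng(0).standard_normal(X.shape)
X_normal = np.random.default_rng(1).standard_normal((1000, 16))
tau = fit_error_profile(predict, X_normal)
print(detect(predict, X_normal[:5] + 5.0, tau))    # shifted windows are flagged
```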

Classification | Recognition (1 paper)

【1】 Comparing concepts of quantum and classical neural network models for image classification task (Link: https://arxiv.org/abs/2108.08875)

Authors: Sebastian Porebski, Rafal Potempa
Affiliations: Department of Cybernetics, Nanotechnology and Data Processing, Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
Abstract: While quantum architectures are still under development, when available they will only be able to process quantum data, whereas machine learning algorithms can only process numerical data. Therefore, in classification or regression problems, it is necessary to simulate and study quantum systems that transfer the numerical input data to a quantum form and enable quantum computers to use the available methods of machine learning. This material includes the results of experiments on the training and performance of a hybrid quantum-classical neural network developed for the problem of classifying handwritten digits from the MNIST data set. The comparative results of two models, classical and quantum neural networks with a similar number of training parameters, indicate that the quantum network, although its simulation is time-consuming, outperforms the classical network (it has better convergence and achieves higher training and testing accuracy).

Representation Learning (2 papers)

【1】 Contrastive Representations for Label Noise Require Fine-Tuning (Link: https://arxiv.org/abs/2108.09154)

Authors: Pierre Nodet, Vincent Lemaire, Alexis Bondu, Antoine Cornuéjols
Affiliations: Orange Labs, Paris & Lannion, France; AgroParisTech, Paris, France
Abstract: In this paper we show that the combination of a Contrastive representation with a label noise-robust classification head requires fine-tuning the representation in order to achieve state-of-the-art performance. Since fine-tuned representations are shown to outperform frozen ones, one can conclude that noise-robust classification heads are indeed able to promote meaningful representations if provided with a suitable starting point. Experiments are conducted to draw a comprehensive picture of performance by featuring six methods and nine noise instances of three different kinds (none, symmetric, and asymmetric). In the presence of noise, the experiments show that fine-tuning of the Contrastive representation allows the six methods to achieve better results than end-to-end learning and represents a new reference compared to the recent state of the art. Results are also remarkably stable across noise levels.

【2】 Augmenting Implicit Neural Shape Representations with Explicit Deformation Fields (Link: https://arxiv.org/abs/2108.08931)

Authors: Matan Atzmon, David Novotny, Andrea Vedaldi, Yaron Lipman
Affiliations: Weizmann Institute of Science; Facebook AI Research
Abstract: Implicit neural representation is a recent approach to learning shape collections as zero level-sets of neural networks, where each shape is represented by a latent code. So far, the focus has been shape reconstruction, while shape generalization was mostly left to generic encoder-decoder or auto-decoder regularization. In this paper we advocate deformation-aware regularization for implicit neural representations, aiming at producing plausible deformations as the latent code changes. The challenge is that implicit representations do not capture correspondences between different shapes, which makes it difficult to represent and regularize their deformations. Thus, we propose to pair the implicit representation of the shapes with an explicit, piecewise linear deformation field, learned as an auxiliary function. We demonstrate that, by regularizing these deformation fields, we can encourage the implicit neural representation to induce natural deformations in the learned shape space, such as as-rigid-as-possible deformations.

Optimization | Convergence (2 papers)

【1】 Federated Distributionally Robust Optimization for Phase Configuration of RISs (Link: https://arxiv.org/abs/2108.09026)

Authors: Chaouki Ben Issaid, Sumudu Samarakoon, Mehdi Bennis, H. Vincent Poor
Affiliations: Centre for Wireless Communications (CWC), University of Oulu, Finland; Electrical Engineering Department, Princeton University, Princeton, USA
Note: 6 pages, 2 figures
Abstract: In this article, we study the problem of robust reconfigurable intelligent surface (RIS)-aided downlink communication over heterogeneous RIS types in the supervised learning setting. By modeling downlink communication over heterogeneous RIS designs as different workers that learn how to optimize phase configurations in a distributed manner, we solve this distributed learning problem using a distributionally robust formulation in a communication-efficient manner, while establishing its rate of convergence. By doing so, we ensure that the global model performance of the worst-case worker is close to the performance of other workers. Simulation results show that our proposed algorithm requires fewer communication rounds (about 50% fewer) to achieve the same worst-case distribution test accuracy compared to competitive baselines.

【2】 Optimal Order Simple Regret for Gaussian Process Bandits (Link: https://arxiv.org/abs/2108.09262)

Authors: Sattar Vakili, Nacime Bouziani, Sepehr Jalali, Alberto Bernacchia, Da-shan Shiu
Affiliations: MediaTek Research; Imperial College London
Abstract: Consider the sequential optimization of a continuous, possibly non-convex, and expensive to evaluate objective function $f$. The problem can be cast as a Gaussian Process (GP) bandit where $f$ lives in a reproducing kernel Hilbert space (RKHS). State-of-the-art analyses of several learning algorithms show a significant gap between the lower and upper bounds on the simple regret performance. When $N$ is the number of exploration trials and $\gamma_N$ is the maximal information gain, we prove an $\tilde{\mathcal{O}}(\sqrt{\gamma_N/N})$ bound on the simple regret performance of a pure exploration algorithm that is significantly tighter than the existing bounds. We show that this bound is order optimal up to logarithmic factors for the cases where a lower bound on regret is known. To establish these results, we prove novel and sharp confidence intervals for GP models applicable to RKHS elements, which may be of broader interest.
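For intuition, a maximum-variance pure-exploration loop of the kind analyzed here might look like the sketch below; the sklearn GP surrogate, the 1-D gridded domain, and the test function are illustrative assumptions:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def gp_pure_exploration(f, domain, n_trials=30, seed=0):
    """Query where the posterior is most uncertain; after N trials recommend the
    posterior-mean maximizer x_hat. Simple regret is f(x*) - f(x_hat)."""
    rng = np.random.default_rng(seed)
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-4)
    X = [domain[rng.integers(len(domain))]]
    y = [f(X[0])]
    for _ in range(n_trials - 1):
        gp.fit(np.reshape(X, (-1, 1)), y)
        _, sd = gp.predict(domain.reshape(-1, 1), return_std=True)
        x_next = domain[int(np.argmax(sd))]        # most uncertain point
        X.append(x_next); y.append(f(x_next))
    gp.fit(np.reshape(X, (-1, 1)), y)
    mu = gp.predict(domain.reshape(-1, 1))
    return domain[int(np.argmax(mu))]              # recommendation x_hat

f = lambda x: np.sin(3 * x) * np.exp(-x)
domain = np.linspace(0.0, 2.0, 200)
print(gp_pure_exploration(f, domain))
```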

Prediction | Estimation (2 papers)

【1】 Efficient Online Estimation of Causal Effects by Deciding What to Observe (Link: https://arxiv.org/abs/2108.09265)

Authors: Shantanu Gupta, Zachary C. Lipton, David Childers
Affiliations: Carnegie Mellon University
Abstract: Researchers often face data fusion problems, where multiple data sources are available, each capturing a distinct subset of variables. While problem formulations typically take the data as given, in practice, data acquisition can be an ongoing process. In this paper, we aim to estimate any functional of a probabilistic model (e.g., a causal effect) as efficiently as possible, by deciding, at each time, which data source to query. We propose online moment selection (OMS), a framework in which structural assumptions are encoded as moment conditions. The optimal action at each step depends, in part, on the very moments that identify the functional of interest. Our algorithms balance exploration with choosing the best action as suggested by current estimates of the moments. We propose two selection strategies: (1) explore-then-commit (OMS-ETC) and (2) explore-then-greedy (OMS-ETG), proving that both achieve zero asymptotic regret as assessed by MSE. We instantiate our setup for average treatment effect estimation, where structural assumptions are given by a causal graph and data sources may include subsets of mediators, confounders, and instrumental variables.
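A heavily simplified stand-in for the explore-then-commit strategy: several unbiased sources for the same moment, a uniform exploration phase, then committing the rest of the budget to the empirically lowest-variance source. The sources and budget split below are assumptions, not the paper's full OMS machinery:

```python
import numpy as np

def oms_etc(sources, horizon, explore_per_source=50, seed=0):
    """Explore each source uniformly, then commit to the one whose samples give
    the lowest-variance estimate of the target functional."""
    rng = np.random.default_rng(seed)
    samples = [[draw(rng) for _ in range(explore_per_source)] for draw in sources]
    best = int(np.argmin([np.var(s, ddof=1) for s in samples]))
    for _ in range(horizon - explore_per_source * len(sources)):
        samples[best].append(sources[best](rng))
    return best, float(np.mean(samples[best]))

# Two hypothetical unbiased sources for the same functional, different noise.
sources = [lambda r: 1.0 + r.normal(0.0, 2.0),
           lambda r: 1.0 + r.normal(0.0, 0.5)]
print(oms_etc(sources, horizon=1000))   # commits to the low-noise source
```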

【2】 Estimation of Convex Polytopes for Automatic Discovery of Charge State Transitions in Quantum Dot Arrays (Link: https://arxiv.org/abs/2108.09133)

Authors: Oswin Krause, Torbjørn Rasmussen, Bertram Brovang, Anasua Chatterjee, Ferdinand Kuemmeth
Affiliations: Department of Computer Science, University of Copenhagen, Universitetsparken, Copenhagen, Denmark; Niels Bohr Institute
Note: Submitted to the Journal of Machine Learning Research (JMLR)
Abstract: In spin-based quantum dot arrays, a leading technology for quantum computation applications, material or fabrication imprecisions affect the behaviour of the device, which is compensated via tuning parameters. Automatic tuning of these device parameters constitutes a formidable challenge for machine learning. Here, we present the first practical algorithm for controlling the transition of electrons in a spin qubit array. We exploit a connection to computational geometry and phrase the task as estimating a convex polytope from measurements. Our proposed algorithm uses active learning to find the count, shapes and sizes of all facets of a given polytope. We test our algorithm on artificial polytopes as well as a real 2x2 spin qubit array. Our results show that we can reliably find the facets of the polytope, including small facets with sizes on the order of the measurement precision. We discuss the implications of the NP-hardness of the underlying estimation problem and outline design considerations, limitations and tuning strategies for controlling future large-scale spin qubit devices.

Other Neural Networks | Deep Learning | Models | Modeling (11 papers)

【1】 Quantization Backdoors to Deep Learning Models (Link: https://arxiv.org/abs/2108.09187)

Authors: Hua Ma, Huming Qiu, Yansong Gao, Zhi Zhang, Alsharif Abuadbba, Anmin Fu, Said Al-Sarawi, Derek Abbott
Affiliations: The University of Adelaide, Australia; Nanjing University of Science and Technology, China; Data, CSIRO, Australia (equal contribution noted for the first two authors)
Abstract: There is currently a burgeoning demand for deploying deep learning (DL) models on ubiquitous edge Internet of Things devices, owing to their low latency and strong privacy preservation. However, DL models are often large in size and require large-scale computation, which prevents them from being placed directly onto IoT devices where resources are constrained and 32-bit floating-point operations are unavailable. Model quantization is a pragmatic solution, which enables DL deployment on mobile devices and embedded systems by effortlessly post-quantizing a large high-precision model into a small low-precision model while retaining the model inference accuracy. This work reveals that the standard quantization operation can be abused to activate a backdoor. We demonstrate that a full-precision backdoored model that does not have any backdoor effect in the presence of a trigger (as the backdoor is dormant) can be activated by the default TensorFlow-Lite quantization, the only product-ready quantization framework to date. We ascertain that all trained float-32 backdoored models exhibit no backdoor effect even in the presence of trigger inputs. State-of-the-art frontend detection approaches, such as Neural Cleanse and STRIP, fail to identify the backdoor in the float-32 models. When each of the float-32 models is converted into an int-8 format model through the standard TFLite post-training quantization, the backdoor is activated in the quantized model, which shows a stable attack success rate close to 100% on inputs with the trigger, while behaving normally on non-trigger inputs. This work highlights that a stealthy security threat arises when end users utilize on-device post-training model quantization toolkits, and informs security researchers that DL models need cross-platform re-examination after quantization even if they pass frontend inspections.
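The activation step studied here is the standard TensorFlow Lite post-training full-integer quantization pipeline. A minimal sketch follows; the toy model and random calibration data are placeholder assumptions, and in the attack scenario `model` would be a trained float-32 network carrying a dormant backdoor:

```python
import tensorflow as tf

# Placeholder for a trained float-32 Keras model (possibly backdoored).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_data_gen():
    # Calibration samples; in practice drawn from real inputs.
    for _ in range(10):
        yield [tf.random.uniform((1, 32, 32, 3))]

converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

# The conversion the paper shows can flip a dormant backdoor to active.
tflite_int8_model = converter.convert()
```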

【2】 Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data (Link: https://arxiv.org/abs/2108.09020)

Authors: Zhipeng Cai, Ozan Sener, Vladlen Koltun
Affiliations: Intel Labs
Note: Accepted to ICCV 2021
Abstract: Continual learning is the problem of learning and retaining knowledge through time over multiple tasks and environments. Research has primarily focused on the incremental classification setting, where new tasks/classes are added at discrete time intervals. Such an "offline" setting does not evaluate the ability of agents to learn effectively and efficiently, since an agent can perform multiple learning epochs without any time limitation when a task is added. We argue that "online" continual learning, where data is a single continuous stream without task boundaries, enables evaluating both information retention and online learning efficacy. In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online. Trained models are later evaluated on historical data to assess information retention. We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts. Through a large-scale analysis, we identify critical and previously unobserved phenomena of gradient-based optimization in continual learning, and propose effective strategies for improving gradient-based online continual learning with real data. The source code and dataset are available at: https://github.com/IntelLabs/continuallearning.
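The test-then-train protocol is compact enough to sketch directly; `model`, `optimizer`, `loss_fn`, and the data `stream` are assumed to be supplied by the caller:

```python
import torch

def online_continual_loop(model, optimizer, loss_fn, stream):
    """Each incoming mini-batch is first used for evaluation, then immediately
    trained on, so every example is seen exactly once before training."""
    correct, total = 0, 0
    for x, y in stream:
        model.eval()
        with torch.no_grad():                       # test first
            pred = model(x).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.numel()
        model.train()                               # then train on the same batch
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    return correct / total                          # online learning efficacy
```

Information retention would then be measured separately by re-evaluating the trained model on the historical portion of the stream.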

【3】 Deep Sequence Modeling: Development and Applications in Asset Pricing (Link: https://arxiv.org/abs/2108.08999)

Authors: Lin William Cong, Ke Tang, Jingyuan Wang, Yang Zhang
Affiliations: Johnson Graduate School of Management, Cornell University, Ithaca, NY; Beihang University, Beijing, China
Abstract: We predict asset returns and measure risk premia using a prominent technique from artificial intelligence: deep sequence modeling. Because asset returns often exhibit sequential dependence that may not be effectively captured by conventional time series models, sequence modeling offers a promising path with its data-driven approach and superior performance. In this paper, we first overview the development of deep sequence models, introduce their applications in asset pricing, and discuss their advantages and limitations. We then perform a comparative analysis of these methods using data on U.S. equities. We demonstrate how sequence modeling benefits investors in general through incorporating complex historical path dependence, and that Long Short-Term Memory (LSTM) based models tend to have the best out-of-sample performance.
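A minimal LSTM return predictor of the kind compared in the paper might look as follows; the feature count, window length, and layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ReturnLSTM(nn.Module):
    """Window of lagged characteristics/returns -> next-period return."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):              # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # predict from the last hidden state

model = ReturnLSTM(n_features=8)
x = torch.randn(16, 12, 8)             # 16 assets, 12 monthly lags, 8 characteristics
print(model(x).shape)                  # torch.Size([16, 1])
```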

【4】 A Framework for Neural Topic Modeling of Text Corpora (Link: https://arxiv.org/abs/2108.08946)

Authors: Shayan Fazeli, Majid Sarrafzadeh
Abstract: Topic Modeling refers to the problem of discovering the main topics that have occurred in corpora of textual data, with solutions finding crucial applications in numerous fields. In this work, inspired by the recent advancements in the Natural Language Processing domain, we introduce FAME, an open-source framework enabling an efficient mechanism of extracting and incorporating textual features and utilizing them in discovering topics and clustering text documents that are semantically similar in a corpus. These features range from traditional approaches (e.g., frequency-based) to the most recent auto-encoding embeddings from transformer-based language models such as the BERT model family. To demonstrate the effectiveness of this library, we conducted experiments on the well-known Newsgroups dataset. The library is available online.
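The extract-embed-cluster pipeline can be sketched with a frequency-based feature family, one of those the framework supports; a transformer embedding (e.g., from the BERT family) would slot into the same two steps. The toy corpus below is an assumption:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "the rocket launch was delayed by weather",
    "nasa scheduled a new space mission",
    "the team won the championship game",
    "a late goal decided the football match",
]

X = TfidfVectorizer().fit_transform(docs)                       # frequency-based features
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)   # semantically similar documents share a cluster id
```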

【5】 Statistical Learning to Operationalize a Domain Agnostic Data Quality Scoring (Link: https://arxiv.org/abs/2108.08905)

Authors: Sezal Chug, Priya Kaushal, Ponnurangam Kumaraguru, Tavpritesh Sethi
Affiliations: Department of Computer Science, Indraprastha Institute of Information Technology, New Delhi, India
Note: 20 pages, 8 figures, 1 table
Abstract: Data is expanding at an unimaginable rate, and with this development comes the responsibility for the quality of data. Data Quality refers to the relevance of the information present and helps in various operations like decision making and planning in a particular organization. Mostly, data quality is measured on an ad-hoc basis, and hence none of the developed concepts provide any practical application. The current empirical study was undertaken to formulate a concrete automated data quality platform to assess the quality of incoming datasets and generate a quality label, score, and comprehensive report. We utilize various datasets from healthdata.gov, opendata.nhs and the Demographics and Health Surveys (DHS) Program to observe the variations in the quality score and formulate a label using Principal Component Analysis (PCA). The results of the current empirical study revealed a metric that encompasses nine quality ingredients, namely provenance, dataset characteristics, uniformity, metadata coupling, percentage of missing cells and duplicate rows, skewness of data, the ratio of inconsistencies of categorical columns, and correlation between these attributes. The study also provides an illustrative case study and validation of the metric following Mutation Testing approaches. This research study provides an automated platform which takes an incoming dataset and metadata to provide the DQ score, report, and label. The results of this study would be useful to data scientists, as the value of this quality label would instill confidence before deploying the data for their respective practical applications.

【6】 SIAM: Chiplet-based Scalable In-Memory Acceleration with Mesh for Deep Neural Networks (Link: https://arxiv.org/abs/2108.08903)

Authors: Gokul Krishnan, Sumit K. Mandal, Manvitha Pannala, Chaitali Chakrabarti, Jae-sun Seo, Umit Y. Ogras, Yu Cao
Affiliations: Arizona State University; University of Wisconsin-Madison
Abstract: In-memory computing (IMC) on a monolithic chip for deep learning faces dramatic challenges on area, yield, and on-chip interconnection cost due to ever-increasing model sizes. 2.5D integration or chiplet-based architectures interconnect multiple small chips (i.e., chiplets) to form a large computing system, presenting a feasible solution beyond a monolithic IMC architecture to accelerate large deep learning models. This paper presents a new benchmarking simulator, SIAM, to evaluate the performance of chiplet-based IMC architectures and explore the potential of such a paradigm shift in IMC architecture design. SIAM integrates device, circuit, architecture, network-on-chip (NoC), network-on-package (NoP), and DRAM access models to realize an end-to-end system. SIAM is scalable in its support of a wide range of deep neural networks (DNNs), customizable to various network structures and configurations, and capable of efficient design space exploration. We demonstrate the flexibility, scalability, and simulation speed of SIAM by benchmarking different state-of-the-art DNNs with the CIFAR-10, CIFAR-100, and ImageNet datasets. We further calibrate the simulation results with a published silicon result, SIMBA. The chiplet-based IMC architecture obtained through SIAM shows 130× and 72× improvement in energy-efficiency for ResNet-50 on the ImageNet dataset compared to Nvidia V100 and T4 GPUs.

【7】 Neural TMDlayer: Modeling Instantaneous flow of features via SDE Generators (Link: https://arxiv.org/abs/2108.08891)

Authors: Zihang Meng, Vikas Singh, Sathya N. Ravi
Affiliations: University of Wisconsin-Madison; University of Illinois at Chicago
Abstract: We study how stochastic differential equation (SDE) based ideas can inspire new modifications to existing algorithms for a set of problems in computer vision. Loosely speaking, our formulation is related to both explicit and implicit strategies for data augmentation and group equivariance, but is derived from new results in the SDE literature on estimating infinitesimal generators of a class of stochastic processes. If and when there is nominal agreement between the needs of an application/task and the inherent properties and behavior of the types of processes that we can efficiently handle, we obtain a very simple and efficient plug-in layer that can be incorporated within any existing network architecture, with minimal modification and only a few additional parameters. We show promising experiments on a number of vision tasks including few-shot learning, point cloud transformers and deep variational segmentation, obtaining efficiency or performance improvements.

【8】 Deep Learning-based Spacecraft Relative Navigation Methods: A Survey (Link: https://arxiv.org/abs/2108.08876)

Authors: Jianing Song, Duarte Rondao, Nabil Aouf
Affiliations: City, University of London, United Kingdom
Note: 41 pages; 17 figures; submitted to Acta Astronautica, under review
Abstract: Autonomous spacecraft relative navigation technology has been planned for and applied to many famous space missions. The development of on-board electronics systems has enabled the use of vision-based and LiDAR-based methods to achieve better performances. Meanwhile, deep learning has reached great success in different areas, especially in computer vision, which has also attracted the attention of space researchers. However, spacecraft navigation differs from ground tasks due to its high reliability requirements and the lack of large datasets. This survey aims to systematically investigate the current deep learning-based autonomous spacecraft relative navigation methods, focusing on concrete orbital applications such as spacecraft rendezvous and landing on small bodies or the Moon. The fundamental characteristics, primary motivations, and contributions of deep learning-based relative navigation algorithms are first summarised from the three perspectives of spacecraft rendezvous, asteroid exploration, and terrain navigation. Furthermore, popular visual tracking benchmarks and their respective properties are compared and summarised. Finally, potential applications are discussed, along with expected impediments.

【9】 MOFit: A Framework to reduce Obesity using Machine learning and IoT (Link: https://arxiv.org/abs/2108.08868)

Authors: Satvik Garg, Pradyumn Pundir
Affiliations: Department of Computer Science, Jaypee University of Information Technology, Solan, India
Note: 8 pages. This paper is accepted in the 2021 44th International Convention on Information, Communication and Electronic Technology (MIPRO); the final version will appear in the conference proceedings.
Abstract: Over the past few years, due to advancements in technology, the sedentary lifestyle in urban areas has been at its peak. This results in individuals falling victim to obesity at an early age. Obesity has various health impacts such as diabetes, heart disease, blood pressure problems, and many more. In recent years, machine learning has shown its value across domains such as forecasting, healthcare, medical imaging, and sentiment analysis. In this work, we aim to provide a framework that uses machine learning algorithms, namely Random Forest, Decision Tree, XGBoost, Extra Trees, and KNN, to train models that help predict obesity levels (classification) and body weight and fat percentage levels (regression) using various parameters. We also applied and compared various hyperparameter optimization (HPO) algorithms, such as genetic algorithms, Random Search, Grid Search, and Optuna, to further improve the accuracy of the models. The website framework contains various other features, such as customizable diet plans, workout plans, and a dashboard to track progress. The framework is built using Python Flask. Furthermore, a weighing scale using the Internet of Things (IoT) is also integrated into the framework to track calories and macronutrients from food intake.

【10】 Distributionally Robust Learning (Link: https://arxiv.org/abs/2108.08993)

Authors: Ruidi Chen, Ioannis Ch. Paschalidis
Affiliations: Boston University
Abstract: This monograph develops a comprehensive statistical learning framework that is robust to (distributional) perturbations in the data using Distributionally Robust Optimization (DRO) under the Wasserstein metric. Beginning with fundamental properties of the Wasserstein metric and the DRO formulation, we explore duality to arrive at tractable formulations and develop finite-sample, as well as asymptotic, performance guarantees. We consider a series of learning problems, including (i) distributionally robust linear regression; (ii) distributionally robust regression with group structure in the predictors; (iii) distributionally robust multi-output regression and multiclass classification; (iv) optimal decision making that combines distributionally robust regression with nearest-neighbor estimation; (v) distributionally robust semi-supervised learning; and (vi) distributionally robust reinforcement learning. A tractable DRO relaxation for each problem is derived, establishing a connection between robustness and regularization, and obtaining bounds on the prediction and estimation errors of the solution. Beyond theory, we include numerical experiments and case studies using synthetic and real data. The real data experiments are all associated with various health informatics problems, an application area which provided the initial impetus for this work.
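The central object of the monograph can be written as the following min-max problem (one standard way to state it; the notation here is ours):

```latex
\min_{\theta} \; \sup_{Q \,:\, W_p(Q,\, \hat{P}_N) \le \varepsilon} \; \mathbb{E}_{Q}\left[ \ell(\theta; X, Y) \right]
```

Here $\hat{P}_N$ is the empirical distribution of the $N$ training samples, $W_p$ is the order-$p$ Wasserstein distance, and $\varepsilon$ is the ambiguity radius. Under suitable conditions on the loss $\ell$, duality reduces the inner supremum to an empirical risk plus a norm penalty on $\theta$, which is the robustness-regularization connection the abstract mentions.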

【11】 Structure Learning for Directed Trees (Link: https://arxiv.org/abs/2108.08871)

Authors: Martin Emil Jakobsen, Rajen D. Shah, Peter Bühlmann, Jonas Peters
Affiliations: University of Copenhagen, Denmark; University of Cambridge, United Kingdom; ETH Zurich, Switzerland
Note: 84 pages, 17 figures
Abstract: Knowing the causal structure of a system is of fundamental interest in many areas of science and can aid the design of prediction algorithms that work well under manipulations to the system. The causal structure becomes identifiable from the observational distribution under certain restrictions. To learn the structure from data, score-based methods evaluate different graphs according to the quality of their fits. However, for large nonlinear models, these rely on heuristic optimization approaches with no general guarantees of recovering the true causal structure. In this paper, we consider structure learning of directed trees. We propose a fast and scalable method based on Chu-Liu-Edmonds' algorithm, which we call causal additive trees (CAT). For the case of Gaussian errors, we prove consistency in an asymptotic regime with a vanishing identifiability gap. We also introduce a method for testing substructure hypotheses with asymptotic family-wise error rate control that is valid post-selection and in unidentified settings. Furthermore, we study the identifiability gap, which quantifies how much better the true causal model fits the observational distribution, and prove that it is lower bounded by local properties of the causal model. Simulation studies demonstrate the favorable performance of CAT compared to competing structure learning methods.
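The combinatorial core, a minimum-weight spanning arborescence via Chu-Liu-Edmonds, is available off the shelf; the random edge scores below are a stand-in assumption for the edge weights CAT would derive from (nonparametric) regression fits:

```python
import networkx as nx
import numpy as np

# Hypothetical edge costs: score[i, j] = cost of making j a child of i,
# e.g. derived from the goodness-of-fit of regressing variable j on variable i.
rng = np.random.default_rng(0)
d = 5
score = rng.random((d, d))

G = nx.DiGraph()
for i in range(d):
    for j in range(d):
        if i != j:
            G.add_edge(i, j, weight=score[i, j])

# Chu-Liu-Edmonds: minimum-weight directed spanning tree (arborescence).
tree = nx.minimum_spanning_arborescence(G)
print(sorted(tree.edges()))
```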

Other (10 papers)

【1】 MHealth: An Artificial Intelligence Oriented Mobile Application for Personal Healthcare Support (Link: https://arxiv.org/abs/2108.09277)

Authors: Ismail Ali Afrah, Utku Kose
Affiliations: Suleyman Demirel University, Turkey
Abstract: The main objective of this study is to introduce an expert-system-based mHealth application with Artificial Intelligence support, designed by considering previously introduced solutions from the literature and the requirements for a better solution. As a result of this research, a mobile software system with Artificial Intelligence support, providing dynamic assistance for common health problems in daily life, was designed and developed, and it was evaluated via survey and diagnosis-based evaluation tasks. The evaluation tasks indicated positive outcomes for the mHealth system.

【2】 Practical and Fast Momentum-Based Power Methods (Link: https://arxiv.org/abs/2108.09264)

Authors: Tahseen Rabbani, Apollo Jain, Arjun Rajkumar, Furong Huang
Affiliations: Department of Computer Science, University of Maryland, College Park, MD; Systems and Technology Research, Sensors Division, Arlington, VA
Abstract: The power method is a classical algorithm with broad applications in machine learning tasks, including streaming PCA, spectral clustering, and low-rank matrix approximation. The distilled purpose of the vanilla power method is to determine the largest eigenvalue (in absolute modulus) and its eigenvector of a matrix. A momentum-based scheme can be used to accelerate the power method, but achieving an optimal convergence rate with existing algorithms critically relies on additional spectral information that is unavailable at run-time, and sub-optimal initializations can result in divergence. In this paper, we provide a pair of novel momentum-based power methods, which we call the delayed momentum power method (DMPower) and a streaming variant, the delayed momentum streaming method (DMStream). Our methods leverage inexact deflation and are capable of achieving near-optimal convergence with far less restrictive hyperparameter requirements. We provide convergence analyses for both algorithms through the lens of perturbation theory. Further, we experimentally demonstrate that DMPower routinely outperforms the vanilla power method and that both algorithms match the convergence speed of an oracle running existing accelerated methods with perfect spectral knowledge.
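The underlying heavy-ball recursion, x_{k+1} = A x_k - beta * x_{k-1} with renormalization, is easy to sketch; choosing beta well without spectral side-information is the hard part that DMPower addresses via delayed inexact deflation, which is omitted here:

```python
import numpy as np

def momentum_power_method(A, beta, num_iters=200, seed=0):
    """Power iteration with momentum; returns a Rayleigh-quotient estimate of
    the top eigenvalue and the corresponding eigenvector estimate. A small
    beta recovers behaviour close to vanilla power iteration."""
    rng = np.random.default_rng(seed)
    x_prev = np.zeros(A.shape[0])
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(num_iters):
        x_next = A @ x - beta * x_prev
        norm = np.linalg.norm(x_next)
        x_prev = x / norm            # rescale the pair by the same factor
        x = x_next / norm
    return x @ A @ x, x

# Demo on a random symmetric matrix; compare against the true top eigenvalue.
M = np.random.default_rng(1).standard_normal((50, 50))
A = (M + M.T) / 2
print(momentum_power_method(A, beta=1.0)[0], np.linalg.eigvalsh(A)[-1])
```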

【3】 Parsing Birdsong with Deep Audio Embeddings (Link: https://arxiv.org/abs/2108.09203)

Authors: Irina Tolkova, Brian Chu, Marcel Hedman, Stefan Kahl, Holger Klinck
Affiliations: School of Engineering and Applied Sciences, Harvard University, Cambridge, MA; K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY
Note: IJCAI 2021 Artificial Intelligence for Social Good (AI4SG) Workshop
Abstract: Monitoring of bird populations has played a vital role in conservation efforts and in understanding biodiversity loss. The automation of this process has been facilitated by both sensing technologies, such as passive acoustic monitoring, and accompanying analytical tools, such as deep learning. However, machine learning models frequently have difficulty generalizing to examples not encountered in the training data. In our work, we present a semi-supervised approach to identify characteristic calls and environmental noise. We utilize several methods to learn a latent representation of audio samples, including a convolutional autoencoder and two pre-trained networks, and group the resulting embeddings for a domain expert to identify cluster labels. We show that our approach can improve classification precision and provide insight into the latent structure of environmental acoustic datasets.

【4】 VAE-CE: Visual Contrastive Explanation using Disentangled VAEs (Link: https://arxiv.org/abs/2108.09159)

Authors: Yoeri Poels, Vlado Menkovski
Affiliations: Eindhoven University of Technology, the Netherlands
Abstract: The goal of a classification model is to assign the correct labels to data. In most cases, this data is not fully described by the given set of labels. Often a rich set of meaningful concepts exists in the domain that can much more precisely describe each datapoint. Such concepts can also be highly useful for interpreting the model's classifications. In this paper we propose a model, denoted as Variational Autoencoder-based Contrastive Explanation (VAE-CE), that represents data with high-level concepts and uses this representation for both classification and generating explanations. The explanations are produced in a contrastive manner, conveying why a datapoint is assigned to one class rather than an alternative class. An explanation is specified as a set of transformations of the input datapoint, with each step depicting a concept changing towards the contrastive class. We build the model using a disentangled VAE, extended with a new supervised method for disentangling individual dimensions. An analysis on synthetic data and MNIST shows that the approaches to both disentanglement and explanation provide benefits over other methods.

【5】 User Localization Based on Call Detail Records (Link: https://arxiv.org/abs/2108.09157)

Authors: Buddhi Ayesha, Bhagya Jeewanthi, Charith Chitraranjan, Amal Shehan Perera, Amal S. Kumarage
Affiliations: Department of Computer Science & Engineering, University of Moratuwa; Department of Transport & Logistics Management, University of Moratuwa
Abstract: Understanding human mobility is essential for many fields, including transportation planning. Currently, surveys are the primary source for such analysis. However, in the recent past, many researchers have focused on Call Detail Records (CDRs) for identifying travel patterns. CDRs have shown correlation to human mobility behavior. However, one of the main issues in using CDR data is that it is difficult to identify the precise location of the user due to the low spatial resolution of the data and other artifacts such as the load sharing effect. Existing approaches have certain limitations. Previous studies using CDRs do not consider the transmit power of cell towers when localizing the users and use an oversimplified approach to identify load sharing effects. Furthermore, they consider the entire population of users as one group, neglecting the differences in mobility patterns of different segments of users. This research introduces a novel methodology for user position localization from CDRs through improved detection of load sharing effects, by taking the transmit power into account, and by segmenting the users into distinct groups for the purpose of learning any parameters of the model. Moreover, this research uses several methods to address the existing limitations and validates the generated results using nearly 4 billion CDR data points with travel survey data and voluntarily collected mobile data.

【6】 Group-based Distinctive Image Captioning with Memory Attention (Link: https://arxiv.org/abs/2108.09151)

Authors: Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
Affiliations: Department of Computer Science, City University of Hong Kong; Aerospace Information Research Institute, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Baidu Research
Comments: Accepted at ACM MM 2021 (oral)
Abstract: Describing images using natural language is widely known as image captioning, which has made consistent progress thanks to advances in computer vision and natural language generation. Though conventional captioning models achieve high accuracy on popular metrics, i.e., BLEU, CIDEr, and SPICE, the ability of captions to distinguish the target image from other similar images is under-explored. To generate distinctive captions, a few pioneering works employ contrastive learning or re-weight the ground-truth captions, focusing on a single input image. However, the relationships between objects in a similar image group (e.g., items or properties within the same album, or fine-grained events) are neglected. In this paper, we improve the distinctiveness of image captions using a Group-based Distinctive Captioning Model (GdisCap), which compares each image with the other images in a similar group and highlights the uniqueness of each image. In particular, we propose a group-based memory attention (GMA) module, which stores object features that are unique within the image group (i.e., with low similarity to objects in other images). These unique object features are highlighted when generating captions, resulting in more distinctive captions. Furthermore, the distinctive words in the ground-truth captions are selected to supervise the language decoder and the GMA. Finally, we propose a new evaluation metric, the distinctive word rate (DisWordRate), to measure the distinctiveness of captions. Quantitative results indicate that the proposed method significantly improves the distinctiveness of several baseline models, and achieves state-of-the-art performance on both accuracy and distinctiveness. Results of a user study agree with the quantitative evaluation and demonstrate the rationality of the new DisWordRate metric.
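As a rough illustration of the uniqueness idea behind the group-based memory attention module, the sketch below scores each object feature of a target image by its dissimilarity to the objects of the other images in the group; objects with no close match get the largest weight. The tensor shapes and the softmax weighting are assumptions for illustration, not the paper's exact GMA formulation.

```python
import torch
import torch.nn.functional as F

def uniqueness_weights(obj_feats, group_feats):
    """obj_feats: (N, d) region features of the target image;
    group_feats: (M, d) region features pooled from the other images in the
    group. Objects with no close match in the group get the largest weight."""
    sim = F.cosine_similarity(obj_feats.unsqueeze(1),
                              group_feats.unsqueeze(0), dim=-1)  # (N, M)
    closest = sim.max(dim=1).values               # best match per object
    return torch.softmax(1.0 - closest, dim=0)    # emphasize low-similarity objects
```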

【7】 Airbert: In-domain Pretraining for Vision-and-Language Navigation Link: https://arxiv.org/abs/2108.09105

Authors: Pierre-Louis Guhur, Makarand Tapaswi, Shizhe Chen, Ivan Laptev, Cordelia Schmid
Affiliations: Inria, École normale supérieure, CNRS, PSL Research University, Paris, France; IIIT Hyderabad, India
Comments: To be published at ICCV 2021. Webpage is at this https URL linking to our dataset, codes and models
Abstract: Vision-and-language navigation (VLN) aims to enable embodied agents to navigate in realistic environments using natural language instructions. Given the scarcity of domain-specific training data and the high diversity of image and language inputs, the generalization of VLN agents to unseen environments remains challenging. Recent methods explore pretraining to improve generalization; however, the use of generic image-caption datasets or existing small-scale VLN environments is suboptimal and results in limited improvements. In this work, we introduce BnB, a large-scale and diverse in-domain VLN dataset. We first collect image-caption (IC) pairs from hundreds of thousands of listings on online rental marketplaces. Using the IC pairs, we next propose automatic strategies to generate millions of VLN path-instruction (PI) pairs. We further propose a shuffling loss that improves the learning of temporal order inside PI pairs. We use BnB to pretrain our Airbert model, which can be adapted to discriminative and generative settings, and show that it outperforms the state of the art on the Room-to-Room (R2R) navigation and Remote Referring Expression (REVERIE) benchmarks. Moreover, our in-domain pretraining significantly increases performance on a challenging few-shot VLN evaluation, where we train the model only on VLN instructions from a few houses.
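The shuffling loss is described only at a high level, but a common way to implement such an objective is to score the correctly ordered path-instruction pair against copies of the instruction whose sub-instructions have been shuffled. The sketch below assumes a hypothetical `model` that returns a scalar alignment score; the number of negatives and the cross-entropy formulation are illustrative choices, not the paper's confirmed design.

```python
import random
import torch
import torch.nn.functional as F

def shuffling_loss(model, path_feats, sub_instructions, num_neg=3):
    """Score the true sub-instruction order against shuffled versions of the
    same instruction; `model` is assumed to return a scalar alignment score."""
    scores = [model(path_feats, sub_instructions)]        # positive: true order
    for _ in range(num_neg):
        shuffled = sub_instructions[:]
        random.shuffle(shuffled)                          # negative: broken order
        scores.append(model(path_feats, shuffled))
    logits = torch.stack(scores).unsqueeze(0)             # (1, 1 + num_neg)
    target = torch.zeros(1, dtype=torch.long)             # true order is class 0
    return F.cross_entropy(logits, target)
```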

【8】 Risk Bounds and Calibration for a Smart Predict-then-Optimize Method Link: https://arxiv.org/abs/2108.08887

Authors: Heyuan Liu, Paul Grigas
Affiliations: Department of Industrial Engineering and Operations Research, University of California, Berkeley
Abstract: The predict-then-optimize framework is fundamental in practical stochastic decision-making problems: first predict unknown parameters of an optimization model, then solve the problem using the predicted values. A natural loss function in this setting is defined by measuring the decision error induced by the predicted parameters, which was named the Smart Predict-then-Optimize (SPO) loss by Elmachtoub and Grigas [arXiv:1710.08005]. Since the SPO loss is typically nonconvex and possibly discontinuous, Elmachtoub and Grigas [arXiv:1710.08005] introduced a convex surrogate, called the SPO+ loss, that importantly accounts for the underlying structure of the optimization model. In this paper, we greatly expand upon the consistency results for the SPO+ loss provided by Elmachtoub and Grigas [arXiv:1710.08005]. We develop risk bounds and uniform calibration results for the SPO+ loss relative to the SPO loss, which provide a quantitative way to transfer the excess surrogate risk to excess true risk. By combining our risk bounds with generalization bounds, we show that the empirical minimizer of the SPO+ loss achieves low excess true risk with high probability. We first demonstrate these results in the case when the feasible region of the underlying optimization problem is a polyhedron, and then we show that the results can be strengthened substantially when the feasible region is a level set of a strongly convex function. We perform experiments to empirically demonstrate the strength of the SPO+ surrogate, as compared to standard $\ell_1$ and squared $\ell_2$ prediction error losses, on portfolio allocation and cost-sensitive multi-class classification problems.
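For readers unfamiliar with the SPO+ surrogate of Elmachtoub and Grigas, the sketch below evaluates it for a linear objective minimized over a polyhedron S = {w : A_ub w <= b_ub}, using the form max_{w in S} (c - 2 c_hat)^T w + 2 c_hat^T w*(c) - z*(c). This scipy-based rendering is our own illustration, not code from the paper, and it assumes S is a bounded polytope so both linear programs have finite optima.

```python
import numpy as np
from scipy.optimize import linprog

def spo_plus_loss(c_hat, c, A_ub, b_ub):
    """l_SPO+(c_hat, c) = max_{w in S} (c - 2 c_hat)^T w + 2 c_hat^T w*(c) - z*(c),
    with S = {w : A_ub w <= b_ub} and z*(c) = min_{w in S} c^T w."""
    free = [(None, None)] * len(c)                # no sign restriction on w
    true_opt = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=free, method="highs")
    w_star, z_star = true_opt.x, true_opt.fun     # decision under the true cost
    # max over S of (c - 2 c_hat)^T w  ==  -min over S of (2 c_hat - c)^T w
    max_term = -linprog(2 * c_hat - c, A_ub=A_ub, b_ub=b_ub,
                        bounds=free, method="highs").fun
    return max_term + 2 * c_hat @ w_star - z_star
```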

【9】 Topo2vec: Topography Embedding Using the Fractal Effect Link: https://arxiv.org/abs/2108.08870

Authors: Jonathan Kavitzky, Jonathan Zarecki, Idan Brusilovsky, Uriel Singer
Affiliations: Penta-AI; HUJI: The Hebrew University, Jerusalem, Israel; Bar-Ilan University, Ramat Gan, Israel; The Open University, Ra'anana, Israel; Technion—Israel Institute of Technology, Haifa, Israel
Comments: 9 pages, 6 figures, 2 tables, 1 algorithm
Abstract: Recent advances in deep learning have transformed many fields by introducing generic embedding spaces capable of achieving strong predictive performance with minimal labeling effort. The geology field has not yet met such success. In this work, we introduce an extension of self-supervised learning techniques tailored to exploiting the fractal effect in remote-sensing images. The fractal effect assumes that the same structures (for example rivers, peaks, and saddles) appear at all scales. We demonstrate our method's effectiveness on elevation data and also use the effect at inference time. We perform an extensive analysis on several classification tasks and emphasize the method's effectiveness in detecting the same class at different scales. To the best of our knowledge, this is the first attempt to build a generic representation for topographic images.
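As one way to picture the fractal-effect assumption, the sketch below crops the same elevation tile at several scales and resizes the crops to a common resolution; a contrastive objective could then treat these multi-scale views as positives of one another. The scales, crop policy, and nearest-neighbour resize are illustrative assumptions rather than the paper's pipeline, and the tile is assumed to be at least `out` pixels per side.

```python
import numpy as np

def multiscale_views(tile, scales=(1.0, 0.5, 0.25), out=64, rng=np.random):
    """Crop one elevation tile at several scales and resize the crops to a
    common resolution; under the fractal-effect assumption all views show the
    same kinds of structures and can serve as positives for each other."""
    h, w = tile.shape
    views = []
    for s in scales:
        ch, cw = max(int(h * s), out), max(int(w * s), out)
        y, x = rng.randint(0, h - ch + 1), rng.randint(0, w - cw + 1)
        crop = tile[y:y + ch, x:x + cw]
        yy = np.linspace(0, ch - 1, out).astype(int)   # nearest-neighbour resize,
        xx = np.linspace(0, cw - 1, out).astype(int)   # dependency-free
        views.append(crop[np.ix_(yy, xx)])
    return views
```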

【10】 State-Of-The-Art Algorithms For Low-Rank Dynamic Mode Decomposition Link: https://arxiv.org/abs/2108.09160

Authors: Patrick Heas, Cedric Herzet
Affiliations: campus universitaire de Beaulieu
Comments: arXiv admin note: substantial text overlap with arXiv:1610.02962
Abstract: This technical note reviews state-of-the-art algorithms for linear approximation of high-dimensional dynamical systems using low-rank dynamic mode decomposition (DMD). While repeating several parts of our article "Low-rank dynamic mode decomposition: an exact and tractable solution", this work provides additional details useful for building a comprehensive picture of state-of-the-art methods.
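For context, a minimal sketch of the standard SVD-truncated (exact) DMD baseline that such reviews typically start from is given below, under the usual snapshot conventions; how this projected operator relates to truly optimal low-rank solutions is the kind of question the note examines. The rank choice and variable names are ours.

```python
import numpy as np

def dmd(X, Xp, r):
    """Rank-r (exact) DMD: X and Xp hold consecutive snapshots as columns,
    with Xp[:, k] the successor state of X[:, k]."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    Ur, Vr = U[:, :r], Vh[:r].conj().T
    Sr_inv = np.diag(1.0 / s[:r])
    A_tilde = Ur.conj().T @ Xp @ Vr @ Sr_inv      # r x r projected operator
    eigvals, W = np.linalg.eig(A_tilde)           # DMD eigenvalues
    modes = Xp @ Vr @ Sr_inv @ W                  # exact DMD modes
    return eigvals, modes
```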
