Artificial Intelligence Academic Digest [6.28]

WeChat official account: arXiv Daily Academic Digest
Published 2021-07-02 17:26:16

cs.AI Artificial Intelligence, 31 papers in total

【1】 Single Image Texture Translation for Data Augmentation

Authors: Boyi Li, Yin Cui, Tsung-Yi Lin, Serge Belongie
Affiliations: Cornell University, Cornell Tech, Google Research, Brain Team
Link: https://arxiv.org/abs/2106.13804
Abstract: Recent advances in image synthesis enable one to translate images by learning the mapping between a source domain and a target domain. Existing methods tend to learn the distributions by training a model on a variety of datasets, with results evaluated largely in a subjective manner. Relatively few works in this area, however, study the potential use of semantic image translation methods for image recognition tasks. In this paper, we explore the use of Single Image Texture Translation (SITT) for data augmentation. We first propose a lightweight model for translating texture to images based on a single input of source texture, allowing for fast training and testing. Based on SITT, we then explore the use of augmented data in long-tailed and few-shot image classification tasks. We find the proposed method is capable of translating input data into a target domain, leading to consistently improved image recognition performance. Finally, we examine how SITT and related image translation methods can provide a basis for a data-efficient, augmentation engineering approach to model training.

【2】 Assessing Generalization of SGD via Disagreement

Authors: Yiding Jiang, Vaishnavh Nagarajan, Christina Baek, J. Zico Kolter
Affiliations: Carnegie Mellon University, Bosch Center for AI, Pittsburgh
Link: https://arxiv.org/abs/2106.13799
Abstract: We empirically show that the test error of deep networks can be estimated by simply training the same architecture on the same training set but with a different run of Stochastic Gradient Descent (SGD), and measuring the disagreement rate between the two networks on unlabeled test data. This builds on, and is a stronger version of, the observation in Nakkiran & Bansal '20, which requires the second run to be on an altogether fresh training set. We further theoretically show that this peculiar phenomenon arises from the well-calibrated nature of ensembles of SGD-trained models. This finding not only provides a simple empirical measure to directly predict the test error using unlabeled test data, but also establishes a new conceptual connection between generalization and calibration.
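The disagreement rate mentioned in this abstract can be illustrated with a minimal sketch. The function name and the toy label lists below are assumptions made for illustration, not the authors' code; the sketch only computes the fraction of unlabeled test points on which two independently trained models predict different labels.

```python
def disagreement_rate(preds_a, preds_b):
    """Fraction of test points on which two models predict different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        raise ValueError("need at least one prediction")
    differing = sum(1 for a, b in zip(preds_a, preds_b) if a != b)
    return differing / len(preds_a)


# Hypothetical predicted labels from two SGD runs of the same architecture
# on the same training set; no ground-truth test labels are needed.
run_1 = [0, 1, 1, 2, 0, 2, 1, 0]
run_2 = [0, 1, 2, 2, 0, 1, 1, 0]
estimate = disagreement_rate(run_1, run_2)  # the two runs differ on 2 of 8 points
```

According to the abstract, this quantity serves as an estimate of the test error; how well the estimate holds depends on the calibration property the paper analyzes.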

【3】 Advancing Methodology for Social Science Research Using Alternate Reality Games: Proof-of-Concept Through Measuring Individual Differences and Adaptability and their impact on Team Performance

Authors: Magy Seif El-Nasr, Casper Harteveld, Paul Fombelle, Truong-Huy Nguyen, Paola Rizzo, Dylan Schouten, Abdelrahman Madkour, Chaima Jemmali, Erica Kleinman, Nithesh Javvaji, Zhaoqing Teng, Extra Ludic Inc
Affiliations: Ludic Inc: Samuel Liberty, Wade Kimbrough
Link: https://arxiv.org/abs/2106.13740
Abstract: While work in the fields of CSCW (Computer Supported Collaborative Work), Psychology and Social Sciences has progressed our understanding of team processes and their effect on performance and effectiveness, current methods rely on observations or self-report, with little work directed towards studying team processes with quantifiable measures based on behavioral data. In this report we discuss work tackling this open problem with a focus on understanding individual differences and their effect on team adaptation, and further explore the effect of these factors on team performance as both an outcome and a process. We specifically discuss our contribution in terms of methods that augment survey data and behavioral data that allow us to gain more insight on team performance as well as develop a method to evaluate adaptation and performance across and within a group. To make this problem more tractable we chose to focus on specific types of environments, Alternate Reality Games (ARGs), for several reasons.
First, these types of games involve setups that are similar to a real-world setup, e.g., communication through slack or email. Second, they are more controllable than real environments allowing us to embed stimuli if needed. Lastly, they allow us to collect data needed to understand decisions and communications made through the entire duration of the experience, which makes team processes more transparent than otherwise possible. In this report we discuss the work we did so far and demonstrate the efficacy of the approach.

【4】 Bayesian Neural Networks: Essentials

Authors: Daniel T. Chang
Link: https://arxiv.org/abs/2106.13594
Abstract: Bayesian neural networks utilize probabilistic layers that capture uncertainty over weights and activations, and are trained using Bayesian inference. Since these probabilistic layers are designed to be drop-in replacements for their deterministic counterparts, Bayesian neural networks provide a direct and natural way to extend conventional deep neural networks to support probabilistic deep learning. However, it is nontrivial to understand, design and train Bayesian neural networks due to their complexities. We discuss the essentials of Bayesian neural networks including duality (deep neural networks, probabilistic models), approximate Bayesian inference, Bayesian priors, Bayesian posteriors, and deep variational learning. We use TensorFlow Probability APIs and code examples for illustration. The main problem with Bayesian neural networks is that the architecture of deep neural networks makes it quite redundant, and costly, to account for uncertainty for a large number of successive layers. Hybrid Bayesian neural networks, which use a few probabilistic layers judiciously positioned in the networks, provide a practical solution.

【5】 Fostering Diversity in Spatial Evolutionary Generative Adversarial Networks

Authors: Jamal Toutouh, Erik Hemberg, Una-May O'Reilly
Affiliations: Massachusetts Institute of Technology, Cambridge, MA, USA
Comments: Accepted for presentation at the Conference of the Spanish Association of Artificial Intelligence (CAEPIA 2021). arXiv admin note: substantial text overlap with arXiv:1905.12702
Link: https://arxiv.org/abs/2106.13590
Abstract: Generative adversarial networks (GANs) suffer from training pathologies such as instability and mode collapse, which mainly arise from a lack of diversity in their adversarial interactions. Co-evolutionary GAN (CoE-GAN) training algorithms have been shown to be resilient to these pathologies. This article introduces Mustangs, a spatially distributed CoE-GAN, which fosters diversity by using different loss functions during training. Experimental analysis on MNIST and CelebA demonstrated that Mustangs trains statistically more accurate generators.

【6】 Graph Pattern Loss based Diversified Attention Network for Cross-Modal Retrieval

Authors: Xueying Chen, Rong Zhang, Yibing Zhan
Affiliations: Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, China; Hangzhou Dianzi University
Link: https://arxiv.org/abs/2106.13552
Abstract: Cross-modal retrieval aims to enable a flexible retrieval experience by combining multimedia data such as image, video, text, and audio. One core of unsupervised approaches is to mine the correlations among different object representations to achieve satisfactory retrieval performance without requiring expensive labels. In this paper, we propose a Graph Pattern Loss based Diversified Attention Network (GPLDAN) for unsupervised cross-modal retrieval to deeply analyze correlations among representations. First, we propose a diversified attention feature projector that considers the interaction between different representations to generate multiple representations of an instance. Then, we design a novel graph pattern loss to explore the correlations among different representations; in this graph, all possible distances between different representations are considered. In addition, a modality classifier is added to explicitly declare the corresponding modalities of features before fusion and guide the network to enhance discrimination ability. We test GPLDAN on four public datasets. Compared with the state-of-the-art cross-modal retrieval methods, the experimental results demonstrate the performance and competitiveness of GPLDAN.

【7】 Tensor-based framework for training flexible neural networks

Authors: Yassine Zniyed, Konstantin Usevich, Sebastian Miron, David Brie
Affiliations: Université de Lorraine
Comments: 26 pages, 13 figures
Link: https://arxiv.org/abs/2106.13542
Abstract: Activation functions (AFs) are an important part of the design of neural networks (NNs), and their choice plays a predominant role in the performance of a NN. In this work, we are particularly interested in the estimation of flexible activation functions using tensor-based solutions, where the AFs are expressed as a weighted sum of predefined basis functions. To do so, we propose a new learning algorithm which solves a constrained coupled matrix-tensor factorization (CMTF) problem. This technique fuses the first and zeroth order information of the NN, where the first-order information is contained in a Jacobian tensor, following a constrained canonical polyadic decomposition (CPD). The proposed algorithm can handle different decomposition bases. The goal of this method is to compress large pretrained NN models, by replacing subnetworks, i.e., one or multiple layers of the original network, by a new flexible layer. The approach is applied to a pretrained convolutional neural network (CNN) used for character classification.
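The idea of an activation function expressed as a weighted sum of predefined basis functions can be sketched as follows. The basis chosen here (identity, ReLU, tanh) and the function name are assumptions made purely for illustration; the abstract does not specify the basis, and the CMTF algorithm that estimates the weights is not reproduced.

```python
import math

# Hypothetical basis functions; the paper's actual basis is not given in the abstract.
BASIS = (
    lambda x: x,            # identity
    lambda x: max(0.0, x),  # ReLU
    math.tanh,              # hyperbolic tangent
)


def flexible_activation(x, weights, basis=BASIS):
    """Evaluate sum_k weights[k] * basis[k](x) for a scalar input x."""
    if len(weights) != len(basis):
        raise ValueError("need exactly one weight per basis function")
    return sum(w * g(x) for w, g in zip(weights, basis))


# With all weight on the ReLU term, the flexible AF reduces to plain ReLU.
relu_like = flexible_activation(2.0, [0.0, 1.0, 0.0])
```

In this formulation, learning the activation amounts to estimating the weight vector, which is the quantity the tensor-based algorithm in the paper is said to recover.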

【8】 Dealing with Expert Bias in Collective Decision-Making

Authors: Axel Abels, Tom Lenaerts, Vito Trianni, Ann Nowé
Affiliations: Machine Learning Group, Université Libre de Bruxelles, Brussels, Belgium; AI Lab, Vrije Universiteit Brussel, Brussels, Belgium; Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
Link: https://arxiv.org/abs/2106.13539
Abstract: Many real-world problems can be formulated as decision-making problems wherein one must repeatedly make an appropriate choice from a set of alternatives. Expert judgements, whether human or artificial, can help in taking correct decisions, especially when exploration of alternative solutions is costly. As expert opinions might deviate, the problem of finding the right alternative can be approached as a collective decision making problem (CDM). Current state-of-the-art approaches to solve CDM are limited by the quality of the best expert in the group, and perform poorly if experts are not qualified or if they are overly biased, thus potentially derailing the decision-making process. In this paper, we propose a new algorithmic approach based on contextual multi-armed bandit problems (CMAB) to identify and counteract such biased expertise. We explore homogeneous, heterogeneous and polarised expert groups and show that this approach is able to effectively exploit the collective expertise, irrespective of whether the provided advice is directly conducive to good performance, outperforming state-of-the-art methods, especially when the quality of the provided expertise degrades.
Our novel CMAB-inspired approach achieves a higher final performance and does so while converging more rapidly than previous adaptive algorithms, especially when heterogeneous expertise is readily available.

【9】 Multi-Domain Active Learning: A Comparative Study

Authors: Rui He, Shan He, Ke Tang
Affiliations: Department of Computer Science and Engineering, Southern University of Science and Technology
Link: https://arxiv.org/abs/2106.13516
Abstract: Building classifiers on multiple domains is a practical problem in real life. Instead of building classifiers one by one, multi-domain learning (MDL) simultaneously builds classifiers on multiple domains. MDL utilizes the information shared among the domains to improve the performance. As a supervised learning problem, the labeling effort is still high in MDL problems. Usually, this high labeling cost can be relieved by using active learning. Thus, it is natural to utilize active learning to reduce the labeling effort in MDL, and we refer to this setting as multi-domain active learning (MDAL). However, only a few works are built on this setting, and when researchers face this problem, there are no off-the-shelf solutions. Under this circumstance, combining the current multi-domain learning models and single-domain active learning strategies might be a preliminary solution for the MDAL problem. To find out the potential of this preliminary solution, a comparative study over 5 models and 4 selection strategies is made in this paper. To the best of our knowledge, this is the first work that provides a formal definition of MDAL. Besides, this is the first comparative work for the MDAL problem.
From the results, the Multinomial Adversarial Networks (MAN) model with a simple best vs second best (BvSB) uncertainty strategy shows its superiority in most cases. We take this combination as our off-the-shelf recommendation for the MDAL problem.
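The best vs second best (BvSB) uncertainty strategy named above can be sketched in a few lines. This is a generic illustration of the strategy as it is commonly defined (query the unlabeled samples whose two highest class probabilities are closest), not the paper's implementation; the function names and the toy probabilities are assumptions.

```python
def bvsb_margin(probs):
    """Gap between the highest and second-highest class probability."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def select_most_uncertain(pool_probs, k=1):
    """Indices of the k pool samples with the smallest BvSB margin."""
    ranked = sorted(range(len(pool_probs)), key=lambda i: bvsb_margin(pool_probs[i]))
    return ranked[:k]


# Hypothetical class-probability outputs of a classifier on three unlabeled samples.
pool = [
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],  # top two classes nearly tied, smallest margin
    [0.55, 0.30, 0.15],  # intermediate margin
]
picked = select_most_uncertain(pool, k=2)  # samples to send for labeling
```

A small margin means the model can barely separate its two leading candidates, so labeling that sample is expected to be most informative.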

【10】 Branch Prediction as a Reinforcement Learning Problem: Why, How and Case Studies

Authors: Anastasios Zouzias, Kleovoulos Kalaitzidis, Boris Grot
Affiliations: Huawei Technologies, Zurich Research Center, Switzerland; University of Edinburgh, School of Informatics, United Kingdom
Comments: 6 pages, appeared in ML workshop for Computer Architecture and Systems 2021
Link: https://arxiv.org/abs/2106.13429
Abstract: Recent years have seen stagnating improvements to branch predictor (BP) efficacy and a dearth of fresh ideas in branch predictor design, calling for fresh thinking in this area. This paper argues that looking at BP from the viewpoint of Reinforcement Learning (RL) facilitates systematic reasoning about, and exploration of, BP designs. We describe how to apply the RL formulation to branch predictors, show that existing predictors can be succinctly expressed in this formulation, and study two RL-based variants of conventional BPs.

【11】 Federated Graph Classification over Non-IID Graphs

Authors: Han Xie, Jing Ma, Li Xiong, Carl Yang
Affiliations: Department of Computer Science, Emory University
Link: https://arxiv.org/abs/2106.13423
Abstract: Federated learning has emerged as an important paradigm for training machine learning models in different domains. For graph-level tasks such as graph classification, graphs can also be regarded as a special type of data sample, which can be collected and stored in separate local systems. Similar to other domains, multiple local systems, each holding a small set of graphs, may benefit from collaboratively training a powerful graph mining model, such as the popular graph neural networks (GNNs). To provide more motivation towards such endeavors, we analyze real-world graphs from different domains to confirm that they indeed share certain graph properties that are statistically significant compared with random graphs. However, we also find that different sets of graphs, even from the same domain or same dataset, are non-IID regarding both graph structures and node features. To handle this, we propose a graph clustering federated learning (GCFL) framework that dynamically finds clusters of local systems based on the gradients of GNNs, and theoretically justify that such clusters can reduce the structure and feature heterogeneity among graphs owned by the local systems. Moreover, we observe that the gradients of GNNs fluctuate considerably in GCFL, which impedes high-quality clustering, and design a gradient sequence-based clustering mechanism based on dynamic time warping (GCFL+).
Extensive experimental results and in-depth analysis demonstrate the effectiveness of our proposed frameworks.
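Dynamic time warping, on which the gradient sequence-based clustering of GCFL+ is said to rely, can be sketched with the textbook dynamic program below. This is the generic DTW distance, not the paper's clustering mechanism; treating each local system's history as a sequence of scalars (for example, gradient norms per round) is an assumption made for illustration.

```python
def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping distance between two scalar sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of seq_a[:i] against seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both sequences
            )
    return cost[n][m]


# Hypothetical per-round scalar summaries from two local systems; the second
# sequence repeats one value, which DTW absorbs at no cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Because DTW compares whole sequences while tolerating local stretching, it is less sensitive to single-round fluctuations than a distance computed from one gradient snapshot.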

【12】 Building Intelligent Autonomous Navigation Agents

Authors: Devendra Singh Chaplot
Affiliations: Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA. Thesis Committee: Ruslan Salakhutdinov (Chair), Abhinav Gupta, Deva Ramanan, Jitendra Malik
Comments: CMU Ph.D. Thesis, March 2021
Link: https://arxiv.org/abs/2106.13415
Abstract: Breakthroughs in machine learning in the last decade have led to "digital intelligence", i.e. machine learning models capable of learning from vast amounts of labeled data to perform several digital tasks such as speech recognition, face recognition, machine translation and so on. The goal of this thesis is to make progress towards designing algorithms capable of "physical intelligence", i.e. building intelligent autonomous navigation agents capable of learning to perform complex navigation tasks in the physical world involving visual perception, natural language understanding, reasoning, planning, and sequential decision making. Despite several advances in classical navigation methods in the last few decades, current navigation agents struggle at long-term semantic navigation tasks. In the first part of the thesis, we discuss our work on short-term navigation using end-to-end reinforcement learning to tackle challenges such as obstacle avoidance, semantic perception, language grounding, and reasoning.
In the second part, we present a new class of navigation methods based on modular learning and structured explicit map representations, which leverage the strengths of both classical and end-to-end learning methods, to tackle long-term navigation tasks. We show that these methods are able to effectively tackle challenges such as localization, mapping, long-term planning, exploration and learning semantic priors. These modular learning methods are capable of long-term spatial and semantic understanding and achieve state-of-the-art results on various navigation tasks.

【13】 ParaLaw Nets -- Cross-lingual Sentence-level Pretraining for Legal Text Processing

Authors: Ha-Thanh Nguyen, Vu Tran, Phuong Minh Nguyen, Thi-Hai-Yen Vuong, Quan Minh Bui, Chau Minh Nguyen, Binh Tran Dang, Minh Le Nguyen, Ken Satoh
Affiliations: Japan Advanced Institute of Science and Technology, Ishikawa, Japan; University of Engineering and Technology, VNU, Hanoi, Vietnam; National Institute of Informatics, Tokyo, Japan
Comments: Also published in the COLIEE 2021 proceedings
Link: https://arxiv.org/abs/2106.13403
Abstract: Ambiguity is a characteristic of natural language, which makes the expression of ideas flexible. However, in a domain that requires accurate statements, it becomes a barrier. Specifically, a single word can have many meanings and multiple words can have the same meaning. When translating a text into a foreign language, the translator needs to determine the exact meaning of each element in the original sentence to produce the correct translation. From that observation, in this paper, we propose ParaLaw Nets, a pretrained model family using sentence-level cross-lingual information to reduce ambiguity and increase the performance in legal text processing. This approach achieved the best result in the Question Answering task of COLIEE-2021.

【14】 Decomposed Mutual Information Estimation for Contrastive Representation Learning

Authors: Alessandro Sordoni, Nouha Dziri, Hannes Schulz, Geoff Gordon, Phil Bachman, Remi Tachet
Affiliations: Microsoft Research; University of Alberta
Comments: ICML 2021
Link: https://arxiv.org/abs/2106.13401
Abstract: Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.
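The decomposition described in this abstract rests on the chain rule for mutual information. As a worked illustration under assumed notation (the symbols are not taken from the paper): let $x$ and $y$ be the two views, and let $y'$ be a subview of $y$, i.e. a deterministic function of $y$ that carries less information. Then:

```latex
% Chain rule for mutual information (general form):
%   I(x; y', y'') = I(x; y') + I(x; y'' \mid y')
% Special case where y' is a deterministic function of y,
% so that the pair (y', y) carries the same information as y:
\begin{align}
  I(x; y) &= I(x; y', y) \\
          &= \underbrace{I(x; y')}_{\text{unconditional term}}
           + \underbrace{I(x; y \mid y')}_{\text{conditional term}}
\end{align}
```

Each term on the right measures only part of the total MI, which is the property the abstract points to as making the contrastive bounds easier to apply.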

【15】 Interpreting Depression From Question-wise Long-term Video Recording of SDS Evaluation

Authors: Wanqing Xie, Lizhong Liang, Yao Lu, Chen Wang, Jihong Shen, Hui Luo, Xiaofeng Liu
Affiliations: School of Computer Science and Engineering, Sun Yat-sen University; Harbin Engineering University
Comments: Published in IEEE Journal of Biomedical and Health Informatics
Link: https://arxiv.org/abs/2106.13393
Abstract: The Self-Rating Depression Scale (SDS) questionnaire has frequently been used for efficient preliminary depression screening. However, this uncontrolled self-administered measure can easily be affected by careless or deceptive answers, producing results that differ from the clinician-administered Hamilton Depression Rating Scale (HDRS) and the final diagnosis. Clinically, facial expression (FE) and actions play a vital role in clinician-administered evaluation, while FE and actions are underexplored for self-administered evaluations. In this work, we collect a novel dataset of 200 subjects to provide evidence for the validity of self-rating questionnaires with their corresponding question-wise video recording. To automatically interpret depression from the SDS evaluation and the paired video, we propose an end-to-end hierarchical framework for the long-term variable-length video, which is also conditioned on the questionnaire results and the answering time. Specifically, we resort to a hierarchical model which utilizes a 3D CNN for local temporal pattern exploration and a redundancy-aware self-attention (RAS) scheme for question-wise global feature aggregation.
Targeting redundant long-term FE video processing, our RAS is able to effectively exploit the correlations of each video clip within a question set to emphasize the discriminative information and eliminate redundancy based on pairwise feature affinity. Then, the question-wise video feature is concatenated with the questionnaire scores for final depression detection. Our thorough evaluations also show the validity of fusing the SDS evaluation with its video recording, and the superiority of our framework over conventional state-of-the-art temporal modeling methods.

【16】 Towards A Knowledge Graph Based Autonomic Management of Software Defined Networks

Authors: Qianru Zhou, Alasdair J. G. Gray, Stephen McLaughlin
Affiliations: Department of Computer Science, Heriot-Watt University
Link: https://arxiv.org/abs/2106.13367
Abstract: Automatic network management driven by artificial intelligence technologies has been intensely discussed for decades. However, current reports mainly focus on theoretical proposals and architecture designs; work on practical implementations on real-life networks has yet to appear. This paper presents our effort toward implementing a knowledge-graph-driven approach to autonomic network management in software defined networks (SDNs), termed SeaNet. Driven by the ToCo ontology, SeaNet is reprogrammed based on Mininet (an SDN emulator). It consists of three core components: a knowledge graph generator, a SPARQL engine, and a network management API. The knowledge graph generator represents the knowledge in telecommunication network management tasks as a formally represented, ontology-driven model. Expert experience and network management rules can be formalized into the knowledge graph and automatically inferred by the SPARQL engine, and the network management API is able to encapsulate technology-specific details and expose technology-independent interfaces to users. Experiments are carried out to evaluate the proposed work by comparing it with a commercial SDN controller, Ryu, implemented in the same language, Python.
The evaluation results show that SeaNet is considerably faster than Ryu in most circumstances and that the SeaNet code is significantly more compact. Benefiting from RDF reasoning, SeaNet is able to achieve O(1) time complexity on different scales of the knowledge graph, while a traditional database achieves O(n log n) at best. With the developed network management API, SeaNet enables researchers to develop semantic-intelligent applications on their own SDNs.

【17】 CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning

Authors: Daniel McDuff, Yale Song, Jiyoung Lee, Vibhav Vineet, Sai Vemprala, Nicholas Gyde, Hadi Salman, Shuang Ma, Kwanghoon Sohn, Ashish Kapoor
Affiliations: Microsoft, Redmond, USA; Yonsei University, South Korea; MIT, Cambridge, USA
Link: https://arxiv.org/abs/2106.13364
Abstract: The ability to perform causal and counterfactual reasoning is a central property of human intelligence. Decision-making systems that can perform these types of reasoning have the potential to be more generalizable and interpretable. Simulations have helped advance the state-of-the-art in this domain, by providing the ability to systematically vary parameters (e.g., confounders) and generate examples of the outcomes in the case of counterfactual scenarios. However, simulating complex temporal causal events in multi-agent scenarios, such as those that exist in driving and vehicle navigation, is challenging. To help address this, we present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning in the safety-critical context. A core component of our work is to introduce agency, such that it is simple to define and create complex scenarios using high-level definitions. The vehicles then operate with agency to complete these objectives, meaning low-level behaviors need only be controlled if necessary. We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment. Finally, we highlight challenges and opportunities for future work.

【18】 What will it take to generate fairness-preserving explanations?

Authors: Jessica Dai, Sohini Upadhyay, Stephen H. Bach, Himabindu Lakkaraju Affiliations: Brown University, USA; Harvard University Comments: Presented at ICML 2021 Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI Link: https://arxiv.org/abs/2106.13346 Abstract: In situations where explanations of black-box models may be useful, the fairness of the black-box is also often a relevant concern. However, the link between the fairness of the black-box model and the behavior of explanations for the black-box is unclear. We focus on explanations applied to tabular datasets, suggesting that explanations do not necessarily preserve the fairness properties of the black-box algorithm. In other words, explanation algorithms can ignore or obscure critical relevant properties, creating incorrect or misleading explanations. More broadly, we propose future research directions for evaluating and generating explanations such that they are informative and relevant from a fairness perspective.

【19】 Dr. Watson type Artificial Intellect (AI) Systems

Authors: Saveli Goldberg, Stanislav Belyaev, Vladimir Sluchak Comments: 24 pages, 13 figures Link: https://arxiv.org/abs/2106.13322 Abstract: The article proposes a new type of AI system that does not give solutions directly but rather points toward them, prompting the user in a friendly way with questions and adjusting messages. Models of AI-human collaboration can be deduced from the classic literary example of interaction between Mr. Holmes and Dr. Watson from the stories by Conan Doyle, where the highly qualified expert Mr. Holmes answers questions posed by Dr. Watson. Here Mr. Holmes, with his rule-based calculations, logic, and memory management, apparently plays the role of an AI system, and Dr. Watson is the user. Looking into the same Holmes-Watson interaction, we find and promote another model in which the AI behaves like Dr. Watson, who, by asking questions and acting in a particular way, helps Holmes (the AI user) make the right decisions. We call the systems based on this principle "Dr. Watson-type systems." The article describes the properties of such systems and introduces two particular examples: a Patient Management System for intensive care physicians and a Data Error Prevention System.

【20】 More Causes Less Effect: Destructive Interference in Decision Making

Authors: Irina Basieva, Vijitashwa Pandey, Polina Khrennikova Affiliations: International Center for Mathematical Modeling in Physics and Cognitive Science, Linnaeus University, Växjö, Sweden; Industrial and Systems Engineering Department, Oakland University, MI, USA; University of Leicester, United Kingdom Link: https://arxiv.org/abs/2106.13320 Abstract: We present a new experiment demonstrating destructive interference in customers' estimates of conditional probabilities of product failure. We take the perspective of a manufacturer of consumer products, and consider two situations of cause and effect. Whereas individually the effect of the causes is similar, it is observed that when combined, the two causes produce the opposite effect. Such negative interference of two or more reasons may be exploited for better modeling the cognitive processes taking place in the customers' minds. Doing so can enhance the likelihood that a manufacturer will be able to design a better product, or a feature within it. Quantum probability has been used to explain some commonly observed deviations such as question order and response replicability effects, as well as to explain paradoxes such as violations of the sure-thing principle and the Machina and Ellsberg paradoxes. In this work, we present results from a survey conducted regarding the effect of multiple observed symptoms on the drivability of a vehicle. We demonstrate that the set of responses cannot be explained using classical probability, but the quantum formulation easily models it, as it allows for both positive and negative "interference" between events. Since the quantum formalism also accounts for classical probability's predictions, it serves as a richer paradigm for modeling decision-making behavior in engineering design and behavioral economics.

【21】 A variational autoencoder approach for choice set generation and implicit perception of alternatives in choice modeling

Authors: Rui Yao, Shlomo Bekhor Affiliations: Department of Civil and Environmental Engineering, Technion – Israel Institute of Technology, Haifa, Israel Link: https://arxiv.org/abs/2106.13319 Abstract: This paper derives the generalized extreme value (GEV) model with implicit availability/perception (IAP) of alternatives and proposes a variational autoencoder (VAE) approach for choice set generation and implicit perception of alternatives. Specifically, the cross-nested logit (CNL) model with IAP is derived as an example of IAP-GEV models. The VAE approach is adapted to model the choice set generation process, in which the likelihood of perceiving chosen alternatives in the choice set is maximized. The VAE approach for route choice set generation is exemplified using a real dataset. The estimated IAP-CNL model has the best goodness-of-fit and prediction performance compared to multinomial logit models and conventional choice set generation methods.
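For readers unfamiliar with the multinomial logit baseline mentioned in this abstract, the sketch below computes logit choice probabilities as a softmax over alternative utilities. The utilities and the `availability` weights are made-up illustration values, and the availability-weighted variant is only a generic way to down-weight poorly perceived alternatives; it is an assumption for illustration, not the IAP-CNL model derived in the paper.

```python
import math

def mnl_probabilities(utilities, availability=None):
    """Multinomial logit: P(i) = a_i * exp(V_i) / sum_j a_j * exp(V_j).

    `availability` holds hypothetical weights in [0, 1]; all ones gives plain MNL.
    """
    if availability is None:
        availability = [1.0] * len(utilities)
    v_max = max(utilities)  # subtract the max utility for numerical stability
    weights = [a * math.exp(v - v_max) for v, a in zip(utilities, availability)]
    total = sum(weights)
    return [w / total for w in weights]

# Three hypothetical routes with systematic utilities (e.g. negative travel cost).
route_utilities = [-1.0, -1.5, -3.0]
plain = mnl_probabilities(route_utilities)
# Same utilities, but the third route is only weakly perceived by the traveler.
weighted = mnl_probabilities(route_utilities, availability=[1.0, 1.0, 0.2])
```

Lowering an alternative's weight shifts probability mass toward the other alternatives without removing it from the choice set outright.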

【22】 Promises and Pitfalls of Black-Box Concept Learning Models

Authors: Anita Mahinpei, Justin Clark, Isaac Lage, Finale Doshi-Velez, Weiwei Pan Affiliations: Harvard University Link: https://arxiv.org/abs/2106.13314 Abstract: Machine learning models that incorporate concept learning as an intermediate step in their decision-making process can match the performance of black-box predictive models while retaining the ability to explain outcomes in human-understandable terms. However, we demonstrate that the concept representations learned by these models encode information beyond the pre-defined concepts, and that natural mitigation strategies do not fully work, rendering the interpretation of the downstream prediction misleading. We describe the mechanism underlying the information leakage and suggest recourse for mitigating its effects.

【23】 Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation

Authors: C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, Olivier Bachem Affiliations: Google Research Comments: 9 pages + 12 pages of appendices and references. In submission at NeurIPS 2021 Datasets and Benchmarks Track Link: https://arxiv.org/abs/2106.13281 Abstract: We present Brax, an open source library for rigid body simulation with a focus on performance and parallelism on accelerators, written in JAX. We present results on a suite of tasks inspired by the existing reinforcement learning literature, but remade in our engine. Additionally, we provide reimplementations of PPO, SAC, ES, and direct policy optimization in JAX that compile alongside our environments, allowing the learning algorithm and the environment processing to occur on the same device, and to scale seamlessly on accelerators. Finally, we include notebooks that facilitate training of performant policies on common OpenAI Gym MuJoCo-like tasks in minutes.

【24】 Multi-Robot Deep Reinforcement Learning for Mobile Navigation

Authors: Katie Kang, Gregory Kahn, Sergey Levine Affiliations: University of California, Berkeley Link: https://arxiv.org/abs/2106.13280 Abstract: Deep reinforcement learning algorithms require large and diverse datasets in order to learn successful policies for perception-based mobile navigation. However, gathering such datasets with a single robot can be prohibitively expensive. Collecting data with multiple different robotic platforms with possibly different dynamics is a more scalable approach to large-scale data collection. But how can deep reinforcement learning algorithms leverage such heterogeneous datasets? In this work, we propose a deep reinforcement learning algorithm with hierarchically integrated models (HInt). At training time, HInt learns separate perception and dynamics models, and at test time, HInt integrates the two models in a hierarchical manner and plans actions with the integrated model. This method of planning with hierarchically integrated models allows the algorithm to train on datasets gathered by a variety of different platforms, while respecting the physical capabilities of the deployment robot at test time. Our mobile navigation experiments show that HInt outperforms conventional hierarchical policies and single-source approaches.

【25】 You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks

Authors: Eli Chien, Chao Pan, Jianhao Peng, Olgica Milenkovic Affiliations: Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign Link: https://arxiv.org/abs/2106.13264 Abstract: Hypergraphs are used to model higher-order interactions amongst agents and there exist many practically relevant instances of hypergraph datasets. To enable efficient processing of hypergraph-structured data, several hypergraph neural network platforms have been proposed for learning hypergraph properties and structure, with a special focus on node classification. However, almost all existing methods use heuristic propagation rules and offer suboptimal performance on many datasets. We propose AllSet, a new hypergraph neural network paradigm that represents a highly general framework for (hyper)graph neural networks and for the first time implements hypergraph neural network layers as compositions of two multiset functions that can be efficiently learned for each task and each dataset. Furthermore, AllSet draws on new connections between hypergraph neural networks and recent advances in deep learning of multiset functions. In particular, the proposed architecture utilizes Deep Sets and Set Transformer architectures that allow for significant modeling flexibility and offer high expressive power. To evaluate the performance of AllSet, we conduct the most extensive experiments to date involving ten known benchmarking datasets and three newly curated datasets that represent significant challenges for hypergraph node classification. The results demonstrate that AllSet has the unique ability to consistently either match or outperform all other hypergraph neural networks across the tested datasets. Our implementation and dataset will be released upon acceptance.
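The "composition of two multiset functions" idea in this abstract can be illustrated with a Deep Sets-style sum decomposition: one permutation-invariant function maps the node features in each hyperedge to a hyperedge feature, and a second maps the incident hyperedge features back to each node. The sketch below is a toy under stated assumptions: the functions `phi` and `rho` are fixed hand-written maps rather than learned networks, all names are hypothetical, and it does not reproduce the AllSet architecture.

```python
import math

def multiset_fn(items, phi, rho):
    """Deep Sets-style multiset function: rho(sum over x of phi(x))."""
    mapped = [phi(x) for x in items]
    pooled = [sum(column) for column in zip(*mapped)]  # order-independent pooling
    return rho(pooled)

def two_step_layer(node_feats, hyperedges, f_v2e, f_e2v):
    """One layer: nodes -> hyperedges, then hyperedges -> nodes."""
    # Step 1: each hyperedge aggregates the multiset of its member node features.
    edge_feats = [f_v2e([node_feats[v] for v in edge]) for edge in hyperedges]
    # Step 2: each node aggregates the features of the hyperedges that contain it.
    new_feats = []
    for v, old in enumerate(node_feats):
        incident = [edge_feats[i] for i, edge in enumerate(hyperedges) if v in edge]
        new_feats.append(f_e2v(incident) if incident else list(old))
    return new_feats

# Fixed toy stand-ins for what would be learned functions in a real model.
def f_v2e(items):
    return multiset_fn(items, phi=lambda x: list(x),
                       rho=lambda s: [math.tanh(z) for z in s])

def f_e2v(items):
    return multiset_fn(items, phi=lambda x: [2.0 * z for z in x],
                       rho=lambda s: list(s))

nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]]
edges = [[0, 1, 2], [2, 3]]
out = two_step_layer(nodes, edges, f_v2e, f_e2v)
# Listing the members of each hyperedge in a different order changes nothing.
out_permuted = two_step_layer(nodes, [[2, 0, 1], [3, 2]], f_v2e, f_e2v)
```

Because pooling is a sum, the result does not depend on the order in which hyperedge members are listed, which is the property a multiset function needs.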

【26】 Modeling the Mistakes of Boundedly Rational Agents Within a Bayesian Theory of Mind

Authors: Arwa Alanqary, Gloria Z. Lin, Joie Le, Tan Zhi-Xuan, Vikash K. Mansinghka, Joshua B. Tenenbaum Affiliations: Department of Brain and Cognitive Sciences, MIT Comments: Accepted to CogSci 2021. 6 pages, 5 figures. (Appendix: 1 page, 1 figure) Link: https://arxiv.org/abs/2106.13249 Abstract: When inferring the goals that others are trying to achieve, people intuitively understand that others might make mistakes along the way. This is crucial for activities such as teaching, offering assistance, and deciding between blame or forgiveness. However, Bayesian models of theory of mind have generally not accounted for these mistakes, instead modeling agents as mostly optimal in achieving their goals. As a result, they are unable to explain phenomena like locking oneself out of one's house, or losing a game of chess. Here, we extend the Bayesian Theory of Mind framework to model boundedly rational agents who may have mistaken goals, plans, and actions. We formalize this by modeling agents as probabilistic programs, where goals may be confused with semantically similar states, plans may be misguided due to resource-bounded planning, and actions may be unintended due to execution errors. We present experiments eliciting human goal inferences in two domains: (i) a gridworld puzzle with gems locked behind doors, and (ii) a block-stacking domain. Our model better explains human inferences than alternatives, while generalizing across domains. These findings indicate the importance of modeling others as bounded agents, in order to account for the full richness of human intuitive psychology.

【27】 A fuzzy take on the logical issues of statistical hypothesis testing

Authors: Matthew Booth, Fabien Paillusson Affiliations: School of Mathematics and Physics, University of Lincoln Comments: 15 pages, 3 figures. Preprint version of an amended version of the article published in Philosophies 2021, 6(1), 21 Link: https://arxiv.org/abs/2106.13241 Abstract: Statistical Hypothesis Testing (SHT) is a class of inference methods whereby one makes use of empirical data to test a hypothesis and often emits a judgment about whether to reject it or not. In this paper we focus on the logical aspect of this strategy, which is largely independent of the adopted school of thought, at least within the various frequentist approaches. We identify SHT as taking the form of an unsound argument from Modus Tollens in classical logic, and, in order to rescue SHT from this difficulty, we propose that it can instead be grounded in t-norm based fuzzy logics. We reformulate the frequentists' SHT logic by making use of a fuzzy extension of Modus Tollens to develop a model of truth valuation for its premises. Importantly, we show that it is possible to preserve the soundness of Modus Tollens by exploring the various conventions involved with constructing fuzzy negations and fuzzy implications (namely, the S and R conventions). We find that under the S convention, it is possible to conduct the Modus Tollens inference argument using Zadeh's compositional extension and any possible t-norm. Under the R convention we find that this is not necessarily the case, but that by mixing R-implication with S-negation we can salvage the product t-norm, for example. In conclusion, we have shown that fuzzy logic is a legitimate framework to discuss and address the difficulties plaguing frequentist interpretations of SHT.
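To make the S and R conventions named in this abstract concrete, the sketch below writes out the standard textbook operators built on the product t-norm: the standard negation, the S-implication obtained from the dual t-conorm (the Reichenbach implication), and the R-implication obtained as the residuum (the Goguen implication). This only illustrates the building blocks; it is not the paper's derivation, and the truth values used are arbitrary examples.

```python
def t_norm_product(a, b):
    """Product t-norm: fuzzy conjunction T(a, b) = a * b."""
    return a * b

def negation_standard(a):
    """Standard fuzzy negation N(a) = 1 - a."""
    return 1.0 - a

def s_implication(a, b):
    """S convention: I(a, b) = S(N(a), b) with S the probabilistic sum.

    For the product t-norm this is the Reichenbach implication 1 - a + a*b.
    """
    na = negation_standard(a)
    return na + b - na * b

def r_implication(a, b):
    """R convention: residuum of the product t-norm (Goguen implication)."""
    return 1.0 if a <= b else b / a

# Both implications agree with classical implication on crisp truth values.
crisp_table = [(a, b, s_implication(a, b), r_implication(a, b))
               for a in (0.0, 1.0) for b in (0.0, 1.0)]

# On intermediate truth values the two conventions generally differ.
s_value = s_implication(0.8, 0.4)
r_value = r_implication(0.8, 0.4)
```

The two conventions coincide on crisp inputs and diverge on graded ones, which is why the choice of convention matters when Modus Tollens is extended to fuzzy truth values.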

【28】 Federated Noisy Client Learning

Authors: Li Li, Huazhu Fu, Bo Han, Cheng-Zhong Xu, Ling Shao Affiliations: Shenzhen Institutes of Advanced Technology, CAS; Inception Institute of Artificial Intelligence, UAE; Department of Computer Science, Hong Kong Baptist University; University of Macau Link: https://arxiv.org/abs/2106.13239 Abstract: Federated learning (FL) collaboratively aggregates a shared global model depending on multiple local clients, while keeping the training data decentralized in order to preserve data privacy. However, standard FL methods ignore the noisy client issue, which may harm the overall performance of the aggregated model. In this paper, we first analyze the noisy client statement, and then model noisy clients with different noise distributions (e.g., Bernoulli and truncated Gaussian distributions). To learn with noisy clients, we propose a simple yet effective FL framework, named Federated Noisy Client Learning (Fed-NCL), which is a plug-and-play algorithm and contains two main components: a data quality measurement (DQM) to dynamically quantify the data quality of each participating client, and a noise robust aggregation (NRA) to adaptively aggregate the local models of each client by jointly considering the amount of local training data and the data quality of each client. Our Fed-NCL can be easily applied in any standard FL workflow to handle the noisy client issue. Experimental results on various datasets demonstrate that our algorithm boosts the performance of different state-of-the-art systems with noisy clients.
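The aggregation step described in this abstract, which weighs each client by both its amount of data and its data quality, can be sketched as a weighted parameter average. The abstract does not give the DQM or NRA formulas, so the weighting rule below (weight proportional to sample count times a quality score in [0, 1]) and every name in it are assumptions chosen for illustration, not the Fed-NCL algorithm itself.

```python
def aggregate_models(client_params, num_samples, quality_scores):
    """Average client parameter vectors with weights n_k * q_k (assumed rule).

    With all quality scores equal, this reduces to sample-weighted averaging.
    """
    raw = [n * q for n, q in zip(num_samples, quality_scores)]
    total = sum(raw)
    if total <= 0:
        raise ValueError("at least one client needs a positive weight")
    weights = [r / total for r in raw]
    dim = len(client_params[0])
    merged = [sum(w * params[d] for w, params in zip(weights, client_params))
              for d in range(dim)]
    return merged, weights

# Three hypothetical clients; the third has drifted because of noisy data.
clients = [[1.0, 2.0], [3.0, 4.0], [10.0, -10.0]]
samples = [100, 300, 200]

plain, plain_w = aggregate_models(clients, samples, [1.0, 1.0, 1.0])
robust, robust_w = aggregate_models(clients, samples, [1.0, 1.0, 0.1])
```

Giving the third client a low quality score shrinks its influence on the merged model even though it holds a third of the data.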

【29】 Towards Exploiting Geometry and Time for Fast Off-Distribution Adaptation in Multi-Task Robot Learning

Authors: K. R. Zentner, Ryan Julian, Ujjwal Puri, Yulun Zhang, Gaurav Sukhatme Affiliations: University of Southern California, Los Angeles, CA Comments: Accepted to Challenges of Real World Reinforcement Learning, Virtual Workshop at NeurIPS 2020 Link: https://arxiv.org/abs/2106.13237 Abstract: We explore possible methods for multi-task transfer learning which seek to exploit the shared physical structure of robotics tasks. Specifically, we train policies for a base set of pre-training tasks, then experiment with adapting to new off-distribution tasks, using simple architectural approaches for re-using these policies as black-box priors. These approaches include learning an alignment of either the observation space or action space from a base to a target task to exploit rigid body structure, and methods for learning a time-domain switching policy across base tasks which solves the target task, to exploit temporal coherence. We find that combining low-complexity target policy classes, base policies as black-box priors, and simple optimization algorithms allows us to acquire new tasks outside the base task distribution, using small amounts of offline training data.

【30】 Post Selections Using Test Sets (PSUTS) and How Developmental Networks Avoid Them

Authors: Juyang Weng Affiliations: Department of Computer Science and Engineering, Cognitive Science Program, and Neuroscience Program, Michigan State University, East Lansing, MI, USA; GENISAMA LLC, Okemos, MI, USA Comments: 13 pages, 2 figures. The first part has been accepted as an IJCNN 2021 paper and the second has been accepted as an ICDL 2021 paper Link: https://arxiv.org/abs/2106.13233 Abstract: This paper raises a rarely reported practice in Artificial Intelligence (AI) called Post Selection Using Test Sets (PSUTS). Consequently, the popular error-backprop methodology in deep learning lacks an acceptable generalization power. All AI methods fall into two broad schools, connectionist and symbolic. The PSUTS fall into two kinds, machine PSUTS and human PSUTS. The connectionist school received criticisms for its "scruffiness" due to a huge number of network parameters and now the worse machine PSUTS; but the seemingly "clean" symbolic school seems more brittle because of a weaker generalization power using human PSUTS. This paper formally defines what PSUTS is, analyzes why error-backprop methods with random initial weights suffer from severe local minima, why PSUTS violates well-established research ethics, and how every paper that used PSUTS should have at least transparently reported PSUTS. For improved transparency in future publications, this paper proposes a new standard for performance evaluation of AI, called developmental errors for all networks trained, along with Three Learning Conditions: (1) an incremental learning architecture, (2) a training experience and (3) a limited amount of computational resources. Developmental Networks avoid PSUTS and are not "scruffy" because they drive Emergent Turing Machines and are optimal in the sense of maximum-likelihood across lifetime.

【31】 Online Self-Attentive Gated RNNs for Real-Time Speaker Separation

Authors: Ori Kabeli, Yossi Adi, Zhenyu Tang, Buye Xu, Anurag Kumar Affiliations: Facebook AI Research, TLV, Israel; Facebook Reality Labs, Redmond, WA, USA; University of Maryland, College Park, MD, USA Link: https://arxiv.org/abs/2106.13493 Abstract: Deep neural networks have recently shown great success in the task of blind source separation, both under monaural and binaural settings. Although these methods were shown to produce high-quality separations, they were mainly applied under offline settings, in which the model has access to the full input signal while separating the signal. In this study, we convert a non-causal state-of-the-art separation model into a causal and real-time model and evaluate its performance under both online and offline settings. We compare the performance of the proposed model to several baseline methods under anechoic, noisy, and noisy-reverberant recording conditions while exploring both monaural and binaural inputs and outputs. Our findings shed light on the relative difference between causal and non-causal models when performing separation. Our stateful implementation for online separation leads to a minor drop in performance compared to the offline model: 0.8dB for monaural inputs and 0.3dB for binaural inputs, while reaching a real-time factor of 0.65. Samples can be found under the following link: https://kwanum.github.io/sagrnnc-stream-results/
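The real-time factor quoted in this abstract is conventionally the ratio of processing time to audio duration, so a value below 1 means the model runs faster than the audio arrives. The sketch below computes it; the timing numbers are invented for illustration, and the per-chunk check is an assumed simplification of what a streaming deployment would need, not the paper's evaluation protocol.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent processing / duration of the audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds

def keeps_up_with_stream(chunk_processing_seconds, chunk_seconds):
    """True if every chunk is processed in less time than it takes to record."""
    return all(real_time_factor(t, chunk_seconds) < 1.0
               for t in chunk_processing_seconds)

# Hypothetical measurement: 6.5 s of compute for 10 s of audio.
rtf = real_time_factor(6.5, 10.0)

# Hypothetical per-chunk timings for 0.1 s chunks.
ok = keeps_up_with_stream([0.06, 0.07, 0.065], chunk_seconds=0.1)
too_slow = keeps_up_with_stream([0.06, 0.12, 0.065], chunk_seconds=0.1)
```

An average RTF below 1 is necessary for streaming use, and the per-chunk check shows why a single slow chunk can still cause a dropout.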

Originally published: 2021-06-28. Shared from the WeChat official account arXiv每日学术速递.
