cs.AI (Artificial Intelligence): 29 papers in total
【1】 Time in a Box: Advancing Knowledge Graph Completion with Temporal Scopes Link: https://arxiv.org/abs/2111.06854
Authors: Ling Cai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Gengchen Mai Affiliations: UC Santa Barbara, Stanford University Abstract: Almost all statements in knowledge bases have a temporal scope during which they are valid. Hence, knowledge base completion (KBC) on temporal knowledge bases (TKBs), where each statement may be associated with a temporal scope, has attracted growing attention. Prior works assume that each statement in a TKB must be associated with a temporal scope. This ignores the fact that scoping information is commonly missing in a KB. Prior work is therefore typically incapable of handling generic use cases in which a TKB is composed of temporal statements with and without a known temporal scope. To address this issue, we establish a new knowledge base embedding framework, called TIME2BOX, that can deal with atemporal and temporal statements of different types simultaneously. Our main insight is that the answers to a temporal query always form a subset of the answers to its time-agnostic counterpart. Put differently, time is a filter that picks out the answers that are correct during a given period. We introduce boxes to represent the set of answer entities to a time-agnostic query. The filtering functionality of time is modeled by intersections over these boxes. In addition, we generalize current evaluation protocols for time interval prediction. We describe experiments on two datasets and show that the proposed method outperforms state-of-the-art (SOTA) methods on both link prediction and time prediction.
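The box-and-filter idea in the abstract above can be illustrated with a small sketch. It assumes plain axis-aligned boxes and a purely geometric intersection; the names `Box`, `intersect`, and the example coordinates are hypothetical, and the paper's actual (learned) intersection operator may well differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its lower and upper corners (hypothetical type)."""
    lower: Tuple[float, ...]
    upper: Tuple[float, ...]

    def contains(self, point: Tuple[float, ...]) -> bool:
        # A point is inside when every coordinate lies within the box bounds.
        return all(lo <= x <= hi for lo, x, hi in zip(self.lower, point, self.upper))


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Geometric intersection of two boxes, or None when they do not overlap."""
    lower = tuple(max(x, y) for x, y in zip(a.lower, b.lower))
    upper = tuple(min(x, y) for x, y in zip(a.upper, b.upper))
    if any(lo > hi for lo, hi in zip(lower, upper)):
        return None
    return Box(lower, upper)


# Box for a time-agnostic query, and a box acting as the time filter (made-up numbers).
time_agnostic = Box((0.0, 0.0), (4.0, 4.0))
time_filter = Box((2.0, 1.0), (6.0, 3.0))
filtered = intersect(time_agnostic, time_filter)

entity_a = (1.0, 1.0)  # answers the time-agnostic query only
entity_b = (3.0, 2.0)  # also valid in the given period

print(filtered)
print(time_agnostic.contains(entity_a), filtered.contains(entity_a))
print(filtered.contains(entity_b))
```

Because the intersection can only shrink a box, every entity inside `filtered` is also inside `time_agnostic`, which mirrors the subset property the abstract describes.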
【2】 Influential Papers in Artificial Intelligence and Paediatrics: Assessing RPYS by Experts Review Link: https://arxiv.org/abs/2111.06852
Authors: Peter Kokol, Jernej Završnik, Helena Blažun Vošner Affiliations: Community Healthcare Centre Dr. Adolf Drolc Maribor, Maribor, Slovenia; Alma Mater Europaea – ECM, Koper, Slovenia; Science and Research Center Koper, Koper, Slovenia; University of Maribor, Maribor, Slovenia; Slovenj Gradec, Slovenia Comments: 8 pages, one figure, one table Abstract: The use of artificial intelligence in paediatrics has increased vastly in the last few years. Interestingly, no historical bibliometric study analysing knowledge development in this specific paediatric field has been performed yet, so our study aimed to close this gap. Reference Publication Year Spectroscopy (RPYS), more precisely the CitedReferenceExplorer (CRE) software tool, was employed to achieve this aim. We identified 28 influential papers, and validation by domain experts showed that both the RPYS method and the CRE tool performed adequately in the identification process.
【3】 NRC-GAMMA: Introducing a Novel Large Gas Meter Image Dataset Link: https://arxiv.org/abs/2111.06827
Authors: Ashkan Ebadi, Patrick Paul, Sofia Auer, Stéphane Tremblay Affiliations: National Research Council Canada, Montreal, QC, Canada; National Research Council Canada, Ottawa, ON, Canada Comments: 12 pages, 7 figures, 1 table Abstract: Automatic meter reading technology is not yet widespread. Gas, electricity, and water accumulation meters are mostly read manually on-site, either by an operator or by the homeowner. In some countries, the operator takes a picture as proof of the reading, which is then confirmed offline by another operator and/or used as evidence in case of conflicts or complaints. The whole process is time-consuming, expensive, and prone to errors. Automation can optimize and facilitate such labor-intensive and human error-prone processes. With the recent advances in artificial intelligence and computer vision, automatic meter reading systems are becoming more viable than ever. Motivated by these advances and inspired by open-source, open-access initiatives in the research community, we introduce a novel large benchmark dataset of real-life gas meter images, named the NRC-GAMMA dataset. The data were collected from an Itron 400A diaphragm gas meter on January 20, 2020, between 12:05 am and 11:59 pm. We employed a systematic approach to label the images, validate the labels, and assure the quality of the annotations. The dataset contains 28,883 images of the entire gas meter along with 57,766 cropped images of the left and the right dial displays. We hope the NRC-GAMMA dataset helps the research community to design and implement accurate, innovative, intelligent, and reproducible automatic gas meter reading solutions.
【4】 Catastrophe, Compounding & Consistency in Choice Link: https://arxiv.org/abs/2111.06804
Authors: Chris Gagne, Peter Dayan Affiliations: MPI for Biological Cybernetics, Tübingen, Germany; University of Tübingen Abstract: Conditional value-at-risk (CVaR) precisely characterizes the influence that rare, catastrophic events can exert over decisions. Such characterizations are important both for normal decision-making and for psychiatric conditions such as anxiety disorders, especially for sequences of decisions that might ultimately lead to disaster. CVaR, like other well-founded risk measures, compounds in complex ways over such sequences, and we recently formalized three structurally different forms in which risk either averages out or multiplies. Unfortunately, existing cognitive tasks fail to discriminate these approaches well; here, we provide examples that highlight their unique characteristics, and make formal links to temporal discounting for the two approaches that are time consistent. These examples can ground future experiments with the broader aim of characterizing risk attitudes, especially for longer-horizon problems and in psychopathological populations.
【5】 Two steps to risk sensitivity Link: https://arxiv.org/abs/2111.06803
Authors: Chris Gagne, Peter Dayan Affiliations: MPI for Biological Cybernetics, Tübingen, Germany; University of Tübingen Abstract: Distributional reinforcement learning (RL), in which agents learn about all the possible long-term consequences of their actions and not just the expected value, is of great recent interest. One of the most important affordances of a distributional view is facilitating a modern, measured approach to risk when outcomes are not completely certain. By contrast, psychological and neuroscientific investigations into decision making under risk have utilized a variety of more venerable theoretical models, such as prospect theory, that lack axiomatically desirable properties such as coherence. Here, we consider a particularly relevant risk measure for modeling human and animal planning, called conditional value-at-risk (CVaR), which quantifies worst-case outcomes (e.g., vehicle accidents or predation). We first adopt a conventional distributional approach to CVaR in a sequential setting and reanalyze the choices of human decision-makers in the well-known two-step task, revealing substantial risk aversion that had been lurking under stickiness and perseveration. We then consider a further critical property of risk sensitivity, namely time consistency, showing alternatives to this form of CVaR that enjoy this desirable characteristic. We use simulations to examine settings in which the various forms differ in ways that have implications for human and animal planning and behavior.
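To make the CVaR risk measure mentioned in the abstract above concrete, here is a minimal sketch of a simple empirical estimator for a single, one-shot decision: the mean of the worst alpha-fraction of equally likely outcomes. The function name `cvar` and the sample outcomes are illustrative assumptions; the sequential and time-consistent variants the paper studies are not modeled here.

```python
import math


def cvar(outcomes, alpha):
    """Empirical CVaR: average of the worst ceil(alpha * n) equally likely outcomes.

    Lower outcomes are treated as worse. alpha = 1.0 recovers the plain mean;
    smaller alpha focuses on an ever smaller, more catastrophic tail.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(outcomes)
    k = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / k


# Nine ordinary outcomes and one rare catastrophe (made-up numbers).
outcomes = [10, 10, 10, 10, 10, 10, 10, 10, 10, -100]

print(cvar(outcomes, 1.0))  # risk-neutral mean
print(cvar(outcomes, 0.5))  # worst half of the outcomes
print(cvar(outcomes, 0.1))  # only the catastrophe
```

The rare event barely moves the mean but dominates the estimate as alpha shrinks, which is the sense in which CVaR quantifies worst-case outcomes.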
【6】 AWD3: Dynamic Reduction of the Estimation Bias Link: https://arxiv.org/abs/2111.06780
Authors: Dogan C. Cicek, Enes Duran, Baturay Saglam, Kagan Kaya, Furkan B. Mutlu, Suleyman S. Kozat Affiliations: Electrical and Electronics Engineering Department, Bilkent University, Ankara, Turkey Comments: Accepted at The 33rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2021) Abstract: Value-based deep reinforcement learning (RL) algorithms suffer from estimation bias caused primarily by function approximation and temporal difference (TD) learning. This problem induces faulty state-action value estimates and therefore harms the performance and robustness of the learning algorithms. Although several techniques have been proposed to tackle it, learning algorithms still suffer from this bias. Here, we introduce a technique that eliminates the estimation bias in off-policy continuous control algorithms using the experience replay mechanism. We adaptively learn the weighting hyper-parameter beta in the Weighted Twin Delayed Deep Deterministic Policy Gradient algorithm. Our method is named Adaptive-WD3 (AWD3). We show through continuous control environments of OpenAI gym that our algorithm matches or outperforms state-of-the-art off-policy policy gradient learning algorithms.
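The role of the weighting hyper-parameter beta mentioned in the abstract above can be sketched as follows. This assumes that the weighted twin-critic target mixes the minimum of the two critic estimates (pessimistic) with their average (more optimistic); the abstract gives neither the exact formula nor the rule by which AWD3 adapts beta, so `wd3_target` and its parameters are hypothetical.

```python
def wd3_target(reward, q1_next, q2_next, beta, gamma=0.99, done=False):
    """TD target built from two critics, weighted by beta (assumed form).

    beta = 1.0 uses only min(Q1, Q2), which tends to underestimate;
    beta = 0.0 uses only the average, which tends to overestimate.
    """
    pessimistic = min(q1_next, q2_next)
    average = 0.5 * (q1_next + q2_next)
    bootstrap = beta * pessimistic + (1.0 - beta) * average
    # No bootstrapping from terminal states.
    return reward + (0.0 if done else gamma * bootstrap)


# Made-up critic estimates for one transition.
for beta in (0.0, 0.5, 1.0):
    print(beta, wd3_target(reward=1.0, q1_next=10.0, q2_next=12.0, beta=beta, gamma=0.9))
```

Under this assumed form, moving beta between 0 and 1 slides the target between the two biased estimates, which is why learning beta adaptively could reduce the net bias.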
【7】 Explainability and the Fourth AI Revolution Link: https://arxiv.org/abs/2111.06773
Authors: Loizos Michael Affiliations: Open University of Cyprus & CYENS Center of Excellence Comments: To be published by Edward Elgar Publishing; the volume title is truncated in the source and ends "... INNOVATION AND ENTREPRENEURSHIP" Abstract: This chapter discusses AI through the prism of an automated process for the organization of data, and exemplifies the role that explainability has to play in moving from the current generation of AI systems to the next one, where the role of humans is lifted from that of data annotators working for the AI systems to that of collaborators working with the AI systems.
【8】 Multiway Storage Modification Machines Link: https://arxiv.org/abs/2111.06757
Authors: J.-M. Chauvet Comments: 15 pages, 6 figures Abstract: We present a parallel version of Schönhage's Storage Modification Machine, the Multiway Storage Modification Machine (MWSMM). Like the alternative Association Storage Modification Machine of Tromp and van Emde Boas, MWSMMs recognize in polynomial time what Turing Machines recognize in polynomial space. Falling thus into the Second Machine Class, the MWSMM is a parallel machine model conforming to the Parallel Computation Thesis. We illustrate MWSMMs by a simple implementation of Wolfram's String Substitution System.
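As a rough illustration of the string substitution system mentioned at the end of the abstract above, the sketch below performs one multiway step: every rule is applied at every matching position of every current string, and all resulting strings are kept. This is an ordinary sequential Python sketch with hypothetical names (`multiway_step`, the example rules); it is not the MWSMM implementation from the paper.

```python
def multiway_step(states, rules):
    """Apply each rewrite rule at each matching position of each string.

    `states` is a set of strings and `rules` a list of (lhs, rhs) pairs with
    non-empty lhs. Returns the set of all one-step successors.
    """
    successors = set()
    for s in states:
        for lhs, rhs in rules:
            start = s.find(lhs)
            while start != -1:
                successors.add(s[:start] + rhs + s[start + len(lhs):])
                start = s.find(lhs, start + 1)  # also catch overlapping matches
    return successors


# Example rule set chosen for illustration only.
rules = [("A", "AB"), ("B", "A")]
step1 = multiway_step({"AB"}, rules)
step2 = multiway_step(step1, rules)
print(sorted(step1))
print(sorted(step2))
```

Each step branches over all possible rewrites at once, which is the kind of parallelism a multiway machine model is meant to capture.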
【9】 STFL: A Temporal-Spatial Federated Learning Framework for Graph Neural Networks Link: https://arxiv.org/abs/2111.06750
Authors: Guannan Lou, Yuze Liu, Tiehua Zhang, Xi Zheng Affiliations: Macquarie University, Ant Group Abstract: We present a spatial-temporal federated learning framework for graph neural networks, namely STFL. The framework explores the underlying correlation of the input spatial-temporal data and transforms it into both node features and an adjacency matrix. The federated learning setting in the framework ensures data privacy while achieving good model generalization. Experimental results on the sleep stage dataset ISRUC_S3 illustrate the effectiveness of STFL on graph prediction tasks.
【10】 One model Packs Thousands of Items with Recurrent Conditional Query Learning Link: https://arxiv.org/abs/2111.06726
Authors: Dongda Li, Zhaoquan Gu, Yuexuan Wang, Changwei Ren, Francis C. M. Lau Affiliations: Guangzhou University, Zhejiang University, The University of Hong Kong Abstract: Recent studies have revealed that neural combinatorial optimization (NCO) has advantages over conventional algorithms in many combinatorial optimization problems such as routing, but it is less efficient for more complicated optimization tasks such as packing, which involves mutually conditioned action spaces. In this paper, we propose a Recurrent Conditional Query Learning (RCQL) method to solve both 2D and 3D packing problems. We first embed states with a recurrent encoder, and then adopt attention with conditional queries from previous actions. The conditional query mechanism fills the information gap between learning steps, which shapes the problem as a Markov decision process. Benefiting from the recurrence, a single RCQL model is capable of handling packing problems of different sizes. Experimental results show that RCQL can effectively learn strong heuristics for offline and online strip packing problems (SPPs), outperforming a wide range of baselines in space utilization ratio. RCQL reduces the average bin gap ratio by 1.83% in offline 2D 40-box cases and by 7.84% in 3D cases compared with state-of-the-art methods. Meanwhile, our method also achieves a 5.64% higher space utilization ratio for SPPs with 1000 items than the state of the art.
【11】 Causal Multi-Agent Reinforcement Learning: Review and Open Problems Link: https://arxiv.org/abs/2111.06721
Authors: St John Grimbly, Jonathan Shock, Arnu Pretorius Affiliations: University of Cape Town, InstaDeep Comments: Accepted at CoopAI NeurIPS Workshop 2021 Abstract: This paper serves to introduce the reader to the field of multi-agent reinforcement learning (MARL) and its intersection with methods from the study of causality. We highlight key challenges in MARL and discuss these in the context of how causal methods may assist in tackling them. We promote moving toward a 'causality first' perspective on MARL. Specifically, we argue that causality can offer improved safety, interpretability, and robustness, while also providing strong theoretical guarantees for emergent behaviour. We discuss potential solutions for common challenges, and use this context to motivate future research directions.
【12】 deepstruct -- linking deep learning and graph theory Link: https://arxiv.org/abs/2111.06679
Authors: Julian Stier, Michael Granitzer Affiliations: Chair of Data Science, University of Passau Abstract: deepstruct connects deep learning models and graph theory such that different graph structures can be imposed on neural networks or graph structures can be extracted from trained neural network models. For this, deepstruct provides deep neural network models with different restrictions which can be created based on an initial graph. Further, tools to extract graph structures from trained models are available. This step of extracting graphs can be computationally expensive even for models with just a few tens of thousands of parameters and poses a challenging problem. deepstruct supports research in pruning, neural architecture search, automated network design, and structure analysis of neural networks.
【13】 Attention Guided Cosine Margin For Overcoming Class-Imbalance in Few-Shot Road Object Detection Link: https://arxiv.org/abs/2111.06639
Authors: Ashutosh Agarwal, Anay Majee, Anbumani Subramanian, Chetan Arora Affiliations: IIT Delhi, Intel Corporation Comments: 8 pages, 4 figures Abstract: Few-shot object detection (FSOD) localizes and classifies objects in an image given only a few data samples. Recent trends in FSOD research show the adoption of metric and meta-learning techniques, which are prone to catastrophic forgetting and class confusion. To overcome these pitfalls in metric-learning-based FSOD techniques, we introduce Attention Guided Cosine Margin (AGCM), which facilitates the creation of tighter and well-separated class-specific feature clusters in the classification head of the object detector. Our novel Attentive Proposal Fusion (APF) module minimizes catastrophic forgetting by reducing the intra-class variance among co-occurring classes. At the same time, the proposed Cosine Margin Cross-Entropy loss increases the angular margin between confusing classes to overcome the challenge of class confusion between already learned (base) and newly added (novel) classes. We conduct our experiments on the challenging India Driving Dataset (IDD), which presents a real-world class-imbalanced setting, alongside the popular FSOD benchmark PASCAL-VOC. Our method outperforms state-of-the-art (SoTA) approaches by up to 6.4 mAP points on the IDD-OS split and by up to 2.0 mAP points on the IDD-10 split for the 10-shot setting. On the PASCAL-VOC dataset, we outperform existing SoTA approaches by up to 4.9 mAP points.
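The effect of a cosine margin on a cross-entropy loss, as described in the abstract above, can be sketched in a few lines. This assumes a generic large-margin cosine formulation (subtract a fixed margin from the target-class cosine, then scale and apply softmax cross-entropy); the paper's exact loss and its attention-guided components are not given in the abstract, and `cosine_margin_ce`, `margin`, and `scale` are hypothetical names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def cosine_margin_ce(feature, class_weights, target, margin=0.2, scale=10.0):
    """Cross-entropy over scaled cosine logits with a margin on the target class."""
    logits = []
    for idx, weight in enumerate(class_weights):
        cos = cosine(feature, weight)
        if idx == target:
            cos -= margin  # the target class must win by at least the margin
        logits.append(scale * cos)
    peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


feature = [1.0, 0.0]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
without_margin = cosine_margin_ce(feature, class_weights, target=0, margin=0.0)
with_margin = cosine_margin_ce(feature, class_weights, target=0, margin=0.2)
print(without_margin, with_margin)
```

Even a perfectly aligned feature incurs a larger loss once the margin is applied, so training keeps pushing class clusters further apart in angle.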
【14】 Meta-Teacher For Face Anti-Spoofing Link: https://arxiv.org/abs/2111.06638
Authors: Yunxiao Qin, Zitong Yu, Longbin Yan, Zezheng Wang, Chenxu Zhao, Zhen Lei Comments: Accepted by IEEE TPAMI-2021 Abstract: Face anti-spoofing (FAS) secures face recognition from presentation attacks (PAs). Existing FAS methods usually supervise PA detectors with handcrafted binary or pixel-wise labels. However, handcrafted labels may not be the most adequate way to supervise PA detectors in learning sufficient and intrinsic spoofing cues. Instead of using handcrafted labels, we propose a novel Meta-Teacher FAS (MT-FAS) method that trains a meta-teacher to supervise PA detectors more effectively. The meta-teacher is trained in a bi-level optimization manner to learn the ability to supervise PA detectors in learning rich spoofing cues. The bi-level optimization contains two key components: 1) a lower-level training in which the meta-teacher supervises the detector's learning process on the training set; and 2) a higher-level training in which the meta-teacher's teaching performance is optimized by minimizing the detector's validation loss. Our meta-teacher differs significantly from existing teacher-student models because the meta-teacher is explicitly trained to teach the detector (student) better, whereas existing teachers are trained for outstanding accuracy, neglecting teaching ability. Extensive experiments on five FAS benchmarks show that with the proposed MT-FAS, the trained meta-teacher 1) provides better-suited supervision than both handcrafted labels and existing teacher-student models; and 2) significantly improves the performance of PA detectors.
【15】 A Convolutional Neural Network Based Approach to Recognize Bangla Spoken Digits from Speech Signal Link: https://arxiv.org/abs/2111.06625
Authors: Ovishake Sen, Al-Mahmud, Pias Roy Affiliations: Computer Science and Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh Comments: 4 pages, 5 figures, 2021 International Conference on Electronics, Communications and Information Technology (ICECIT), 14 to 16 September 2021, Khulna, Bangladesh Abstract: Speech recognition is a technique that converts human speech signals into text or words, or into any form that can be easily understood by computers or other machines. There have been a few studies on Bangla digit recognition systems, the majority of which used small datasets with little variation in gender, age, dialect, and other variables. In this study, audio recordings of Bangladeshi people of various genders, ages, and dialects were used to create a large speech dataset of spoken '0-9' Bangla digits. To create the dataset, 400 noisy and noise-free samples per digit were recorded. Mel Frequency Cepstrum Coefficients (MFCCs) were used to extract meaningful features from the raw speech data. Convolutional Neural Networks (CNNs) were then used to detect Bangla numeral digits. The suggested technique recognizes '0-9' Bangla spoken digits with 97.1% accuracy across the whole dataset. The efficiency of the model was also assessed using 10-fold cross-validation, which yielded 96.7% accuracy.
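The 10-fold cross-validation protocol mentioned in the abstract above can be sketched as an index split. The dataset size of 4000 used below is an illustrative assumption (one reading of "400 samples per digit" across ten digits), and the abstract does not say whether the folds were shuffled or stratified; `k_fold_indices` is a hypothetical helper, not the authors' code.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs; each sample
    appears in exactly one test fold.
    """
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(4000, 10)  # assumed dataset size, for illustration
for train, test in folds[:2]:
    print(len(train), len(test), test[0], test[-1])
```

The reported cross-validation accuracy would then be the average of the accuracies measured on the ten held-out folds. In practice the indices would normally be shuffled (and possibly stratified by digit) before splitting.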
【16】 PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages Link: https://arxiv.org/abs/2111.06599
Authors: Mohsin Ali, Kandukuri Sai Teja, Sumanth Manduru, Parth Patwa, Amitava Das Affiliations: IIIT Sri City, India; UCLA, USA; Wipro AI Labs, India; AI Institute, University of South Carolina, USA Comments: Accepted as Student Abstract at AAAI 2022 Abstract: NLP applications for code-mixed (CM) or mix-lingual text have gained significant momentum recently, the main reason being the prevalence of language mixing in social media communications in multilingual societies such as India, Mexico, Europe, and parts of the USA. Word embeddings are basic building blocks of any NLP system today, yet word embedding for CM languages is unexplored territory. The major bottleneck for CM word embeddings is switching points, where the language switches. These locations lack context, and statistical systems fail to model this phenomenon due to high variance in the seen examples. In this paper we present our initial observations on applying switching-point-based positional encoding techniques to CM language, specifically Hinglish (Hindi-English). Results are only marginally better than SOTA, but it is evident that positional encoding could be an effective way to train position-sensitive language models for CM text.
【17】 On-the-Fly Rectification for Robust Large-Vocabulary Topic Inference Link: https://arxiv.org/abs/2111.06580
Authors: Moontae Lee, Sungjun Cho, Kun Dong, David Mimno, David Bindel Affiliations: Cornell University Abstract: Across many data domains, co-occurrence statistics about the joint appearance of objects are powerfully informative. By transforming unsupervised learning problems into decompositions of co-occurrence statistics, spectral algorithms provide transparent and efficient algorithms for posterior inference such as latent topic analysis and community detection. As object vocabularies grow, however, it becomes rapidly more expensive to store and run inference algorithms on co-occurrence statistics. Rectifying co-occurrence, the key process to uphold model assumptions, becomes increasingly vital in the presence of rare terms, but current techniques cannot scale to large vocabularies. We propose novel methods that simultaneously compress and rectify co-occurrence statistics, scaling gracefully with the size of the vocabulary and the dimension of the latent space. We also present new algorithms for learning latent variables from the compressed statistics, and verify that our methods perform comparably to previous approaches on both textual and non-textual data.
【18】 Self-supervised GAN Detector Link: https://arxiv.org/abs/2111.06575
Authors: Yonghyun Jeong, Doyeon Kim, Pyounggeon Kim, Youngmin Ro, Jongwon Choi Affiliations: Samsung SDS, Chung-Ang University Abstract: Although the recent advancement in generative models brings diverse advantages to society, it can also be abused for malicious purposes, such as fraud, defamation, and fake news. To prevent such cases, vigorous research is being conducted to distinguish generated images from real images, but distinguishing unseen generated images outside of the training settings remains challenging. Such limitations arise from data dependency caused by the model overfitting to training data generated by specific GANs. To overcome this issue, we adopt a self-supervised scheme and propose a novel framework. Our proposed method is composed of an artificial fingerprint generator, which reconstructs high-quality artificial fingerprints of GAN images for detailed analysis, and a GAN detector, which distinguishes GAN images by learning the reconstructed artificial fingerprints. To improve the generalization of the artificial fingerprint generator, we build multiple autoencoders with different numbers of upconvolution layers. Across numerous ablation studies, the robust generalization of our method is validated by outperforming previous state-of-the-art algorithms, even without utilizing the GAN images of the training dataset.
【19】 Nonlinear Tensor Ring Network Link: https://arxiv.org/abs/2111.06532
Authors: Xiao Peng Li, Qi Liu, Hing Cheung So Abstract: State-of-the-art deep neural networks (DNNs) have been widely applied in various real-world applications and have achieved significant performance on cognitive problems. However, the increase in DNN width and depth results in a huge number of parameters, which strains storage and memory and limits the use of DNNs on resource-constrained platforms such as portable devices. By converting redundant models into compact ones, compression appears to be a practical solution for reducing storage and memory consumption. In this paper, we develop a nonlinear tensor ring network (NTRN) in which both fully connected and convolutional layers are compressed via tensor ring decomposition. Furthermore, to mitigate the accuracy loss caused by compression, a nonlinear activation function is embedded into the tensor contraction and convolution operations inside the compressed layer. Experimental results demonstrate the effectiveness and superiority of the proposed NTRN for image classification using two basic neural networks, LeNet-5 and VGG-11, on three datasets, viz. MNIST, Fashion MNIST, and CIFAR-10.
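To give a feel for why tensor ring decomposition compresses a layer, as described in the abstract above, the sketch below only counts parameters. It assumes the usual tensor ring layout in which a fully connected weight is reshaped into a higher-order tensor and each mode of size n gets one core of shape rank x n x rank, with a single uniform rank; the mode factorization and rank below are made-up examples, not figures from the paper, and the nonlinear activations NTRN inserts are not modeled.

```python
import math


def dense_params(in_modes, out_modes):
    """Parameter count of an ordinary fully connected weight matrix."""
    return math.prod(in_modes) * math.prod(out_modes)


def tensor_ring_params(in_modes, out_modes, rank):
    """Parameter count with one rank x n x rank core per mode (uniform rank assumed)."""
    return rank * rank * (sum(in_modes) + sum(out_modes))


# A 784 -> 256 layer, with 784 = 4*7*4*7 and 256 = 4*4*4*4 (illustrative factorization).
in_modes, out_modes = (4, 7, 4, 7), (4, 4, 4, 4)
dense = dense_params(in_modes, out_modes)
compressed = tensor_ring_params(in_modes, out_modes, rank=5)
print(dense, compressed, dense / compressed)
```

The dense count grows with the product of the mode sizes while the tensor ring count grows with their sum, which is where the savings come from; the price is the accuracy loss that the paper's nonlinear activations aim to mitigate.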
【20】 DPLL(MAPF): an Integration of Multi-Agent Path Finding and SAT Solving Technologies Link: https://arxiv.org/abs/2111.06494
Authors: Martin Čapek, Pavel Surynek Affiliations: Czech Technical University in Prague, Thákurova, Praha, Czechia Abstract: In multi-agent path finding (MAPF), the task is to find non-conflicting paths for multiple agents from their initial positions to given individual goal positions. MAPF represents a classical artificial intelligence problem often addressed by heuristic search. An important alternative to search-based techniques is compilation of MAPF to a different formalism such as Boolean satisfiability (SAT). Contemporary SAT-based approaches to MAPF regard the SAT solver as an external tool whose task is to return an assignment of all decision variables of a Boolean model of the input MAPF. In this short paper, we present a novel compilation scheme called DPLL(MAPF), in which the consistency checking of partial assignments of decision variables with respect to the MAPF rules is integrated directly into the SAT solver. This scheme allows for far more automated compilation, where the SAT solver and the consistency checking procedure work together simultaneously to create the Boolean model and to search for its satisfying assignment.
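What "non-conflicting paths" means in the abstract above can be made concrete with a small checker. It assumes two conflict rules that are common in the MAPF literature (no two agents in the same cell at the same time step, and no two agents swapping cells along an edge) and that agents wait at their goal after arriving; the abstract does not list the exact rules, and this checks complete paths only, unlike the partial-assignment consistency checking inside the SAT solver that the paper proposes. `find_conflict` is a hypothetical helper.

```python
def find_conflict(paths):
    """Return the first conflict as (kind, agent_i, agent_j, time), or None.

    Each path is a list of grid cells, one per time step.
    """
    horizon = max(len(p) for p in paths)

    def at(path, t):
        # Agents are assumed to stay at their final cell once they arrive.
        return path[min(t, len(path) - 1)]

    for t in range(horizon):
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                if at(paths[i], t) == at(paths[j], t):
                    return ("vertex", i, j, t)
                swapped = (at(paths[i], t) == at(paths[j], t + 1)
                           and at(paths[i], t + 1) == at(paths[j], t))
                if t + 1 < horizon and swapped:
                    return ("swap", i, j, t)
    return None


meet = [[(0, 0), (0, 1), (0, 2)], [(1, 1), (0, 1), (1, 1)]]
swap = [[(0, 0), (0, 1)], [(0, 1), (0, 0)]]
safe = [[(0, 0), (0, 1)], [(1, 0), (1, 1)]]
print(find_conflict(meet), find_conflict(swap), find_conflict(safe))
```

A SAT encoding of MAPF has to rule out exactly these situations with clauses; checking them during search, rather than encoding all of them up front, is the kind of integration the abstract describes.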
【21】 Variational Auto-Encoder Architectures that Excel at Causal Inference Link: https://arxiv.org/abs/2111.06486
Authors: Negar Hassanpour, Russell Greiner Affiliations: Department of Computing Science, University of Alberta; Amii, Edmonton, Canada Abstract: Estimating causal effects from observational data (at either the individual or the population level) is critical for making many types of decisions. One approach to this task is to learn decomposed representations of the underlying factors of the data; this becomes significantly more challenging when there are confounding factors (which influence both the cause and the effect). In this paper, we take a generative approach that builds on recent advances in Variational Auto-Encoders to simultaneously learn those underlying factors as well as the causal effects. We propose a progressive sequence of models, where each improves over the previous one, culminating in the Hybrid model. Our empirical results demonstrate that the performance of all three proposed models is superior to both state-of-the-art discriminative approaches and other generative approaches in the literature.
【22】 Sequential Aggregation and Rematerialization: Distributed Full-batch Training of Graph Neural Networks on Large Graphs Link: https://arxiv.org/abs/2111.06483
Author: Hesham Mostafa Abstract: We present the Sequential Aggregation and Rematerialization (SAR) scheme for distributed full-batch training of Graph Neural Networks (GNNs) on large graphs. Large-scale training of GNNs has recently been dominated by sampling-based methods and methods based on non-learnable message passing. SAR, on the other hand, is a distributed technique that can train any GNN type directly on an entire large graph. The key innovation in SAR is the distributed sequential rematerialization scheme, which sequentially reconstructs and then frees pieces of the prohibitively large GNN computational graph during the backward pass. This results in excellent memory scaling behavior, where the memory consumption per worker goes down linearly with the number of workers, even for densely connected graphs. Using SAR, we report the largest applications of full-batch GNN training to date, and demonstrate large memory savings as the number of workers increases. We also present a general technique based on kernel fusion and attention-matrix rematerialization to optimize both the runtime and memory efficiency of attention-based models. We show that, coupled with SAR, our optimized attention kernels lead to significant speedups and memory savings in attention-based GNNs.
【23】 SynthBio: A Case Study in Human-AI Collaborative Curation of Text Datasets Link: https://arxiv.org/abs/2111.06467
Authors: Ann Yuan, Daphne Ippolito, Vitaly Nikolaev, Chris Callison-Burch, Andy Coenen, Sebastian Gehrmann Affiliations: Google Research, University of Pennsylvania Comments: 10 pages, 2 figures, accepted to NeurIPS 2021 Datasets and Benchmarks Track Abstract: NLP researchers need more, higher-quality text datasets. Human-labeled datasets are expensive to collect, while datasets collected via automatic retrieval from the web, such as WikiBio, are noisy and can include undesired biases. Moreover, data sourced from the web is often included in datasets used to pretrain models, leading to inadvertent cross-contamination of training and test sets. In this work we introduce a novel method for efficient dataset curation: we use a large language model to provide seed generations to human raters, thereby changing dataset authoring from a writing task to an editing task. We use our method to curate SynthBio, a new evaluation set for WikiBio composed of structured attribute lists describing fictional individuals, mapped to natural language biographies. We show that our dataset of fictional biographies is less noisy than WikiBio, and also more balanced with respect to gender and nationality.
【24】 Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication Link: https://arxiv.org/abs/2111.06464
Authors: Łukasz Kuciński, Tomasz Korbak, Paweł Kołodziej, Piotr Miłoś Affiliations: University of Sussex, Polish Academy of Sciences, University of Oxford, deepsense.ai Comments: NeurIPS 2021 Abstract: Communication is compositional if complex signals can be represented as a combination of simpler subparts. In this paper, we theoretically show that inductive biases on both the training framework and the data are needed to develop compositional communication. Moreover, we prove that compositionality spontaneously arises in signaling games where agents communicate over a noisy channel. We experimentally confirm that a range of noise levels, which depends on the model and the data, indeed promotes compositionality. Finally, we provide a comprehensive study of this dependence and report results in terms of recently studied compositionality metrics: topographical similarity, conflict count, and context independence.
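Of the metrics named in the abstract above, topographical similarity is the simplest to illustrate. The abstract does not define it, so the sketch below follows the definition commonly used in the emergent-communication literature: the Spearman rank correlation between pairwise distances in meaning space and pairwise distances in message space. Using Hamming distance for both spaces is a simplifying assumption (edit distance is often used for messages), and the function names and toy data are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def hamming(x: Sequence, y: Sequence) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(x, y))


def average_ranks(values: List[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def topographic_similarity(pairs: List[Tuple[Sequence, Sequence]]) -> float:
    """Spearman correlation between meaning distances and message distances."""
    meaning_d, message_d = [], []
    for (m1, s1), (m2, s2) in combinations(pairs, 2):
        meaning_d.append(hamming(m1, m2))
        message_d.append(hamming(s1, s2))
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(average_ranks(meaning_d), average_ranks(message_d))


# Toy language in which each attribute maps to exactly one symbol.
compositional = [
    (("red", "circle"), "ac"),
    (("red", "square"), "ad"),
    (("blue", "circle"), "bc"),
    (("blue", "square"), "bd"),
]
print(topographic_similarity(compositional))
```

A language in which similar meanings receive similar messages scores close to 1, while a mapping that scrambles this structure scores lower.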
【25】 Expert Human-Level Driving in Gran Turismo Sport Using Deep Reinforcement Learning with Image-based Representation Link: https://arxiv.org/abs/2111.06449
Authors: Ryuji Imamura, Takuma Seno, Kenta Kawamoto, Michael Spranger Affiliation: Tokyo, Sony AI Inc. Comments: Accepted at Deep Reinforcement Learning Workshop at Neural Information Processing Systems 2021 Abstract: When humans play virtual racing games, they use visual environmental information on the game screen to understand the rules within the environments. In contrast, a state-of-the-art realistic racing game AI agent that outperforms human players does not use image-based environmental information but the compact and precise measurements provided by the environment. In this paper, a vision-based control algorithm is proposed and compared with the performance of human players under the same conditions in realistic racing scenarios using Gran Turismo Sport (GTS), which is known as a high-fidelity realistic racing simulator. In the proposed method, the environmental information that constitutes part of the observations in conventional state-of-the-art methods is replaced with feature representations extracted from game screen images. We demonstrate that the proposed method performs expert human-level vehicle control under high-speed driving scenarios even with game screen images as high-dimensional inputs. Additionally, it outperforms the built-in AI in GTS in a time trial task, and its score places it among the top 10% of approximately 28,000 human players.
【26】 Observation Error Covariance Specification in Dynamical Systems for Data assimilation using Recurrent Neural Networks Link: https://arxiv.org/abs/2111.06447
Authors: Sibo Cheng, Mingming Qiu Affiliations: Data Science Institute, Department of Computing, Imperial College, London, UK, Institut Polytechnique de Paris, France, EDF R&D, France Comments: The manuscript is accepted for publication in Neural Computing and Applications Abstract: Data assimilation techniques are widely used to predict complex dynamical systems with uncertainties, based on time-series observation data. Modelling the error covariance matrices is an important element of data assimilation algorithms and can considerably impact the forecasting accuracy. The estimation of these covariances, which usually relies on empirical assumptions and physical constraints, is often imprecise and computationally expensive, especially for systems of large dimension. In this work, we propose a data-driven approach based on long short-term memory (LSTM) recurrent neural networks (RNN) to improve both the accuracy and the efficiency of observation covariance specification in data assimilation for dynamical systems. Because it learns the covariance matrix from observed/simulated time-series data, the proposed approach does not require any knowledge or assumption about the prior error distribution, unlike classical posterior tuning methods. We have compared the novel approach with two state-of-the-art covariance tuning algorithms, namely DI01 and D05, first in a Lorenz dynamical system and then in a 2D shallow water twin experiments framework with different covariance parameterizations using ensemble assimilation. This novel method shows significant advantages in observation covariance specification, assimilation accuracy and computational efficiency.
【27】 Personalized multi-faceted trust modeling to determine trust links in social media and its potential for misinformation management Link: https://arxiv.org/abs/2111.06440
Authors: Alexandre Parmentier, Robin Cohen, Xueguang Ma, Gaurav Sahu, Queenie Chen Affiliation: Cheriton School of Computer Science, University of Waterloo Comments: 28 pages Abstract: In this paper, we present an approach for predicting trust links between peers in social media, one that is grounded in the artificial intelligence area of multiagent trust modeling. In particular, we propose a data-driven multi-faceted trust model that incorporates many distinct features for a comprehensive analysis. We focus on demonstrating how clustering of similar users enables a critical new functionality: supporting more personalized, and thus more accurate, predictions for users. Using a trust-aware item recommendation task as illustration, we evaluate the proposed framework in the context of a large Yelp dataset. We then discuss how improving the detection of trusted relationships in social media can assist in supporting online users in their battle against the spread of misinformation and rumours, within a social networking environment which has recently exploded in popularity. We conclude with a reflection on a particularly vulnerable user base, older adults, in order to illustrate the value of reasoning about groups of users, and look to some future directions for integrating known preferences with insights gained through data analysis.
【28】 Explainable AI (XAI): A Systematic Meta-Survey of Current Challenges and Future Opportunities Link: https://arxiv.org/abs/2111.06420
Authors: Waddah Saeed, Christian Omlin Affiliation: Center for Artificial Intelligence Research, University of Agder, Grimstad, Norway Comments: 29 pages, 2 figures, 4 tables Abstract: The past decade has seen significant progress in artificial intelligence (AI), which has resulted in algorithms being adopted for resolving a variety of problems. However, this success has been met by increasing model complexity and employing black-box AI models that lack transparency. In response to this need, Explainable AI (XAI) has been proposed to make AI more transparent and thus advance the adoption of AI in critical domains. Although there are several reviews of XAI topics in the literature that identified challenges and potential research directions in XAI, these challenges and research directions are scattered. This study, hence, presents a systematic meta-survey of challenges and future research directions in XAI, organized in two themes: (1) general challenges and research directions in XAI and (2) challenges and research directions in XAI based on the phases of the machine learning life cycle: design, development, and deployment. We believe that our meta-survey contributes to the XAI literature by providing a guide for future exploration in the XAI area.
【29】 A Quantum Natural Language Processing Approach to Musical Intelligence Link: https://arxiv.org/abs/2111.06741
Authors: Eduardo Reck Miranda, Richie Yeung, Anna Pearson, Konstantinos Meichanetzidis, Bob Coecke Affiliations: ICCMR, University of Plymouth, Plymouth, UK, Cambridge Quantum, Oxford, UK Comments: Pre-publication draft of a chapter to appear in Quantum Computer Music, E. R. Miranda (Ed.) Abstract: There has been tremendous progress in Artificial Intelligence (AI) for music, in particular for musical composition and access to large databases for commercialisation through the Internet. We are interested in further advancing this field, focusing on composition. In contrast to current black-box AI methods, we are championing an interpretable compositional outlook on generative music systems. In particular, we are importing methods from the Distributional Compositional Categorical (DisCoCat) modelling framework for Natural Language Processing (NLP), motivated by musical grammars. Quantum computing is a nascent technology, which is very likely to impact the music industry in time to come. Thus, we are pioneering a Quantum Natural Language Processing (QNLP) approach to develop a new generation of intelligent musical systems. This work follows from previous experimental implementations of DisCoCat linguistic models on quantum hardware. In this chapter, we present Quanthoven, the first proof-of-concept ever built, which (a) demonstrates that it is possible to program a quantum computer to learn to classify music that conveys different meanings and (b) illustrates how such a capability might be leveraged to develop a system to compose meaningful pieces of music. After a discussion about our current understanding of music as a communication medium and its relationship to natural language, the chapter focuses on the techniques developed to (a) encode musical compositions as quantum circuits, and (b) design a quantum classifier. The chapter ends with demonstrations of compositions created with the system.