Visit www.arxivdaily.com for daily digests with abstracts, covering CS, Physics, Math, Economics, Statistics, Finance, Biology, and Electrical Engineering, plus search, bookmarking, and posting features!
stat (Statistics): 74 papers in total
【1】 Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits
Authors: Wenshuo Guo, Kumar Krishna Agrawal, Aditya Grover, Vidya Muthukumar, Ashwin Pananjady
Affiliations: ⋄Department of Electrical Engineering and Computer Sciences, University of California, Berkeley; †Facebook AI Research; ‡School of Electrical & Computer Engineering and School of Industrial & Systems Engineering, Georgia Institute of Technology
Link: https://arxiv.org/abs/2106.14866
Abstract: We introduce the "inverse bandit" problem of estimating the rewards of a multi-armed bandit instance from observing the learning process of a low-regret demonstrator. Existing approaches to the related problem of inverse reinforcement learning assume the execution of an optimal policy, and thereby suffer from an identifiability issue. In contrast, our paradigm leverages the demonstrator's behavior en route to optimality, and in particular, the exploration phase, to obtain consistent reward estimates. We develop simple and efficient reward estimation procedures for demonstrations within a class of upper-confidence-based algorithms, showing that reward estimation gets progressively easier as the regret of the algorithm increases. We match these upper bounds with information-theoretic lower bounds that apply to any demonstrator algorithm, thereby characterizing the optimal tradeoff between exploration and reward estimation. Extensive empirical evaluations on both synthetic data and simulated experimental design data from the natural sciences corroborate our theoretical results.
【2】 Bootstrapping the error of Oja's Algorithm
Authors: Robert Lunde, Purnamrita Sarkar, Rachel Ward
Affiliations: University of Michigan; University of Texas at Austin
Link: https://arxiv.org/abs/2106.14857
Abstract: We consider the problem of quantifying uncertainty for the estimation error of the leading eigenvector from Oja's algorithm for streaming principal component analysis, where the data are generated IID from some unknown distribution. By combining classical tools from the U-statistics literature with recent results on high-dimensional central limit theorems for quadratic forms of random vectors and concentration of matrix products, we establish a $\chi^2$ approximation result for the $\sin^2$ error between the population eigenvector and the output of Oja's algorithm. Since estimating the covariance matrix associated with the approximating distribution requires knowledge of unknown model parameters, we propose a multiplier bootstrap algorithm that may be updated in an online manner. We establish conditions under which the bootstrap distribution is close to the corresponding sampling distribution with high probability, thereby establishing the bootstrap as a consistent inferential method in an appropriate asymptotic regime.
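For orientation, Oja's update is a one-line stochastic recursion; below is a minimal Python sketch (our illustration, not the authors' code) of the estimator whose $\sin^2$ error the paper approximates by a $\chi^2$ law. The step-size schedule and seeding are illustrative assumptions, not the choices the analysis requires.

```python
import numpy as np

def oja(stream, dim, lr=lambda t: 1.0 / (t + 100)):
    """Streaming estimate of the leading eigenvector of E[x x^T].

    stream: iterable of data vectors x_t (IID from an unknown distribution).
    lr:     step-size schedule; the 1/(t+c) form is a common choice here,
            not necessarily the one the paper analyzes.
    """
    rng = np.random.default_rng(0)
    w = rng.standard_normal(dim)
    w /= np.linalg.norm(w)
    for t, x in enumerate(stream):
        w += lr(t) * x * (x @ w)   # Oja update: w <- w + eta_t x x^T w
        w /= np.linalg.norm(w)     # project back to the unit sphere
    return w

def sin2_error(w, v):
    """sin^2 of the angle between the estimate w and the population eigenvector v."""
    c = (w @ v) / (np.linalg.norm(w) * np.linalg.norm(v))
    return 1.0 - c**2
```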
【3】 Dynamic Planning and Learning under Recovering Rewards
Authors: David Simchi-Levi, Zeyu Zheng, Feng Zhu
Affiliations: Institute for Data, USA; Department of Industrial Engineering and Operations Research, University of California
Note: Accepted by ICML 2021
Link: https://arxiv.org/abs/2106.14813
Abstract: Motivated by emerging applications such as live-streaming e-commerce, promotions and recommendations, we introduce a general class of multi-armed bandit problems that have the following two features: (i) the decision maker can pull and collect rewards from at most $K$ out of $N$ different arms in each time period; (ii) the expected reward of an arm immediately drops after it is pulled, and then non-parametrically recovers as the idle time increases. With the objective of maximizing expected cumulative rewards over $T$ time periods, we propose, construct and prove performance guarantees for a class of "Purely Periodic Policies". For the offline problem when all model parameters are known, our proposed policy obtains an approximation ratio that is on the order of $1-\mathcal O(1/\sqrt{K})$, which is asymptotically optimal when $K$ grows to infinity. For the online problem when the model parameters are unknown and need to be learned, we design an Upper Confidence Bound (UCB) based policy that approximately has $\widetilde{\mathcal O}(N\sqrt{T})$ regret against the offline benchmark. Our framework and policy design may have the potential to be adapted into other offline planning and online learning applications with non-stationary and recovering rewards.
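For orientation, a plain top-$K$ UCB loop is sketched below; it ignores the recovery dynamics, whereas the paper's policies additionally schedule pulls periodically so arms have time to recover. The `pull` callback, the confidence-bonus form, and all names are illustrative assumptions.

```python
import numpy as np

def topk_ucb(pull, N, K, T):
    """Pull the K arms with the largest UCB indices in each of T rounds.

    pull(arms, t) -> array of observed rewards for the chosen arms.
    This is plain UCB1 extended to K pulls per round; it does not model
    the reward-recovery structure that the paper's policy exploits.
    """
    counts = np.zeros(N)
    means = np.zeros(N)
    for t in range(1, T + 1):
        bonus = np.sqrt(2 * np.log(t) / np.maximum(counts, 1))
        ucb = means + np.where(counts == 0, np.inf, bonus)
        arms = np.argpartition(-ucb, K - 1)[:K]   # indices of the K largest UCBs
        rewards = pull(arms, t)
        counts[arms] += 1
        means[arms] += (rewards - means[arms]) / counts[arms]
    return means, counts
```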
【4】 Adaptive greedy algorithm for moderately large dimensions in kernel conditional density estimation
Authors: Minh-Lien Jeanne Nguyen, Claire Lacour, Vincent Rivoirard
Affiliations: Mathematical Institute, University of Leiden, Niels Bohrweg, Leiden, Netherlands; LAMA, CNRS, Univ Gustave Eiffel, Univ Paris Est Creteil, Marne-la-Vallée, France; CEREMADE, CNRS, UMR, Université Paris-Dauphine, PSL University, Paris, France
Link: https://arxiv.org/abs/2106.14669
Abstract: This paper studies the estimation of the conditional density $f(x, \cdot)$ of $Y_i$ given $X_i = x$, from the observation of an i.i.d. sample $(X_i, Y_i) \in \mathbb{R}^d$, $i = 1, \ldots, n$. We assume that $f$ depends only on $r$ unknown components, with typically $r \ll d$. We provide an adaptive fully-nonparametric strategy based on kernel rules to estimate $f$. To select the bandwidth of our kernel rule, we propose a new fast iterative algorithm inspired by the Rodeo algorithm (Wasserman and Lafferty (2006)) to detect the sparsity structure of $f$. More precisely, in the minimax setting, our pointwise estimator, which is adaptive to both the regularity and the sparsity, achieves the quasi-optimal rate of convergence. Its computational complexity is only $O(dn \log n)$.
【5】 Data-driven Fair Resource Allocation For Novel Emerging Epidemics: A COVID-19 Convalescent Plasma Case Study
Authors: Maryam Akbari-Moghaddam, Na Li, Douglas G. Down, Donald M. Arnold, Jeannie Callum, Philippe Bégin, Nancy M. Heddle
Affiliations: Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada; Community Health Sciences, University of Calgary, Calgary, Alberta, Canada
Link: https://arxiv.org/abs/2106.14667
Abstract: Epidemics are a serious public health threat, and the resources for mitigating their effects are typically limited. Decision-makers face challenges in forecasting the demand for these resources as prior information about the disease is often not available, the behaviour of the disease can periodically change (either naturally or as a result of public health policies) and differs by geographical region. In this work, we discuss a model that is suitable for short-term real-time supply and demand forecasting during emerging outbreaks without having to rely on demographic information. We propose a data-driven mixed-integer programming (MIP) resource allocation model that assigns available resources to maximize a notion of fairness among the resource-demanding entities. Numerical results from applying our MIP model to a COVID-19 Convalescent Plasma (CCP) case study suggest that our approach can help balance the supply and demand of limited products such as CCP and minimize the unmet demand ratios of the demand entities.
【6】 Pre-treatment of outliers and anomalies in plant data: Methodology and case study of a Vacuum Distillation Unit
Authors: Kamil Oster, Stefan Güttel, Jonathan L. Shapiro, Lu Chen, Megan Jobson
Affiliations: Department of Mathematics, The University of Manchester, Alan Turing Building, Oxford Road, Manchester, UK; Process Integration Limited, Station House, Stamford New Road, Altrincham, UK
Note: 33 pages, 20 figures, submitted to the Journal of Process Control (ref: JPROCONT-D-21-00332)
Link: https://arxiv.org/abs/2106.14641
Abstract: Data pre-treatment plays a significant role in improving data quality, thus allowing extraction of accurate information from raw data. One of the data pre-treatment techniques commonly used is outliers detection. The so-called 3$\sigma$ method is a common practice to identify the outliers. As shown in the manuscript, it does not identify all outliers, resulting in possible distortion of the overall statistics of the data. This problem can have a significant impact on further data analysis and can lead to reduction in the accuracy of predictive models. There is a plethora of various techniques for outliers detection; however, aside from theoretical work, they all require case study work. Two types of outliers were considered: short-term (erroneous data, noise) and long-term outliers (e.g. malfunctioning for longer periods). The data used were taken from the vacuum distillation unit (VDU) of an Asian refinery and included 40 physical sensors (temperature, pressure and flow rate). We used a modified method for 3$\sigma$ thresholds to identify the short-term outliers, i.e. sensors data are divided into chunks determined by change points and 3$\sigma$ thresholds are calculated within each chunk representing a near-normal distribution. We have shown that the piecewise 3$\sigma$ method offers a better approach to short-term outliers detection than the 3$\sigma$ method applied to the entire time series. Nevertheless, this does not perform well for long-term outliers (which can represent another state in the data). In this case, we used principal component analysis (PCA) with Hotelling's $T^2$ statistics to identify the long-term outliers. The results obtained with PCA were subjected to the DBSCAN clustering method. The outliers (which were visually obvious and correctly detected by the PCA method) were also correctly identified by DBSCAN, which supported the consistency and accuracy of the PCA method.
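A minimal sketch of the piecewise 3$\sigma$ rule described above, assuming the Python ruptures library for change-point detection (the paper does not name an implementation); the long-term step with PCA, Hotelling's $T^2$ and DBSCAN is omitted.

```python
import numpy as np
import ruptures as rpt  # change-point detection; one possible library choice

def piecewise_3sigma(x, n_bkps=5):
    """Flag short-term outliers chunk by chunk.

    Change points split the series into near-stationary chunks; within each
    chunk a separate 3-sigma rule is applied. The detector and its settings
    here are illustrative, not those of the paper.
    """
    bkps = rpt.Binseg(model="l2").fit(x.reshape(-1, 1)).predict(n_bkps=n_bkps)
    mask = np.zeros(len(x), dtype=bool)
    start = 0
    for end in bkps:
        chunk = x[start:end]
        mu, sigma = chunk.mean(), chunk.std()
        mask[start:end] = np.abs(chunk - mu) > 3 * sigma
        start = end
    return mask
```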
【7】 Rao distances and Conformal Mapping
Authors: Arni S. R. Srinivasa Rao, Steven G. Krantz
Affiliations: Department of Mathematics, Washington University in St. Louis, Missouri, USA
Link: https://arxiv.org/abs/2106.14635
Abstract: In this article, we describe the Rao distance (due to C.R. Rao) and ideas of conformal mappings on 3D objects with angle preservation. Three propositions help us to construct distances between points within 3D objects in $\mathbb{R}^{3}$ and line integrals within complex planes. We highlight the application of these concepts to virtual tourism.
【8】 Algebraic Topology for Data Analysis
Authors: Daniel Trejo Medina, Karla Sarai Jimenez
Note: 22 pages, in Spanish
Link: https://arxiv.org/abs/2106.14634
Abstract: This research addresses a new tool for data analysis known as Topological Data Analysis (TDA). It is grounded in an area of mathematics known as combinatorial algebra or, more recently, algebraic topology, which, by making strong use of computation, statistics, probability, and topology, among other concepts, extracts mathematical characteristics from a set of data that allow us to associate, create, and infer general and quality information about it.
【9】 Whittle estimation with (quasi-)analytic wavelets
Authors: Sophie Achard, Irène Gannaz
Affiliations: Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LJK, France; Univ Lyon, INSA Lyon, UJM, UCBL, ECL, ICJ, UMR, Villeurbanne, France
Link: https://arxiv.org/abs/2106.14633
Abstract: The notion of long-memory is considered in the case of multivariate time series, not necessarily Gaussian nor stationary. The long-memory characteristics are defined by the long-memory parameters describing the autocorrelation structure of each process and the long-run covariance measuring the coupling between time series. A phase term is present in the model to widen the classes of models. We introduce a representation of the time series by quasi-analytic wavelets for inference in this setting. We first show that the covariance of the wavelet coefficients provides an adequate estimator of the covariance structure of the processes, including the phase term. Consistent estimators based on a Whittle approximation are then proposed. Simulations highlight a satisfactory behavior of the estimation on finite samples for some linear time series and multivariate fractional Brownian motions. An application to a real dataset in neuroscience is presented, where long-memory and brain connectivity are inferred.
【10】 Improved Prediction and Network Estimation Using the Monotone Single Index Multi-variate Autoregressive Model
Authors: Yue Gao, Garvesh Raskutti
Affiliations: Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
Link: https://arxiv.org/abs/2106.14630
Abstract: Network estimation from multi-variate point process or time series data is a problem of fundamental importance. Prior work has focused on parametric approaches that require a known parametric model, which makes estimation procedures less robust to model mis-specification, non-linearities and heterogeneities. In this paper, we develop a semi-parametric approach based on the monotone single-index multi-variate autoregressive model (SIMAM) which addresses these challenges. We provide theoretical guarantees for dependent data and an alternating projected gradient descent algorithm. Significantly, we do not explicitly assume mixing conditions on the process (although we do require conditions analogous to restricted strong convexity) and we achieve rates of the form $O(T^{-\frac{1}{3}} \sqrt{s\log(TM)})$ (optimal in the independent design case) where $s$ is the threshold for the maximum in-degree of the network that indicates the sparsity level, $M$ is the number of actors and $T$ is the number of time points. In addition, we demonstrate the superior performance both on simulated data and two real data examples where our SIMAM approach out-performs state-of-the-art parametric methods both in terms of prediction and network estimation.
【11】 BNPqte: A Bayesian Nonparametric Approach to Causal Inference on Quantiles in R
Authors: Chuji Luo, Michael J. Daniels
Affiliations: University of Florida
Note: 44 pages, 13 figures
Link: https://arxiv.org/abs/2106.14599
Abstract: In this article, we introduce the BNPqte R package which implements the Bayesian nonparametric approach of Xu, Daniels and Winterstein (2018) for estimating quantile treatment effects in observational studies. This approach provides flexible modeling of the distributions of potential outcomes, so it is capable of capturing a variety of underlying relationships among the outcomes, treatments and confounders and estimating multiple quantile treatment effects simultaneously. Specifically, this approach uses a Bayesian additive regression trees (BART) model to estimate the propensity score and a Dirichlet process mixture (DPM) of multivariate normals model to estimate the conditional distribution of the potential outcome given the estimated propensity score. The BNPqte R package provides a fast implementation for this approach by designing efficient R functions for the DPM of multivariate normals model in joint and conditional density estimation. These R functions largely improve the efficiency of the DPM model in density estimation, compared to the popular DPpackage. BART-related R functions in the BNPqte R package are inherited from the BART R package with two modifications on variable importance and split probability. To maximize computational efficiency, the actual sampling and computation for each model are carried out in C++ code. The Armadillo C++ library is also used for fast linear algebra calculations.
【12】 Variance Reduction for Matrix Computations with Applications to Gaussian Processes
Authors: Anant Mathur, Sarat Moka, Zdravko Botev
Affiliations: University of New South Wales, High Street, Kensington, Sydney, NSW; Macquarie University, Balaclava Rd, Macquarie Park, NSW, Australia
Note: 20 pages, 3 figures
Link: https://arxiv.org/abs/2106.14565
Abstract: In addition to recent developments in computing speed and memory, methodological advances have contributed to significant gains in the performance of stochastic simulation. In this paper, we focus on variance reduction for matrix computations via matrix factorization. We provide insights into existing variance reduction methods for estimating the entries of large matrices. Popular methods do not exploit the reduction in variance that is possible when the matrix is factorized. We show how computing the square root factorization of the matrix can achieve in some important cases arbitrarily better stochastic performance. In addition, we propose a factorized estimator for the trace of a product of matrices and numerically demonstrate that the estimator can be up to 1,000 times more efficient on certain problems of estimating the log-likelihood of a Gaussian process. Additionally, we provide a new estimator of the log-determinant of a positive semi-definite matrix where the log-determinant is treated as a normalizing constant of a probability density.
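The flavor of the factorization idea can be conveyed with Hutchinson-type probes: for PSD $A = LL^\top$, $\mathrm{tr}(AB) = \mathrm{tr}(L^\top B L)$, and probing the symmetric matrix $L^\top B L$ can have lower variance than probing $AB$ directly. A sketch under these assumptions (our illustration, not the paper's estimator):

```python
import numpy as np

def hutchinson(matvec, dim, n_probes, rng):
    """Monte Carlo estimate of tr(M) via E[z^T M z] with Rademacher probes z."""
    total = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=dim)
        total += z @ matvec(z)
    return total / n_probes

rng = np.random.default_rng(0)
d = 200
G = rng.standard_normal((d, d)); A = G @ G.T   # PSD matrix
H = rng.standard_normal((d, d)); B = H @ H.T
L = np.linalg.cholesky(A)                      # square-root factor: A = L L^T

# Both estimators target tr(AB) = tr(L^T B L), with different variances.
plain = hutchinson(lambda z: A @ (B @ z), d, 500, rng)
factored = hutchinson(lambda z: L.T @ (B @ (L @ z)), d, 500, rng)
print(plain, factored, np.trace(A @ B))
```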
【13】 What to do if N is two?
Authors: Pascal Fries, Eric Maris
Affiliations: Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Deutschordenstraße, Frankfurt, Germany; Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Montessorilaan, Nijmegen, Netherlands
Note: 9 pages, 1 figure
Link: https://arxiv.org/abs/2106.14562
Abstract: The field of in-vivo neurophysiology currently uses statistical standards that are based on tradition rather than formal analysis. Typically, data from two (or few) animals are pooled for one statistical test, or a significant test in a first animal is replicated in one (or few) further animals. The use of more than one animal is widely believed to allow an inference on the population. Here, we explain that a useful inference on the population would require larger numbers and a different statistical approach. The field should consider performing studies at that standard, potentially through coordinated multi-center efforts, for selected questions of exceptional importance. Yet, for many questions, this is ethically and/or economically not justifiable. We explain why in those studies with two (or few) animals, any useful inference is limited to the sample of investigated animals, irrespective of whether it is based on few animals, two animals or a single animal.
【14】 Change-Point Detection in Dynamic Networks with Missing Links
Authors: Farida Enikeeva, Olga Klopp
Affiliations: Laboratoire de Mathématiques et Applications, UMR CNRS, Université de Poitiers, France; ESSEC Business School and CREST, ENSAE, France
Link: https://arxiv.org/abs/2106.14470
Abstract: Structural changes occur in dynamic networks quite frequently and their detection is an important question in many situations such as fraud detection or cybersecurity. Real-life networks are often incompletely observed due to individual non-response or network size. In the present paper we consider the problem of change-point detection at a temporal sequence of partially observed networks. The goal is to test whether there is a change in the network parameters. Our approach is based on the Matrix CUSUM test statistic and allows growing size of networks. We show that the proposed test is minimax optimal and robust to missing links. We also demonstrate the good behavior of our approach in practice through simulation study and a real-data application.
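As a rough illustration of a matrix CUSUM scan, the sketch below compares entrywise means of the adjacency matrices before and after each candidate change point, handling missing links via NaNs; the paper's statistic and its calibration are more refined than this.

```python
import numpy as np

def matrix_cusum(A, t):
    """CUSUM-type statistic at candidate change point t.

    A: array of shape (T, n, n) of adjacency matrices, with missing links
    encoded as np.nan. This illustrative form compares entrywise means
    before and after t, scaled by the usual CUSUM weight.
    """
    T = A.shape[0]
    diff = np.nanmean(A[:t], axis=0) - np.nanmean(A[t:], axis=0)
    return np.sqrt(t * (T - t) / T) * np.linalg.norm(np.nan_to_num(diff), "fro")

# scan candidate change points and take the maximum:
# stat = max(matrix_cusum(A, t) for t in range(5, A.shape[0] - 5))
```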
【15】 Malaria Risk Mapping Using Routine Health System Incidence Data in Zambia
Authors: Benjamin M. Taylor, Ricardo Andrade-Pacheco, Hugh Sturrock, Busiku Hamainza, Kafula Silumbe, John Miller, Thomas P. Eisele, Francois Rerolle, Hannah Slater, Adam Bennett
Link: https://arxiv.org/abs/2106.14436
Abstract: Improvements to Zambia's malaria surveillance system allow better monitoring of incidence and targeting of responses at refined spatial scales. As transmission decreases, understanding heterogeneity in risk at fine spatial scales becomes increasingly important. However, there are challenges in using health system data for high-resolution risk mapping: health facilities have undefined and overlapping catchment areas, and report on an inconsistent basis. We propose a novel inferential framework for risk mapping of malaria incidence data based on formal down-scaling of confirmed case data reported through the health system in Zambia. We combine data from large community intervention trials in 2011-2016 and model health facility catchments based upon treatment-seeking behaviours; our model for monthly incidence is an aggregated log-Gaussian Cox process, which allows us to predict incidence at fine scale. We predicted monthly malaria incidence at 5 km$^2$ resolution nationally: whereas 4.8 million malaria cases were reported through the health system in 2016, we estimated that the number of cases occurring at the community level was closer to 10 million. As Zambia continues to scale up community-based reporting of malaria incidence, these outputs provide realistic estimates of community-level malaria burden as well as high resolution risk maps for targeting interventions at the sub-catchment level.
【16】 Exact simulation of extrinsic stress-release processes
Authors: Young Lee, Patrick J. Laub, Thomas Taimre, Hongbiao Zhao, Jiancang Zhuang
Link: https://arxiv.org/abs/2106.14415
Abstract: We present a new and straightforward algorithm that simulates exact sample paths for a generalized stress-release process. The computation of the exact law of the joint interarrival times is detailed and used to derive this algorithm. Furthermore, the martingale generator of the process is derived and induces theoretical moments which generalize some results of Borovkov & Vere-Jones (2000) and are used to demonstrate the validity of our simulation algorithm.
【17】 Universal inference with composite likelihoods
Authors: Hien D Nguyen, Jessica Bagnall-Guerreiro, Andrew T Jones
Link: https://arxiv.org/abs/2106.14399
Abstract: Maximum composite likelihood estimation is a useful alternative to maximum likelihood estimation when data arise from data generating processes (DGPs) that do not admit tractable joint specification. We demonstrate that generic composite likelihoods consisting of marginal and conditional specifications permit the simple construction of composite likelihood ratio-like statistics from which finite-sample valid confidence sets and hypothesis tests can be constructed. These statistics are universal in the sense that they can be constructed from any estimator for the parameter of the underlying DGP. We demonstrate our methodology via a simulation study using a pair of conditionally specified bivariate models.
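The universal-inference recipe behind such statistics: fit the parameter on one data split, plug it into the (composite) likelihood numerator evaluated on the other split, and reject when the ratio exceeds $1/\alpha$. A minimal sketch for a full-likelihood toy case, testing a normal mean, with the understanding that a composite likelihood can be substituted for the full one:

```python
import numpy as np
from scipy import stats

def universal_test(x, alpha=0.1, seed=0):
    """Split likelihood-ratio test of H0: mean = 0 (normal model, unknown sd).

    The numerator plugs in an estimate fit on one half of the data; the
    denominator maximizes the H0 likelihood on the other half. Rejecting
    when the ratio exceeds 1/alpha is finite-sample valid.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    d1, d0 = x[idx[: len(x) // 2]], x[idx[len(x) // 2 :]]
    mu1, sd1 = d1.mean(), d1.std(ddof=0)        # estimator from split 1
    sd0 = np.sqrt(np.mean(d0**2))               # H0 MLE of the sd (mean fixed at 0)
    log_ratio = (stats.norm.logpdf(d0, mu1, sd1).sum()
                 - stats.norm.logpdf(d0, 0.0, sd0).sum())
    return log_ratio > np.log(1.0 / alpha)      # True = reject H0
```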
【18】 Flexible Variational Bayes based on a Copula of a Mixture of Normals
Authors: David Gunawan, Robert Kohn, David Nott
Affiliations: School of Mathematics and Applied Statistics, University of Wollongong; School of Economics, UNSW Business School, University of New South Wales; Australian Center of Excellence for Mathematical and Statistical Frontiers
Note: 39 pages
Link: https://arxiv.org/abs/2106.14392
Abstract: Variational Bayes methods approximate the posterior density by a family of tractable distributions and use optimisation to estimate the unknown parameters of the approximation. Variational approximation is useful when exact inference is intractable or very costly. Our article develops a flexible variational approximation based on a copula of a mixture of normals, which is implemented using the natural gradient and a variance reduction method. The efficacy of the approach is illustrated by using simulated and real datasets to approximate multimodal, skewed and heavy-tailed posterior distributions, including an application to Bayesian deep feedforward neural network regression models. Each example shows that the proposed variational approximation is much more accurate than the corresponding Gaussian copula and a mixture of normals variational approximations.
【19】 Towards Model-informed Precision Dosing with Expert-in-the-loop Machine Learning
Authors: Yihuang Kang, Yi-Wen Chiu, Ming-Yen Lin, Fang-yi Su, Sheng-Tai Huang
Affiliations: Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan; Division of Nephrology, Department of Internal Medicine, Kaohsiung Medical University Hospital
Link: https://arxiv.org/abs/2106.14384
Abstract: Machine Learning (ML) and its applications have been transforming our lives but it is also creating issues related to the development of fair, accountable, transparent, and ethical Artificial Intelligence. As the ML models are not fully comprehensible yet, it is obvious that we still need humans to be part of algorithmic decision-making processes. In this paper, we consider a ML framework that may accelerate model learning and improve its interpretability by incorporating human experts into the model learning loop. We propose a novel human-in-the-loop ML framework aimed at dealing with learning problems where the cost of data annotation is high and there is a lack of appropriate data to model the association between the target tasks and the input features. With an application to precision dosing, our experimental results show that the approach can learn interpretable rules from data and may potentially lower experts' workload by replacing data annotation with rule representation editing. The approach may also help remove algorithmic bias by introducing experts' feedback into the iterative model learning process.
【20】 Parametric Analysis of Gumbel Type-II Distribution under Step-stress Life Test
Authors: Subhankar Dutta, Farha Sultana, Suchandan Kayal
Affiliations: Department of Mathematics, National Institute of Technology, Rourkela, India; Department of Mathematics and Statistics, Indian Institute of Technology, Kanpur
Link: https://arxiv.org/abs/2106.14377
Abstract: In this paper, we focus on the parametric inference based on the Tampered Random Variable (TRV) model for simple step-stress life testing (SSLT) using Type-II censored data. The baseline lifetime of the experimental units under normal stress conditions follows Gumbel Type-II distribution with $\alpha$ and $\lambda$ being the shape and scale parameters, respectively. Maximum likelihood estimator (MLE) and Bayes estimator of the model parameters are derived based on Type-II censored samples. We obtain asymptotic intervals of the unknown parameters using the observed Fisher information matrix. Bayes estimators are obtained using Markov Chain Monte Carlo (MCMC) method under squared error loss function and LINEX loss function. We also construct highest posterior density (HPD) intervals of the unknown model parameters. Extensive simulation studies are performed to investigate the finite sample properties of the proposed estimators. Finally, the methods are illustrated with the analysis of a real data set.
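For concreteness, under one common parameterization of the Gumbel Type-II distribution (the paper's tampered-random-variable structure is omitted here, and its normalization may differ), the model and the Type-II censored log-likelihood read:

```latex
% One common parameterization, with shape \alpha and scale \lambda:
F(x) = \exp\!\left(-\lambda x^{-\alpha}\right), \qquad
f(x) = \alpha \lambda\, x^{-(\alpha+1)} \exp\!\left(-\lambda x^{-\alpha}\right), \quad x > 0.
% Type-II censored log-likelihood with the first r of n ordered failures observed:
\ell(\alpha, \lambda) = \sum_{i=1}^{r} \log f\!\left(x_{(i)}\right)
  + (n - r)\, \log\!\left[1 - F\!\left(x_{(r)}\right)\right].
```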
【21】 Estimation of the marginal effect of antidepressants on body mass index under confounding and endogenous covariate-driven monitoring times
Authors: Janie Coulombe, Erica E. M. Moodie, Robert W. Platt, Christel Renoux
Link: https://arxiv.org/abs/2106.14364
Abstract: In studying the marginal effect of antidepressants on body mass index using electronic health records data, we face several challenges. Patients' characteristics can affect the exposure (confounding) as well as the timing of routine visits (measurement process), and those characteristics may be altered following a visit, which can create dependencies between the monitoring and body mass index when viewed as stochastic or random processes in time. This may result in a form of selection bias that distorts the estimation of the marginal effect of the antidepressant. Inverse intensity of visit weights have been proposed to adjust for these imbalances; however, no approaches have addressed complex settings where the covariate and the monitoring processes affect each other in time so as to induce endogeneity, a situation likely to occur in electronic health records. We review how selection bias due to outcome-dependent follow-up times may arise and propose a new cumulated weight that models a complete monitoring path so as to address the above-mentioned challenges and produce a reliable estimate of the impact of antidepressants on body mass index. More specifically, we do so using data from the Clinical Practice Research Datalink in the United Kingdom, comparing the marginal effect of two commonly used antidepressants, citalopram and fluoxetine, on body mass index. The results are compared to those obtained with simpler methods that do not account for the extent of the dependence due to an endogenous covariate process.
【22】 A deep look into the Dagum family of isotropic covariance functions
Authors: Tarik Faouzi, Emilio Porcu, Igor Kondrashuk, Anatoliy Malyarenko
Affiliations: University of Bio Bio; Khalifa University at Abu Dhabi & School of Computer Science and Statistics
Note: 15 pages
Link: https://arxiv.org/abs/2106.14353
Abstract: The Dagum family of isotropic covariance functions has two parameters that allow for decoupling of the fractal dimension and Hurst effect for Gaussian random fields that are stationary and isotropic over Euclidean spaces. Sufficient conditions that allow for positive definiteness in $\mathbb{R}^d$ of the Dagum family have been proposed on the basis of the fact that the Dagum family allows for complete monotonicity under some parameter restrictions. The spectral properties of the Dagum family have been inspected to a very limited extent only, and this paper gives insight into this direction. Specifically, we study finite and asymptotic properties of the isotropic spectral density (intended as the Hankel transform) of the Dagum model. Also, we establish some closed-form expressions for the Dagum spectral density in terms of the Fox-Wright functions. Finally, we provide asymptotic properties for such a class of spectral densities.
【23】 Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning
Authors: Koulik Khamaru, Eric Xia, Martin J. Wainwright, Michael I. Jordan
Affiliations: Department of Statistics and Department of Electrical Engineering and Computer Sciences, UC Berkeley, Berkeley, CA
Link: https://arxiv.org/abs/2106.14352
Abstract: Various algorithms in reinforcement learning exhibit dramatic variability in their convergence rates and ultimate accuracy as a function of the problem structure. Such instance-specific behavior is not captured by existing global minimax bounds, which are worst-case in nature. We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions and identify an instance-dependent functional that controls the difficulty of estimation in the $\ell_\infty$-norm. Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure. In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning. Our theory provides a precise way of distinguishing "easy" problems from "hard" ones in the context of $Q$-learning, as illustrated by an ensemble with a continuum of difficulty.
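As a baseline for the variance-reduced variant the paper analyzes, plain synchronous $Q$-learning can be sketched as follows; the toy generative-model setup, step size and loop structure are our illustrative assumptions.

```python
import numpy as np

def q_learning(P, R, gamma, n_iters, lr=0.5, seed=0):
    """Synchronous Q-learning with one sampled transition per (s, a) per round.

    P: transition tensor of shape (S, A, S); R: mean rewards of shape (S, A).
    The paper's variance-reduced variant recenters each noisy target around
    a periodically refreshed reference Q-function; this plain version omits that.
    """
    rng = np.random.default_rng(seed)
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    for _ in range(n_iters):
        for s in range(S):
            for a in range(A):
                s_next = rng.choice(S, p=P[s, a])           # sample next state
                target = R[s, a] + gamma * Q[s_next].max()  # empirical Bellman target
                Q[s, a] += lr * (target - Q[s, a])
    return Q

# toy usage on a random MDP
rng = np.random.default_rng(1)
S, A = 5, 2
P = rng.random((S, A, S)); P /= P.sum(axis=2, keepdims=True)
R = rng.random((S, A))
print(q_learning(P, R, gamma=0.9, n_iters=2000))
```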
【24】 More on verification of probability forecasts for football outcomes: score decompositions, reliability, and discrimination analyses
Authors: Jean-Louis Foulley
Note: 10 pages, 2 figures
Link: https://arxiv.org/abs/2106.14345
Abstract: Forecast of football outcomes in terms of Home Win, Draw and Away Win relies largely on ex ante probability elicitation of these events and ex post verification of them via computation of probability scoring rules (Brier, Ranked Probability, Logarithmic, Zero-One scores). Usually, appraisal of the quality of forecasting procedures is restricted to reporting mean score values. The purpose of this article is to propose additional tools of verification, such as score decompositions into several components of special interest. Graphical and numerical diagnoses of reliability and discrimination and kindred statistical methods are presented using different techniques of binning (fixed thresholds, quantiles, logistic and iso regression). These procedures are illustrated on probability forecasts for the outcomes of the UEFA Champions League (C1) at the end of the group stage based on typical Poisson regression models, with reasonably good results in terms of reliability as compared to those obtained from bookmaker odds, whatever the technique used. Links with research in machine learning and different areas of application (meteorology, medicine) are discussed.
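One such decomposition, Murphy's reliability-resolution-uncertainty split of the Brier score under fixed-threshold binning, can be sketched for a binary outcome as follows (the football setting has three outcomes, and the paper also uses quantile, logistic and iso-regression binning):

```python
import numpy as np

def brier_decomposition(p, y, n_bins=10):
    """Murphy decomposition of the Brier score: brier ~= rel - res + unc.

    p: forecast probabilities in [0, 1]; y: binary outcomes in {0, 1}.
    The identity is exact when forecasts are constant within bins.
    """
    bins = np.minimum((p * n_bins).astype(int), n_bins - 1)
    base = y.mean()
    rel = res = 0.0
    for b in range(n_bins):
        m = bins == b
        if m.any():
            w = m.mean()
            rel += w * (p[m].mean() - y[m].mean()) ** 2   # reliability (miscalibration)
            res += w * (y[m].mean() - base) ** 2          # resolution (discrimination)
    unc = base * (1 - base)                               # uncertainty (climatology)
    return np.mean((p - y) ** 2), rel, res, unc
```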
【25】 Use of Variational Inference in Music Emotion Recognition
Authors: Nathalie Deziderio, Hugo Tremonte de Carvalho
Affiliations: Rio de Janeiro, Brasil
Link: https://arxiv.org/abs/2106.14323
Abstract: This work was developed aiming to employ statistical techniques in the field of Music Emotion Recognition, a well-recognized area within the Signal Processing world, but hardly explored from the statistical point of view. Here, we opened several possibilities within the field, applying modern Bayesian Statistics techniques and developing efficient algorithms, focusing on the applicability of the results obtained. Although the motivation for this project was the development of an emotion-based music recommendation system, its main contribution is a highly adaptable multivariate model that can be useful for interpreting any database where there is an interest in applying regularization in an efficient manner. Broadly speaking, we will explore what role a sound theoretical statistical analysis can play in the modeling of an algorithm that is able to understand a well-known database and what can be gained with this kind of approach.
【26】 Sparse Logistic Tensor Decomposition for Binary Data
Authors: Jianhao Zhang, Yoonkyung Lee
Affiliations: Department of Statistics, The Ohio State University
Link: https://arxiv.org/abs/2106.14258
Abstract: Tensor data are increasingly available in many application domains. We develop several tensor decomposition methods for binary tensor data. Different from classical tensor decompositions for continuous-valued data with squared error loss, we formulate logistic tensor decompositions for binary data with a Bernoulli likelihood. To enhance the interpretability of estimated factors and improve their stability further, we propose sparse formulations of logistic tensor decomposition by considering $\ell_{1}$-norm and $\ell_{0}$-norm regularized likelihood. To handle the resulting optimization problems, we develop computational algorithms which combine the strengths of tensor power method and majorization-minimization (MM) algorithm. Through simulation studies, we demonstrate the utility of our methods in analysis of binary tensor data. To illustrate the effectiveness of the proposed methods, we analyze a dataset concerning nations and their political relations and perform co-clustering of estimated factors to find associations between the nations and political relations.
【27】 On Graphical Models and Convex Geometry
Authors: Haim Bar, Martin T. Wells
Affiliations: Department of Statistics, University of Connecticut; Department of Statistics and Data Science, Cornell University
Link: https://arxiv.org/abs/2106.14255
Abstract: We introduce a mixture-model of beta distributions to identify significant correlations among $P$ predictors when $P$ is large. The method relies on theorems in convex geometry, which we use to show how to control the error rate of edge detection in graphical models. Our 'betaMix' method does not require any assumptions about the network structure, nor does it assume that the network is sparse. The results in this article hold for a wide class of data generating distributions that include light-tailed and heavy-tailed spherically symmetric distributions.
【28】 A Generalizability Score for Aggregate Causal Effect
Authors: Rui Chen, Guanhua Chen, Menggang Yu
Affiliations: Department of Statistics, University of Wisconsin, Madison, WI, U.S.A.; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI
Note: 31 pages, 4 figures
Link: https://arxiv.org/abs/2106.14243
Abstract: Scientists frequently generalize population level causal quantities such as average treatment effect from a source population to a target population. When the causal effects are heterogeneous, differences in subject characteristics between the source and target populations may make such a generalization difficult and unreliable. Reweighting or regression can be used to adjust for such differences when generalizing. However, these methods typically suffer from large variance if there is limited covariate distribution overlap between the two populations. We propose a generalizability score to address this issue. The score can be used as a yardstick to select target subpopulations for generalization. A simplified version of the score avoids using any outcome information and thus can prevent deliberate biases associated with inadvertent access to such information. Both simulation studies and real data analysis demonstrate convincing results for such selection.
【29】 New copulas and their applications to symmetrizations of bivariate copulas
Authors: Mohamed El Maazouz, Ahmed Sani
Affiliations: Université Ibn Zohr, Faculté des Sciences, Département de Mathématiques
Note: Some figures illustrating the asymmetry of the constructed copulas will appear in the published version
Link: https://arxiv.org/abs/2106.14240
Abstract: New copulas, based on perturbation theory, are introduced to clarify a symmetrization procedure for asymmetric copulas. We also give some properties of the symmetrized copula. Finally, we examine families of copulas with a prescribed symmetrized one. Along the way, we study, topologically, the set of all symmetric copulas and give some of its classical and new properties.
【30】 Interpretable Network Representation Learning with Principal Component Analysis
Authors: James D. Wilson, Jihui Lee
Affiliations: Department of Psychiatry, University of Pittsburgh Medical Center, Pittsburgh, PA, USA; Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
Note: 33 pages. Submitted and currently under review
Link: https://arxiv.org/abs/2106.14238
Abstract: We consider the problem of interpretable network representation learning for samples of network-valued data. We propose the Principal Component Analysis for Networks (PCAN) algorithm to identify statistically meaningful low-dimensional representations of a network sample via subgraph count statistics. The PCAN procedure provides an interpretable framework for which one can readily visualize, explore, and formulate predictive models for network samples. We furthermore introduce a fast sampling-based algorithm, sPCAN, which is significantly more computationally efficient than its counterpart, but still enjoys advantages of interpretability. We investigate the relationship between these two methods and analyze their large-sample properties under the common regime where the sample of networks is a collection of kernel-based random graphs. We show that under this regime, the embeddings of the sPCAN method enjoy a central limit theorem and moreover that the population level embeddings of PCAN and sPCAN are equivalent. We assess PCAN's ability to visualize, cluster, and classify observations in network samples arising in nature, including functional connectivity network samples and dynamic networks describing the political co-voting habits of the U.S. Senate. Our analyses reveal that our proposed algorithm provides informative and discriminatory features describing the networks in each sample. The PCAN and sPCAN methods build on the current literature of network representation learning and set the stage for a new line of research in interpretable learning on network-valued data. Publicly available software for the PCAN and sPCAN methods is available at https://www.github.com/jihuilee/.
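The core pipeline, subgraph-count statistics per network followed by PCA across the sample, is easy to sketch; the feature set below is an illustrative choice rather than the paper's, whose own implementation is at the GitHub link above.

```python
import numpy as np
import networkx as nx
from sklearn.decomposition import PCA

def subgraph_features(G):
    """A few subgraph-count statistics for one network (an illustrative set)."""
    tri = sum(nx.triangles(G).values()) / 3          # each triangle counted 3x
    mean_deg = np.mean([d for _, d in G.degree()])
    return np.array([G.number_of_edges(), tri, nx.transitivity(G), mean_deg])

def pcan_embedding(graphs, k=2):
    """Embed a sample of networks by PCA on standardized subgraph counts."""
    X = np.array([subgraph_features(G) for G in graphs])
    X = (X - X.mean(0)) / X.std(0)
    return PCA(n_components=k).fit_transform(X)
```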
【31】 An Approach to Causal Inference over Stochastic Networks
Authors: Duncan A. Clark, Mark S. Handcock
Affiliations: University of California - Los Angeles, Los Angeles, USA
Link: https://arxiv.org/abs/2106.14145
Abstract: Claiming causal inferences in network settings necessitates careful consideration of the often complex dependency between outcomes for actors. Of particular importance are treatment spillover or outcome interference effects. We consider causal inference when the actors are connected via an underlying network structure. Our key contribution is a model for causality when the underlying network is unobservable and the actor covariates evolve stochastically over time. We develop a joint model for the relational and covariate generating process that avoids restrictive separability assumptions and deterministic network assumptions that do not hold in the majority of social network settings of interest. Our framework utilizes the highly general class of Exponential-family Random Network models (ERNM) of which Markov Random Fields (MRF) and Exponential-family Random Graph models (ERGM) are special cases. We present potential outcome based inference within a Bayesian framework, and propose a simple modification to the exchange algorithm to allow for sampling from ERNM posteriors. We present results of a simulation study demonstrating the validity of the approach. Finally, we demonstrate the value of the framework in a case-study of smoking over time in the context of adolescent friendship networks.
【32】 Score-Based Change Detection for Gradient-Based Learning Machines
Authors: Lang Liu, Joseph Salmon, Zaid Harchaoui
Affiliations: Department of Statistics, University of Washington, Seattle; IMAG, University of Montpellier, CNRS, Montpellier
Link: https://arxiv.org/abs/2106.14122
Abstract: The widespread use of machine learning algorithms calls for automatic change detection algorithms to monitor their behavior over time. As a machine learning algorithm learns from a continuous, possibly evolving, stream of data, it is desirable and often critical to supplement it with a companion change detection algorithm to facilitate its monitoring and control. We present a generic score-based change detection method that can detect a change in any number of components of a machine learning model trained via empirical risk minimization. This proposed statistical hypothesis test can be readily implemented for such models designed within a differentiable programming framework. We establish the consistency of the hypothesis test and show how to calibrate it to achieve a prescribed false alarm rate. We illustrate the versatility of the approach on synthetic and real data.
【33】 Parmsurv: a SAS Macro for Flexible Parametric Survival Analysis with Long-Term Predictions
Authors: Han Fu, Shahrul Mt-Isa, Richard Baumgartner, William Malbecq
Affiliations: College of Public Health, The Ohio State University, Columbus, OH, USA; Biostatistics and Research Decision Sciences (BARDS) Health Technology Assessment (HTA) Statistics, MSD Research Laboratories (MRL), MSD Zurich, Switzerland
Note: 15 pages, 1 figure, 10 tables, accepted by The Clinical Data Science Conference - PHUSE US Connect 2021
Link: https://arxiv.org/abs/2106.14109
Abstract: Health economic evaluations often require predictions of survival rates beyond the follow-up period. Parametric survival models can be more convenient for economic modelling than the Cox model. The generalized gamma (GG) and generalized F (GF) distributions are extensive families that contain almost all commonly used distributions with various hazard shapes and arbitrary complexity. In this study, we present a new SAS macro for implementing a wide variety of flexible parametric models including the GG and GF distributions and their special cases, as well as the Gompertz distribution. Proper custom distributions are also supported. Different from existing SAS procedures, this macro not only supports regression on the location parameter but also on ancillary parameters, which greatly increases model flexibility. In addition, the SAS macro supports weighted regression, stratified regression and robust inference. This study demonstrates with several examples how the SAS macro can be used for flexible survival modeling and extrapolation.
【34】 Using relative weight analysis with residualization to detect relevant nonlinear interaction effects in ordinary and logistic regressions
Authors: Maikol Solís, Carlos Pasquier
Affiliations: Universidad de Costa Rica
Link: https://arxiv.org/abs/2106.14095
Abstract: The relative weight analysis is a classic tool to detect if one variable or interaction in a model is relevant or not. In this paper, we will focus on the construction of relative weights for non-linear interactions using restricted cubic splines. Our aim is to provide an accessible method to analyze a multivariate model and identify a subset with the most representative set of variables. Furthermore, we developed a procedure treating control, fixed, free and interaction terms at the same time in the residual weight analysis. The interactions are residualized properly against their main effects to keep their true effect in the model. We test this method with two simulated examples.
【35】 Deep Learning Partial Least Squares
Authors: Nicholas Polson, Vadim Sokolov, Jianeng Xu
Affiliations: Booth School of Business, University of Chicago; Department of Systems Engineering and Operations Research, George Mason University
Link: https://arxiv.org/abs/2106.14085
Abstract: High dimensional data reduction techniques are provided by using partial least squares within deep learning. Our framework provides a nonlinear extension of PLS together with a disciplined approach to feature selection and architecture design in deep learning. This leads to a statistical interpretation of deep learning that is tailor-made for predictive problems. We can use the tools of PLS, such as the scree plot and bi-plot, to provide model diagnostics. Posterior predictive uncertainty is available using MCMC methods at the last layer. Thus we achieve the best of both worlds: scalability and fast predictive rule construction together with uncertainty quantification. Our key construct is to employ deep learning within PLS by predicting the output scores as a deep learner of the input scores. As with PLS, our X-scores are constructed using SVD and applied to both regression and classification problems, and are fast and scalable. Following Frank and Friedman (1993), we provide a Bayesian shrinkage interpretation of our nonlinear predictor. We introduce a variety of new partial least squares models: PLS-ReLU, PLS-Autoencoder, PLS-Trees and PLS-GP. To illustrate our methodology, we use simulated examples and the analysis of preferences of orange juice and predicting wine quality as a function of input characteristics. We also illustrate Brillinger's estimation procedure to provide the feature selection and data dimension reduction. Finally, we conclude with directions for future research.
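The key construct, predicting output scores as a deep learner of input scores with scores obtained from an SVD (the variant below uses the SVD of $X^\top Y$, one common PLS-SVD choice), can be sketched in a few lines; the synthetic data, network architecture and training loop are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
Y = np.tanh(X[:, :3]) @ rng.standard_normal((3, 4)) + 0.1 * rng.standard_normal((200, 4))
X -= X.mean(0); Y -= Y.mean(0)                 # center, as in PLS

k = 3
U, _, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
Xs, Ys = X @ U[:, :k], Y @ Vt[:k].T            # PLS X- and Y-scores via SVD

# Deep learner mapping input scores to output scores
net = nn.Sequential(nn.Linear(k, 16), nn.ReLU(), nn.Linear(16, Ys.shape[1]))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
xt = torch.tensor(Xs, dtype=torch.float32)
yt = torch.tensor(Ys, dtype=torch.float32)
for _ in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(xt), yt)
    loss.backward()
    opt.step()
```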
【36】 Bayesian Time-Varying Tensor Vector Autoregressive Models for Dynamic Effective Connectivity
Authors: Wei Zhang, Ivor Cribben, Sonia Petrone, Michele Guindani
Affiliations: Department of Decision Sciences, Bocconi University; Department of Accounting and Business Analytics, Alberta School of Business; Neuroscience and Mental Health Institute, University of Alberta; Department of Statistics, University of California, Irvine
Link: https://arxiv.org/abs/2106.14083
Abstract: Recent developments in functional magnetic resonance imaging (fMRI) investigate how some brain regions directly influence the activity of other regions of the brain dynamically throughout the course of an experiment, namely dynamic effective connectivity. Time-varying vector autoregressive (TV-VAR) models have been employed to draw inferences for this purpose, but they are very computationally intensive, since the number of parameters to be estimated increases quadratically with the number of time series. In this paper, we propose a computationally efficient Bayesian time-varying VAR approach for modeling high-dimensional time series. The proposed framework employs a tensor decomposition for the VAR coefficient matrices at different lags. Dynamically varying connectivity patterns are captured by assuming that at any given time only a subset of components in the tensor decomposition is active. Latent binary time series select the active components at each time via a convenient Ising prior specification. The proposed prior structure encourages sparsity in the tensor structure and allows us to ascertain model complexity through the posterior distribution. More specifically, sparsity-inducing priors are employed to allow for global-local shrinkage of the coefficients, to determine automatically the rank of the tensor decomposition and to guide the selection of the lags of the autoregression. We show the performance of our model formulation via simulation studies and data from a real fMRI study involving a book reading experiment.
【37】 A general, simple, robust method to account for measurement error when analyzing data with an internal validation subsample
Authors: Walter K. Kremers
Affiliations: Department of Quantitative Sciences, First St. SW, Rochester, MN, USA
Keywords: Measurement error, differential error, Berkson error, bias, bias correction
Link: https://arxiv.org/abs/2106.14063
Abstract: Background: Measurement errors in terms of quantification or classification frequently occur in epidemiologic data and can strongly impact inference. Measurement errors may occur when ascertaining, recording or extracting data. Although the effects of measurement errors can be severe and are well described, simple, straightforward, general analytic solutions are not readily available for statistical analysis, and measurement error is frequently not acknowledged or accounted for. Generally, to account for measurement error requires some data where we can observe the variables once with and once without error, to establish the relationship between the two. Methods: Here we describe a general method accounting for measurement error in outcome and/or predictor variables for the parametric regression setting when there is a validation subsample where variables are measured once with and once without error. The method does not describe, and thus does not depend on, the particular relation between the variables measured with and without error, and is generally robust to the type of measurement error, for example nondifferential, differential or Berkson errors. Results: Simulation studies show how the method reduces bias compared to models based upon variables measured with error alone, and reduces variance compared to models based upon the variables measured without error in the validation subsample alone. Conclusion: The proposed estimator has favorable properties in terms of bias and variance, is easily derived empirically, and is robust to different types of measurement error. This method should be a valuable tool in the analysis of data with measurement error.
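The abstract does not spell out the estimator, so the following is not the paper's method; it is a minimal regression-calibration-style sketch of the general idea it describes: use the validation subsample, where the variable is observed both with and without error, to learn a correction, then apply it to the full sample.

```python
# Illustrate attenuation from measurement error and its correction using a
# validation subsample where both the error-prone and error-free versions
# of the predictor are observed.
import numpy as np

rng = np.random.default_rng(3)
n, n_val = 2000, 300
x = rng.normal(size=n)                       # true predictor
x_err = x + 0.8 * rng.normal(size=n)         # error-prone measurement
y = 2.0 * x + rng.normal(size=n)

# Naive slope from the error-prone predictor is attenuated toward zero.
naive = np.polyfit(x_err, y, 1)[0]

# Validation subsample: learn the calibration map E[X | X_err].
idx = rng.choice(n, n_val, replace=False)
a, b = np.polyfit(x_err[idx], x[idx], 1)
x_cal = a * x_err + b

corrected = np.polyfit(x_cal, y, 1)[0]
print(f"true slope 2.0 | naive {naive:.2f} | calibrated {corrected:.2f}")
```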
【38】 The mbsts package: Multivariate Bayesian Structural Time Series Models in R
Authors: Ning Ning, Jinwen Qiu
Link: https://arxiv.org/abs/2106.14045
Abstract: The multivariate Bayesian structural time series (MBSTS) model (Qiu et al., 2018; Jammalamadaka et al., 2019), as a generalized version of many structural time series models, deals with inference and prediction for multiple correlated time series, where one also has the choice of using a different candidate pool of contemporaneous predictors for each target series. The MBSTS model has wide applications and is ideal for feature selection, time series forecasting, nowcasting, inferring causal impact, and others. This paper demonstrates how to use the R package mbsts for MBSTS modeling, establishing a bridge between user-friendly and developer-friendly functions in the package and the corresponding methodology. A simulated dataset and the object-oriented functions in the mbsts package are explained in a way that enables users to flexibly add or remove some components, as well as to simplify or complicate some settings.
【39】 Bahadur efficiencies of the Epps--Pulley test for normality
Authors: Bruno Ebner, Norbert Henze
Note: 13 pages, 2 tables. In memoriam Yakov Yu. Nikitin
Link: https://arxiv.org/abs/2106.13962
Abstract: The test for normality suggested by Epps and Pulley (1983) is a serious competitor to tests based on the empirical distribution function. In contrast to the latter procedures, it has been generalized to obtain a genuine affine invariant and universally consistent test for normality in any dimension. We obtain approximate Bahadur efficiencies for the test of Epps and Pulley, thus complementing recent results of Milošević et al. (2021). For certain values of a tuning parameter that is inherent in the Epps--Pulley test, this test outperforms each of its competitors considered in Milošević et al. (2021), over the whole range of six close alternatives to normality.
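For concreteness, a sketch of the Epps--Pulley statistic in one dimension, in the closed form obtained by expanding the weighted L2 distance between the empirical characteristic function and that of the standard normal, with a Gaussian weight governed by the tuning parameter beta. Normalization conventions vary across papers, so treat the constant factors as an assumption.

```python
# Closed-form Epps--Pulley / BHEP statistic for a univariate sample; larger
# values indicate stronger departure from normality.
import numpy as np

def epps_pulley(x: np.ndarray, beta: float = 1.0) -> float:
    n = x.size
    y = (x - x.mean()) / x.std(ddof=0)          # standardized sample
    d = y[:, None] - y[None, :]                 # pairwise differences
    term1 = np.exp(-beta**2 * d**2 / 2.0).sum() / n
    term2 = 2.0 / np.sqrt(1 + beta**2) * np.exp(
        -beta**2 * y**2 / (2.0 * (1 + beta**2))).sum()
    return term1 - term2 + n / np.sqrt(1 + 2 * beta**2)

rng = np.random.default_rng(4)
print("normal sample     :", round(epps_pulley(rng.normal(size=500)), 4))
print("exponential sample:", round(epps_pulley(rng.exponential(size=500)), 4))
```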
【40】 Functional Classwise Principal Component Analysis: A Novel Classification Framework
Authors: Avishek Chatterjee, Satyaki Mazumder, Koel Das
Affiliations: Department of Mathematics and Statistics
Link: https://arxiv.org/abs/2106.13959
Abstract: In recent times, functional data analysis (FDA) has been successfully applied in the field of high dimensional data classification. In this paper, we present a novel classification framework using functional data and classwise Principal Component Analysis (PCA). Our proposed method can be used on high dimensional time series data, which typically suffer from the small sample size problem. Our method extracts a piecewise linear functional feature space and is particularly suitable for hard classification problems. The proposed framework converts time series data into functional data and uses classwise functional PCA for feature extraction, followed by classification using a Bayesian linear classifier. We demonstrate the efficacy of our proposed method by applying it to both synthetic data sets and real time series data from diverse fields including, but not limited to, neuroscience, food science, medical sciences and chemometrics.
【41】 Optimal prediction of Markov chains with and without spectral gap
Authors: Yanjun Han, Soham Jana, Yihong Wu
Note: 52 pages
Link: https://arxiv.org/abs/2106.13947
Abstract: We study the following learning problem with dependent data: Observing a trajectory of length $n$ from a stationary Markov chain with $k$ states, the goal is to predict the next state. For $3 \leq k \leq O(\sqrt{n})$, using techniques from universal compression, the optimal prediction risk in Kullback-Leibler divergence is shown to be $\Theta(\frac{k^2}{n}\log \frac{n}{k^2})$, in contrast to the optimal rate of $\Theta(\frac{\log \log n}{n})$ for $k=2$ previously shown in Falahatgar et al. (2016). These rates, slower than the parametric rate of $O(\frac{k^2}{n})$, can be attributed to the memory in the data, as the spectral gap of the Markov chain can be arbitrarily small. To quantify the memory effect, we study irreducible reversible chains with a prescribed spectral gap. In addition to characterizing the optimal prediction risk for two states, we show that, as long as the spectral gap is not excessively small, the prediction risk in the Markov model is $O(\frac{k^2}{n})$, which coincides with that of an iid model with the same number of parameters.
【42】 Outlier-Resistant Estimators for Average Treatment Effect in Causal Inference
Authors: Kazuharu Harada, Hironori Fujisawa
Link: https://arxiv.org/abs/2106.13946
Abstract: Estimators for causal quantities sometimes suffer from outliers. We investigate outlier-resistant estimation for the average treatment effect (ATE) under challenging but realistic settings. We assume that the ratio of outliers is not necessarily small and that it can depend on covariates. We propose three types of estimators for the ATE, which combine the well-known inverse probability weighting (IPW)/doubly robust (DR) estimators with the density-power weight. Under heterogeneous contamination, our methods can reduce the bias caused by outliers. In particular, under homogeneous contamination, our estimators are approximately consistent with the true ATE. An influence-function-based analysis indicates that the adverse effect of outliers is negligible if the ratio of outliers is small, even under heterogeneous contamination. We also derive the asymptotic properties of our estimators. We evaluate the performance of our estimators through Monte Carlo simulations and real data analysis. The comparative methods, which estimate the median of the potential outcome, do not have enough outlier resistance. In the experiments, our methods outperformed the comparative methods.
【43】 Hypothesis Testing for Two Sample Comparison of Network Data
Authors: Han Feng, Xing Qiu, Hongyu Miao
Affiliations: Department of Biostatistics and Data Science, University of Texas Health Science Center at Houston; Department of Biostatistics and Computational Biology, University of Rochester
Note: 40 pages, 3 figures
Link: https://arxiv.org/abs/2106.13931
Abstract: Network data is a major object data type that has been widely collected or derived from common sources such as brain imaging. Such data contain numeric, topological, and geometrical information, and may need to be considered in certain non-Euclidean spaces for appropriate statistical analysis. The development of statistical methodologies for network data is challenging and currently at its infancy; for instance, the non-Euclidean counterpart of basic two-sample tests for network data is scarce in the literature. In this study, a novel framework is presented for two independent sample comparison of networks. Specifically, an approximation distance metric to the quotient Euclidean distance is proposed, and then combined with network spectral distance to quantify the local and global dissimilarity of networks simultaneously. A permutational non-Euclidean analysis of variance is adapted to the proposed distance metric for the comparison of two independent groups of networks. Comprehensive simulation studies and real applications are conducted to demonstrate the superior performance of our method over other alternatives. The asymptotic properties of the proposed test are investigated and its high-dimensional extension is discussed as well.
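The paper's exact metric is not reproduced here; the following is a generic, hedged sketch of the permutation-testing idea for two groups of networks, with a simple spectral distance (sorted Laplacian eigenvalues) standing in for the proposed combined metric.

```python
# Permutation two-sample test for networks: compute pairwise spectral
# distances, compare between-group vs within-group mean distance, and build
# the null distribution by permuting group labels.
import numpy as np

rng = np.random.default_rng(5)

def spectral_dist(A, B):
    lam = lambda M: np.sort(np.linalg.eigvalsh(np.diag(M.sum(1)) - M))
    return np.linalg.norm(lam(A) - lam(B))

def sample_er(n, p):                 # Erdos-Renyi adjacency matrix
    A = (rng.random((n, n)) < p).astype(float)
    A = np.triu(A, 1)
    return A + A.T

nets = [sample_er(20, 0.20) for _ in range(15)] + [sample_er(20, 0.35) for _ in range(15)]
labels = np.array([0] * 15 + [1] * 15)
D = np.array([[spectral_dist(a, b) for b in nets] for a in nets])

def stat(lab):                       # between-group minus within-group mean
    between = D[np.ix_(lab == 0, lab == 1)].mean()
    within = (D[np.ix_(lab == 0, lab == 0)].mean()
              + D[np.ix_(lab == 1, lab == 1)].mean()) / 2  # diagonal zeros kept; fine for a sketch
    return between - within

obs = stat(labels)
null = [stat(rng.permutation(labels)) for _ in range(999)]
print("p-value ~", (1 + sum(s >= obs for s in null)) / 1000)
```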
【44】 Extending the Patra-Sen Approach to Estimating the Background Component in a Two-Component Mixture Model
Authors: Ery Arias-Castro, He Jiang
Affiliations: Department of Mathematics, University of California
Note: 34 pages, 11 figures, 17 tables
Link: https://arxiv.org/abs/2106.13925
Abstract: Patra and Sen (2016) consider a two-component mixture model, where one component plays the role of background while the other plays the role of signal, and propose to estimate the background component by simply "maximizing" its weight. While in their work the background component is a completely known distribution, we extend their approach here to three emblematic settings: when the background distribution is symmetric; when it is monotonic; and when it is log-concave. In each setting, we derive estimators for the background component, establish consistency, and provide a confidence band. While the estimation of a background component is straightforward when it is taken to be symmetric or monotonic, when it is log-concave its estimation requires the computation of a largest concave minorant, which we implement using sequential quadratic programming. Compared to existing methods, our method has the advantage of requiring much less prior knowledge on the background component, and is thus less prone to model misspecification. We illustrate this methodology on a number of synthetic and real datasets.
【45】 Statistical Methods for the meta-analysis paper by Itzhaky et al
Authors: Steven P. Ellis
Note: 37 pages, 3 figures
Link: https://arxiv.org/abs/2106.13874
Abstract: This document describes the statistical methods used in Itzhaky et al ("Systematic Review and Meta-analysis: Twenty-six Years of Randomized Clinical Trials of Psychosocial Interventions to Reduce Suicide Risk in Adolescents"). That paper is a meta-analysis of randomized controlled clinical trials testing methods for preventing suicidal behavior and/or ideation in youth. Particularly on the behavior side, the meta-data are challenging to analyze. This document has two parts. The first is an informal discussion of the statistical methods used. The second gives detailed mathematical derivations of some formulas and methods.
【46】 On assessing excess mortality in Germany during the COVID-19 pandemic
Authors: Giacomo De Nicola, Göran Kauermann, Michael Höhle
Affiliations: Department of Statistics, LMU Munich, Germany; Department of Mathematics, University of Stockholm, Sweden
Link: https://arxiv.org/abs/2106.13827
Abstract: Coronavirus disease 2019 (COVID-19) is associated with a very high number of casualties in the general population. Assessing the exact magnitude of this number is a non-trivial problem, as relying only on officially reported COVID-19 associated fatalities runs the risk of incurring several kinds of biases. One way to approach the issue is to compare overall mortality during the pandemic with expected mortality computed using the observed mortality figures of previous years. In this paper, we build on existing methodology and propose two ways to compute expected as well as excess mortality, namely at the weekly and at the yearly level. Particular focus is put on the role of age, which plays a central part in both COVID-19-associated and overall mortality. We illustrate our methods by making use of age-stratified mortality data from the years 2016 to 2020 in Germany to compute age group-specific excess mortality during the COVID-19 pandemic in 2020.
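A minimal sketch of the yearly-level calculation on made-up numbers (not German data): expected deaths per age group as the average over the preceding years, and excess as observed minus expected. The paper's actual approach additionally accounts for age structure and population changes, which this sketch omits.

```python
# Yearly excess mortality per age group: excess = observed - expected,
# with expected taken as the 2016-2019 average. All figures are invented.
import numpy as np

age_groups = ["0-59", "60-79", "80+"]
deaths_2016_2019 = np.array([          # rows: years 2016..2019 (hypothetical)
    [98_000, 310_000, 510_000],
    [99_500, 312_000, 520_000],
    [97_800, 315_000, 525_000],
    [98_900, 318_000, 530_000],
], dtype=float)
deaths_2020 = np.array([99_000, 330_000, 565_000], dtype=float)

expected = deaths_2016_2019.mean(axis=0)
excess = deaths_2020 - expected
for g, e, x in zip(age_groups, expected, excess):
    print(f"{g:>5}: expected {e:9.0f}  excess {x:+9.0f}")
print("total excess:", f"{excess.sum():+.0f}")
```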
【47】 Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft
Authors: Ingmar Kanitscheider, Joost Huizinga, David Farhi, William Hebgen Guss, Brandon Houghton, Raul Sampedro, Peter Zhokhov, Bowen Baker, Adrien Ecoffet, Jie Tang, Oleg Klimov, Jeff Clune
Affiliations: OpenAI
Note: first submission
Link: https://arxiv.org/abs/2106.14876
Abstract: An important challenge in reinforcement learning is training agents that can solve a wide variety of tasks. If tasks depend on each other (e.g. needing to learn to walk before learning to run), curriculum learning can speed up learning by focusing on the next best task to learn. We explore curriculum learning in a complex, visual domain with many hard exploration challenges: Minecraft. We find that learning progress (defined as a change in success probability of a task) is a reliable measure of learnability for automatically constructing an effective curriculum. We introduce a learning-progress based curriculum and test it on a complex reinforcement learning problem (called "Simon Says") where an agent is instructed to obtain a desired goal item. Many of the required skills depend on each other. Experiments demonstrate that: (1) a within-episode exploration bonus for obtaining new items improves performance, (2) dynamically adjusting this bonus across training such that it only applies to items the agent cannot reliably obtain yet further increases performance, (3) the learning-progress based curriculum elegantly follows the learning curve of the agent, and (4) when the learning-progress based curriculum is combined with the dynamic exploration bonus it learns much more efficiently and obtains far higher performance than uniform baselines. These results suggest that combining intra-episode and across-training exploration bonuses with learning progress creates a promising method for automated curriculum generation, which may substantially increase our ability to train more capable, generally intelligent agents.
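A conceptual sketch of a learning-progress curriculum, under stated assumptions: each task's success probability is tracked with a fast and a slow exponential moving average, and tasks are sampled in proportion to |fast - slow|, i.e. to recent progress. The EMA rates, the sampling rule, and the toy "agent" are illustrative, not the paper's exact design.

```python
# Learning-progress curriculum on a toy multi-task problem: practicing a task
# raises its (latent) skill, and the sampler favours tasks whose success
# probability is currently changing.
import numpy as np

rng = np.random.default_rng(6)
n_tasks, fast_a, slow_a = 5, 0.1, 0.01
fast, slow = np.zeros(n_tasks), np.zeros(n_tasks)
skill = np.zeros(n_tasks)

def sample_task():
    progress = np.abs(fast - slow) + 1e-3        # small floor keeps exploration
    return rng.choice(n_tasks, p=progress / progress.sum())

for step in range(5000):
    t = sample_task()
    p_success = 1.0 / (1.0 + np.exp(-(skill[t] - 3.0)))
    success = rng.random() < p_success
    skill[t] += 0.01                             # practicing a task improves it
    fast[t] += fast_a * (success - fast[t])      # fast EMA of success
    slow[t] += slow_a * (success - slow[t])      # slow EMA of success

print("success-probability estimates:", np.round(fast, 2))
```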
【48】 Gaussian Process Regression for Active Sensing Probabilistic Structural Health Monitoring: Experimental Assessment Across Multiple Damage and Loading Scenarios
Authors: Ahmad Amer, Fotis Kopsaftopoulos
Affiliations: Intelligent Structural Systems Laboratory (ISSL), Department of Mechanical, Aerospace and Nuclear Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
Note: 47 pages, 34 figures
Link: https://arxiv.org/abs/2106.14841
Abstract: In the near future, Structural Health Monitoring (SHM) technologies will be capable of overcoming the drawbacks in the current maintenance and life-cycle management paradigms, namely: cost, increased downtime, less-than-optimal safety management paradigm and the limited applicability of fully-autonomous operations. In the context of SHM, one of the most challenging tasks is damage quantification. Current methods face accuracy and/or robustness issues when it comes to varying operating and environmental conditions. In addition, the damage/no-damage paradigm of current frameworks does not offer much information to maintainers on the ground for proper decision-making. In this study, a novel structural damage quantification framework is proposed based on widely-used Damage Indices (DIs) and Gaussian Process Regression Models (GPRMs). The novelty lies in calculating the probability of an incoming test DI point originating from a specific state, which allows for probability-educated decision-making. This framework is applied to three test cases: a Carbon Fiber-Reinforced Plastic (CFRP) coupon with attached weights as simulated damage, an aluminum coupon with a notch, and an aluminum coupon with attached weights as simulated damage under varying loading states. The state prediction method presented herein is applied to single-state quantification in the first two test cases, as well as the third one assuming the loading state is known. Finally, the proposed method is applied to the third test case assuming neither the damage size nor the load is known, in order to predict both simultaneously from incoming DI test points. In applying this framework, two forms of GPRMs (standard and variational heteroscedastic) are used in order to critically assess their performances with respect to the three test cases.
【49】 Understanding Dynamics of Nonlinear Representation Learning and Its Application
Authors: Kenji Kawaguchi, Linjun Zhang, Zhun Deng
Affiliations: Harvard University, Rutgers University
Link: https://arxiv.org/abs/2106.14836
Abstract: Representations of the world environment play a crucial role in machine intelligence. It is often inefficient to conduct reasoning and inference directly in the space of raw sensory representations, such as pixel values of images. Representation learning allows us to automatically discover suitable representations from raw sensory data. For example, given raw sensory data, a multilayer perceptron learns nonlinear representations at its hidden layers, which are subsequently used for classification (or regression) at its output layer. This happens implicitly during training through minimizing a supervised or unsupervised loss. In this paper, we study the dynamics of such implicit nonlinear representation learning. We identify a pair of a new assumption and a novel condition, called the common model structure assumption and the data-architecture alignment condition. Under the common model structure assumption, the data-architecture alignment condition is shown to be sufficient for the global convergence and necessary for the global optimality. Our results provide practical guidance for designing a model structure: e.g., the common model structure assumption can be used as a justification for using a particular model structure instead of others. As an application, we then derive a new training framework, which satisfies the data-architecture alignment condition without assuming it, by automatically modifying any given training algorithm dependently on each data and architecture. Given a standard training algorithm, the framework running its modified version is empirically shown to maintain competitive (practical) test performances while providing global convergence guarantees for ResNet-18 with convolutions, skip connections, and batch normalization with standard benchmark datasets, including MNIST, CIFAR-10, CIFAR-100, Semeion, KMNIST and SVHN.
【50】 Central Limit Theorem for product of dependent random variables
Authors: JunTao Duan, Ionel Popescu, Fan Zhou
Link: https://arxiv.org/abs/2106.14825
Abstract: Let $\{X_k\}$ be a martingale difference sequence, and let $\{Y_k\}$ be another sequence with dependence within it. Assuming $\{X_k\}$ is independent of $\{Y_k\}$, we study the properties of the sums of products of the two sequences, $\sum_{k=1}^{n} X_k Y_k$. We obtain a product-CLT, a modification of the classical central limit theorem, which can be useful in the study of random projections. We also obtain a rate of convergence analogous to the Berry-Esseen theorem in the classical CLT.
【51】 Laplace Redux -- Effortless Bayesian Deep Learning
Authors: Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, Philipp Hennig
Affiliations: University of Cambridge; MPI for Intelligent Systems, Tübingen; University of Tübingen; ETH Zurich; Max Planck ETH CLS; DeepMind, London
Note: Source Code: this https URL; Library Documentation: this https URL
Link: https://arxiv.org/abs/2106.14806
Abstract: Bayesian formulations of deep learning have been shown to have compelling theoretical properties and offer practical functional benefits, such as improved predictive uncertainty quantification and model selection. The Laplace approximation (LA) is a classic, and arguably the simplest, family of approximations for the intractable posteriors of deep neural networks. Yet, despite its simplicity, the LA is not as popular as alternatives like variational Bayes or deep ensembles. This may be due to assumptions that the LA is expensive due to the involved Hessian computation, that it is difficult to implement, or that it yields inferior results. In this work we show that these are misconceptions: we (i) review the range of variants of the LA including versions with minimal cost overhead; (ii) introduce "laplace", an easy-to-use software library for PyTorch offering user-friendly access to all major flavors of the LA; and (iii) demonstrate through extensive experiments that the LA is competitive with more popular alternatives in terms of performance, while excelling in terms of computational cost. We hope that this work will serve as a catalyst to a wider adoption of the LA in practical deep learning, including in domains where Bayesian approaches are not typically considered at the moment.
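The usage pattern below follows the library's documented API as best recalled from the paper and repository (post-hoc last-layer Laplace with a Kronecker-factored Hessian); argument names should be checked against the laplace-torch documentation before relying on them.

```python
# Ordinary MAP training of a small PyTorch classifier, then a post-hoc
# Laplace approximation via the "laplace" library described in the abstract.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from laplace import Laplace

X = torch.randn(256, 2)
y = (X[:, 0] + X[:, 1] > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=32)

model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(50):                              # standard MAP training first
    for xb, yb in loader:
        opt.zero_grad()
        nn.functional.cross_entropy(model(xb), yb).backward()
        opt.step()

# Post-hoc Laplace on the last layer with a Kronecker-factored Hessian.
la = Laplace(model, "classification",
             subset_of_weights="last_layer", hessian_structure="kron")
la.fit(loader)
la.optimize_prior_precision(method="marglik")    # marginal-likelihood tuning
probs = la(X[:8], link_approx="probit")          # Bayesian predictive probabilities
print(probs.shape)
```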
【52】 On Locality of Local Explanation Models
Authors: Sahra Ghalebikesabi, Lucile Ter-Minassian, Karla Diaz-Ordaz, Chris Holmes
Affiliations: The London School of Hygiene & Tropical Medicine & The Alan Turing Institute; University of Oxford & The Alan Turing Institute
Note: Submitted to NeurIPS 2021
Link: https://arxiv.org/abs/2106.14648
Abstract: Shapley values provide model agnostic feature attributions for model outcome at a particular instance by simulating feature absence under a global population distribution. The use of a global population can lead to potentially misleading results when local model behaviour is of interest. Hence we consider the formulation of neighbourhood reference distributions that improve the local interpretability of Shapley values. By doing so, we find that the Nadaraya-Watson estimator, a well-studied kernel regressor, can be expressed as a self-normalised importance sampling estimator. Empirically, we observe that Neighbourhood Shapley values identify meaningful sparse feature relevance attributions that provide insight into local model behaviour, complementing conventional Shapley analysis. They also increase on-manifold explainability and robustness to the construction of adversarial classifiers.
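A small sketch of the observation highlighted in the abstract: the Nadaraya-Watson estimator is a self-normalised importance-sampling estimator, with kernel evaluations acting as unnormalised importance weights.

```python
# Nadaraya-Watson regression at a query point x0: a weighted average of the
# observed responses, with Gaussian-kernel weights normalised by their sum.
import numpy as np

rng = np.random.default_rng(7)
x_i = rng.uniform(-3, 3, size=400)
f_i = np.sin(x_i) + 0.1 * rng.normal(size=400)   # noisy responses

def nadaraya_watson(x0, h=0.3):
    w = np.exp(-0.5 * ((x0 - x_i) / h) ** 2)     # unnormalised weights
    return (w * f_i).sum() / w.sum()             # self-normalised average

print("estimate at 1.0:", round(nadaraya_watson(1.0), 3),
      "| sin(1.0):", round(np.sin(1.0), 3))
```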
【53】 The Convergence Rate of SGD's Final Iterate: Analysis on Dimension Dependence
Authors: Daogao Liu, Zhou Lu
Affiliations: University of Washington, Princeton University
Link: https://arxiv.org/abs/2106.14588
Abstract: Stochastic Gradient Descent (SGD) is among the simplest and most popular methods in optimization. The convergence rate for SGD has been extensively studied and tight analyses have been established for the running-average scheme, but the sub-optimality of the final iterate is still not well-understood. Shamir (2013) gave the best known upper bound for the final iterate of SGD minimizing non-smooth convex functions, which is $O(\log T/\sqrt{T})$ for Lipschitz convex functions and $O(\log T/ T)$ with an additional assumption of strong convexity. The best known lower bounds, however, are worse than the upper bounds by a factor of $\log T$. Harvey et al. (2019) gave matching lower bounds, but their construction requires dimension $d = T$. Koren and Segal (2020) then asked how to characterize the final-iterate convergence of SGD in the constant-dimension setting. In this paper, we answer this question in the more general setting for any $d \leq T$, proving $\Omega(\log d/\sqrt{T})$ and $\Omega(\log d/T)$ lower bounds for the sub-optimality of the final iterate of SGD in minimizing non-smooth Lipschitz convex and strongly convex functions, respectively, with standard step size schedules. Our results provide the first general dimension-dependent lower bound on the convergence of SGD's final iterate, partially resolving the COLT open question raised by Koren and Segal (2020). We also present further evidence that the correct rate in one dimension should be $\Theta(1/\sqrt{T})$, such as a proof of a tight $O(1/\sqrt{T})$ upper bound for one-dimensional special cases in settings more general than theirs.
【54】 Systematic evaluation of variability detection methods for eROSITA
Authors: Johannes Buchner, Thomas Boller, David Bogensberger, Adam Malyali, Kirpal Nandra, Joern Wilms, Tom Dwelly, Teng Liu
Affiliations: Max Planck Institute for Extraterrestrial Physics, Giessenbachstrasse, Garching, Germany; Dr. Karl Remeis-Observatory and Erlangen Centre for Astroparticle Physics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Sternwartstr., Bamberg, Germany
Note: Resubmitted version after a positive first referee report. Variability analysis tools available: this https URL. 15 min talk: this https URL. To appear in A&A, Special Issue: The Early Data Release of eROSITA and Mikhail Pavlinsky ART-XC on the SRG Mission
Link: https://arxiv.org/abs/2106.14529
Abstract: The reliability of detecting source variability in sparsely and irregularly sampled X-ray light curves is investigated. This is motivated by the unprecedented survey capabilities of eROSITA onboard SRG, providing light curves for many thousand sources in its final-depth equatorial deep field survey. Four methods for detecting variability are evaluated: excess variance, amplitude maximum deviations, Bayesian blocks and a new Bayesian formulation of the excess variance. We judge the false detection rate of variability based on simulated Poisson light curves of constant sources, and calibrate significance thresholds. Simulations with flares injected favour the amplitude maximum deviation as most sensitive at low false detections. Simulations with white and red stochastic source variability favour Bayesian methods. The results are applicable also for the million sources expected in eROSITA's all-sky survey.
【55】 Poisoning the Search Space in Neural Architecture Search
Authors: Robert Wu, Nayan Saxena, Rohan Jain
Affiliations: Department of Computer Science, University of Toronto; Department of Statistical Sciences; Departments of Computer Science & Mathematics
Note: All authors contributed equally. Appears in AdvML Workshop @ ICML 2021: A Blessing in Disguise: The Prospects and Perils of Adversarial Machine Learning
Link: https://arxiv.org/abs/2106.14406
Abstract: Deep learning has proven to be a highly effective problem-solving tool for object detection and image segmentation across various domains such as healthcare and autonomous driving. At the heart of this performance lies neural architecture design, which relies heavily on domain knowledge and prior experience on the researchers' behalf. More recently, this process of finding the most optimal architectures, given an initial search space of possible operations, was automated by Neural Architecture Search (NAS). In this paper, we evaluate the robustness of one such algorithm known as Efficient NAS (ENAS) against data agnostic poisoning attacks on the original search space with carefully designed ineffective operations. By evaluating algorithm performance on the CIFAR-10 dataset, we empirically demonstrate how our novel search space poisoning (SSP) approach and multiple-instance poisoning attacks exploit design flaws in the ENAS controller to result in inflated prediction error rates for child networks. Our results provide insights into the challenges to surmount in using NAS for more adversarially robust architecture search.
【56】 Armoured Fighting Vehicle Team Performance Prediction against Missile Attacks with Directed Energy Weapons
Authors: Graham V. Weinberg, Mitchell Kracman
Link: https://arxiv.org/abs/2106.14381
Abstract: A recent study has introduced a procedure to quantify the survivability of a team of armoured fighting vehicles when it is subjected to a single missile attack. In particular, this study investigated the concept of collaborative active protection systems, focusing on the case where vehicle defence is provided by high power radio frequency directed energy weapons. The purpose of the current paper is to demonstrate how this analysis can be extended to account for more than one missile threat. This is achieved by introducing a jump stochastic process whose states represent the number of missiles defeated at a given time instant. Analysis proceeds through consideration of the sojourn times of this stochastic process, and it is shown how consideration of these jump times can be related to transition probabilities of the auxiliary stochastic process. The latter probabilities are then related to the probabilities of detection and disruption of missile threats. The sum of these sojourn times can then be used to quantify the survivability of the team at any given time instant. Due to the fact that there is much interest in the application of high energy lasers in the context of this paper, the numerical examples focus on such directed energy weapons for armoured fighting vehicle team defence.
【57】 High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
Authors: Ashok Cutkosky, Harsh Mehta
Affiliations: Boston University, Google Research
Link: https://arxiv.org/abs/2106.14343
Abstract: We consider non-convex stochastic optimization using first-order algorithms for which the gradient estimates may have heavy tails. We show that a combination of gradient clipping, momentum, and normalized gradient descent yields convergence to critical points in high probability with best-known rates for smooth losses when the gradients only have bounded $\mathfrak{p}$th moments for some $\mathfrak{p}\in(1,2]$. We then consider the case of second-order smooth losses, which to our knowledge have not been studied in this setting, and again obtain high-probability bounds for any $\mathfrak{p}$. Moreover, our results hold for arbitrary smooth norms, in contrast to the typical SGD analysis which requires a Hilbert space norm. Further, we show that after a suitable "burn-in" period, the objective value will monotonically decrease for every iteration until a critical point is identified, which provides intuition behind the popular practice of learning rate "warm-up" and also yields a last-iterate guarantee.
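The update named in the abstract -- clip the gradient, apply momentum, take a normalized step -- is simple enough to sketch directly; the step-size schedule, clip level, and toy objective below are illustrative assumptions, not the paper's analyzed settings.

```python
# Clipped + momentum + normalized gradient descent on a toy quadratic with
# heavy-tailed gradient noise (Student-t with df=1.5, so only moments of
# order < 1.5 exist, matching the p in (1, 2] regime).
import numpy as np

rng = np.random.default_rng(8)
x, m = np.array([5.0, -5.0]), np.zeros(2)
beta, clip = 0.9, 1.0

for t in range(2000):
    g = 2 * x + rng.standard_t(df=1.5, size=2)            # heavy-tailed noise
    g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # gradient clipping
    m = beta * m + (1 - beta) * g                         # momentum
    eta = 0.5 / np.sqrt(t + 1)                            # decaying step size
    x = x - eta * m / (np.linalg.norm(m) + 1e-12)         # normalized step

print("final distance to critical point at 0:", round(np.linalg.norm(x), 3))
```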
【58】 Stabilizing Equilibrium Models by Jacobian Regularization
Authors: Shaojie Bai, Vladlen Koltun, J. Zico Kolter
Affiliations: Carnegie Mellon University
Note: ICML 2021 Short Oral
Link: https://arxiv.org/abs/2106.14342
Abstract: Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer. These models have been shown to achieve performance competitive with the state-of-the-art deep networks while using significantly less memory. Yet they are also slower, brittle to architectural choices, and introduce potential instability to the model. In this paper, we propose a regularization scheme for DEQ models that explicitly regularizes the Jacobian of the fixed-point update equations to stabilize the learning of equilibrium models. We show that this regularization adds only minimal computational cost, significantly stabilizes the fixed-point convergence in both forward and backward passes, and scales well to high-dimensional, realistic domains (e.g., WikiText-103 language modeling and ImageNet classification). Using this method, we demonstrate, for the first time, an implicit-depth model that runs with approximately the same speed and level of performance as popular conventional deep networks such as ResNet-101, while still maintaining the constant memory footprint and architectural simplicity of DEQs. Code is available at https://github.com/locuslab/deq.
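A hedged sketch of the core regularizer on a toy layer (not a full DEQ with implicit differentiation): the Frobenius norm of the layer's Jacobian at the fixed point is estimated with a single Hutchinson probe, i.e. one vector-Jacobian product, and added to the loss.

```python
# Estimate ||J_f(z*)||_F^2 via E_v ||v^T J||^2 with one random probe v, then
# penalize it alongside a toy task loss. Layer and weights are stand-ins.
import torch

torch.manual_seed(0)
W = torch.nn.Parameter(0.1 * torch.randn(8, 8))
x = torch.randn(4, 8)                      # input injected into the layer

def f(z):                                  # the layer whose fixed point a DEQ seeks
    return torch.tanh(z @ W.T + x)

z = torch.zeros(4, 8)
for _ in range(30):                        # crude forward fixed-point iteration
    z = f(z)
z = z.detach().requires_grad_(True)        # linearize around the equilibrium

v = torch.randn_like(z)                    # Hutchinson probe
(vjp,) = torch.autograd.grad(f(z), z, grad_outputs=v, create_graph=True)
jac_reg = vjp.pow(2).sum() / z.shape[0]    # per-sample average of ||v^T J||^2
loss = f(z).pow(2).mean() + 1.0 * jac_reg  # toy task loss + Jacobian penalty
loss.backward()                            # gradients reach W through both terms
print("Jacobian penalty estimate:", float(jac_reg))
```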
【59】 Regret Analysis in Deterministic Reinforcement Learning
Authors: Damianos Tranos, Alexandre Proutiere
Affiliations: School of Electrical Engineering and Computer Science
Link: https://arxiv.org/abs/2106.14338
Abstract: We consider Markov Decision Processes (MDPs) with deterministic transitions and study the problem of regret minimization, which is central to the analysis and design of optimal learning algorithms. We present logarithmic problem-specific regret lower bounds that explicitly depend on the system parameter (in contrast to previous minimax approaches) and thus truly quantify the fundamental limit of performance achievable by any learning algorithm. Deterministic MDPs can be interpreted as graphs and analyzed in terms of their cycles, a fact which we leverage in order to identify a class of deterministic MDPs whose regret lower bound can be determined numerically. We further exemplify this result on a deterministic line search problem, and a deterministic MDP with state-dependent rewards, whose regret lower bounds we can state explicitly. These bounds share similarities with the known problem-specific bound of the multi-armed bandit problem and suggest that navigation on a deterministic MDP need not have an effect on the performance of a learning algorithm.
【60】 Learning stochastic object models from medical imaging measurements by use of advanced AmbientGANs
Authors: Weimin Zhou, Sayantan Bhadra, Frank J. Brooks, Hua Li, Mark A. Anastasio
Note: Submitted to IEEE Transactions on Medical Imaging. arXiv admin note: substantial text overlap with arXiv:2006.00033
Link: https://arxiv.org/abs/2106.14324
Abstract: In order to objectively assess new medical imaging technologies via computer-simulations, it is important to account for all sources of variability that contribute to image data. One important source of variability that can significantly limit observer performance is associated with the variability in the ensemble of objects to-be-imaged. This source of variability can be described by stochastic object models (SOMs), which are generative models that can be employed to sample from a distribution of to-be-virtually-imaged objects. It is generally desirable to establish SOMs from experimental imaging measurements acquired by use of a well-characterized imaging system, but this task has remained challenging. Deep generative neural networks, such as generative adversarial networks (GANs), hold potential for such tasks. To establish SOMs from imaging measurements, an AmbientGAN has been proposed that augments a GAN with a measurement operator. However, the original AmbientGAN could not immediately benefit from modern training procedures and GAN architectures, which limited its ability to be applied to realistically sized medical image data. To circumvent this, in this work, a modified AmbientGAN training strategy is proposed that is suitable for modern progressive or multi-resolution training approaches such as employed in the Progressive Growing of GANs and Style-based GANs. AmbientGANs established by use of the proposed training procedure are systematically validated in a controlled way by use of computer-simulated measurement data corresponding to a stylized imaging system. Finally, emulated single-coil experimental magnetic resonance imaging data are employed to demonstrate the methods under less stylized conditions.
【61】 Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
Authors: Tian Ye, Simon S. Du
Affiliations: Institute for Interdisciplinary Information Sciences, Tsinghua University; Paul G. Allen School of Computer Science and Engineering, University of Washington
Link: https://arxiv.org/abs/2106.14289
Abstract: We study the asymmetric low-rank factorization problem: \[\min_{\mathbf{U} \in \mathbb{R}^{m \times d}, \mathbf{V} \in \mathbb{R}^{n \times d}} \frac{1}{2}\|\mathbf{U}\mathbf{V}^\top -\mathbf{\Sigma}\|_F^2\] where $\mathbf{\Sigma}$ is a given matrix of size $m \times n$ and rank $d$. This is a canonical problem that admits two difficulties in optimization: 1) non-convexity and 2) non-smoothness (due to unbalancedness of $\mathbf{U}$ and $\mathbf{V}$). This is also a prototype for more complex problems such as asymmetric matrix sensing and matrix completion. Despite being non-convex and non-smooth, it has been observed empirically that the randomly initialized gradient descent algorithm can solve this problem in polynomial time. Existing theories to explain this phenomenon all require artificial modifications of the algorithm, such as adding noise in each iteration and adding a balancing regularizer to balance the $\mathbf{U}$ and $\mathbf{V}$. This paper presents the first proof showing that randomly initialized gradient descent converges to a global minimum of the asymmetric low-rank factorization problem at a polynomial rate. For the proof, we develop 1) a new symmetrization technique to capture the magnitudes of the symmetry and asymmetry, and 2) a quantitative perturbation analysis to approximate matrix derivatives. We believe both are useful for other related non-convex problems.
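The algorithm analyzed in the abstract is plain gradient descent from small random initialization, with no balancing regularizer and no added noise, so it can be run directly; the step size and iteration count below are illustrative.

```python
# Gradient descent on 0.5 * ||U V^T - Sigma||_F^2 from small random init.
# Gradients: dU = (U V^T - Sigma) V,  dV = (U V^T - Sigma)^T U.
import numpy as np

rng = np.random.default_rng(9)
m, n, d = 30, 20, 3
Sigma = rng.normal(size=(m, d)) @ rng.normal(size=(d, n))   # rank-d target

U = 1e-3 * rng.normal(size=(m, d))
V = 1e-3 * rng.normal(size=(n, d))
eta = 0.002
for t in range(20000):
    R = U @ V.T - Sigma                                     # residual
    U, V = U - eta * R @ V, V - eta * R.T @ U               # simultaneous update
print("final loss:", 0.5 * np.linalg.norm(U @ V.T - Sigma) ** 2)
```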
【62】 How many moments does MMD compare?
Authors: Rustem Takhanov
Affiliations: School of Sciences and Humanities
Link: https://arxiv.org/abs/2106.14277
Abstract: We present a new way of studying Mercer kernels, by associating with a special kernel $K$ a pseudo-differential operator $p({\mathbf x}, D)$ such that $\mathcal{F} p({\mathbf x}, D)^\dag p({\mathbf x}, D) \mathcal{F}^{-1}$ acts on smooth functions in the same way as an integral operator associated with $K$ (where $\mathcal{F}$ is the Fourier transform). We show that kernels defined by pseudo-differential operators are able to approximate uniformly any continuous Mercer kernel on a compact set. The symbol $p({\mathbf x}, {\mathbf y})$ encapsulates a lot of useful information about the structure of the Maximum Mean Discrepancy (MMD) distance defined by the kernel $K$. We approximate $p({\mathbf x}, {\mathbf y})$ with the sum of the first $r$ terms of the Singular Value Decomposition of $p$, denoted by $p_r({\mathbf x}, {\mathbf y})$. If the ordered singular values of the integral operator associated with $p({\mathbf x}, {\mathbf y})$ die down rapidly, the MMD distance defined by the new symbol $p_r$ differs from the initial one only slightly. Moreover, the new MMD distance can be interpreted as an aggregated result of comparing $r$ local moments of two probability distributions. The latter result holds under the condition that the right singular vectors of the integral operator associated with $p$ are uniformly bounded. But even if this is not satisfied, we can still show that the Hilbert-Schmidt distance between $p$ and $p_r$ vanishes. Thus, we report an interesting phenomenon: the MMD distance measures the difference of two probability distributions with respect to a certain number of local moments, $r^\ast$, and this number $r^\ast$ depends on the speed with which the singular values of $p$ die down.
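For reference, a standard unbiased estimator of the squared MMD with a Gaussian kernel -- the distance whose local-moment interpretation the abstract develops (the spectral analysis itself is not reproduced here).

```python
# Unbiased estimator of MMD^2 between samples x and y under an RBF kernel:
# diagonal terms are excluded from the within-sample sums.
import numpy as np

def mmd2_unbiased(x, y, gamma=0.5):
    k = lambda a, b: np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)
    Kxx, Kyy, Kxy = k(x, x), k(y, y), k(x, y)
    n, m = len(x), len(y)
    np.fill_diagonal(Kxx, 0.0)
    np.fill_diagonal(Kyy, 0.0)
    return Kxx.sum() / (n * (n - 1)) + Kyy.sum() / (m * (m - 1)) - 2 * Kxy.mean()

rng = np.random.default_rng(10)
same = mmd2_unbiased(rng.normal(size=300), rng.normal(size=300))
diff = mmd2_unbiased(rng.normal(size=300), rng.normal(1.0, 1.0, size=300))
print(f"same distribution: {same:+.4f}   shifted: {diff:+.4f}")
```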
【63】 Fast and stable modification of the Gauss-Newton method for low-rank signal estimation
Authors: Nikita Zvonarev, Nina Golyandina
Affiliations: St. Petersburg State University, St. Petersburg, Russia
Note: arXiv admin note: text overlap with arXiv:2101.09779, arXiv:1803.01419
Link: https://arxiv.org/abs/2106.14215
Abstract: The weighted nonlinear least-squares problem for low-rank signal estimation is considered. The problem of constructing a numerical solution that is stable and fast for long time series is addressed. A modified weighted Gauss-Newton method, which can be implemented through the direct variable projection onto a space of low-rank signals, is proposed. For a weight matrix which provides the maximum likelihood estimator of the signal in the presence of autoregressive noise of order $p$, the computational cost of iterations is $O(N r^2 + N p^2 + r N \log N)$ as $N$ tends to infinity, where $N$ is the time-series length and $r$ is the rank of the approximating time series. Moreover, the proposed method can be applied to data with missing values, without increasing the computational cost. The method is compared with state-of-the-art methods based on the variable projection approach in terms of floating-point numerical stability and computational cost.
【64】 Nonparametric estimation of continuous DPPs with kernel methods
Authors: Michaël Fanuel, Rémi Bardenet
Note: 23 pages
Link: https://arxiv.org/abs/2106.14210
Abstract: Determinantal Point Processes (DPPs) are statistical models for repulsive point patterns. Both sampling and inference are tractable for DPPs, a rare feature among models with negative dependence that explains their popularity in machine learning and spatial statistics. Parametric and nonparametric inference methods have been proposed in the finite case, i.e. when the point patterns live in a finite ground set. In the continuous case, only parametric methods have been investigated, while nonparametric maximum likelihood for DPPs -- an optimization problem over trace-class operators -- has remained an open question. In this paper, we show that a restricted version of this maximum likelihood (MLE) problem falls within the scope of a recent representer theorem for nonnegative functions in an RKHS. This leads to a finite-dimensional problem, with strong statistical ties to the original MLE. Moreover, we propose, analyze, and demonstrate a fixed point algorithm to solve this finite-dimensional problem. Finally, we also provide a controlled estimate of the correlation kernel of the DPP, thus providing more interpretability.
【65】 On Hyperspectral Unmixing
Authors: Wing-Kin Ma
Affiliations: Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong SAR of China
Note: to appear in IGARSS 2021, Special Session on "The Contributions of José Manuel Bioucas-Dias to Remote Sensing Data Processing"
Link: https://arxiv.org/abs/2106.14177
Abstract: In this article the author reviews José Bioucas-Dias' key contributions to hyperspectral unmixing (HU), in memory of him as an influential scholar and for his many beautiful ideas introduced to the hyperspectral community. Our story will start with vertex component analysis (VCA) -- one of the most celebrated HU algorithms, with more than 2,000 Google Scholar citations. VCA was pioneering, invented at a time when HU research just began to emerge, and it shows sharp insights on a then less-understood subject. Then we will turn to SISAL, another widely-used algorithm. SISAL is not only a highly successful algorithm, it is also a demonstration of its inventor's ingenuity on applied optimization and on smart formulation for practical noisy cases. Our tour will end with dependent component analysis (DECA), perhaps a less well-known contribution. DECA adopts a statistical inference framework, and the author's latest research indicates that such a framework has great potential for further development, e.g., there are hidden connections between SISAL and DECA. The development of DECA shows foresight years ahead, in that regard.
【66】 Stochastic Parametrization using Compressed Sensing: Application to the Lorenz-96 Atmospheric Model
Authors: Amartya Mukherjee, Yusuf Aydogdu, Thambirajah Ravichandran, Navaratnam Sri Namachchivaya
Affiliations: Department of Applied Mathematics, University of Waterloo
Note: 21 pages, 8 figures
Link: https://arxiv.org/abs/2106.14110
Abstract: A growing set of optimization and regression techniques based upon sparse representations of signals for building models from data sets has received widespread attention recently with the advent of compressed sensing. In this paper, sparse approximations in high-dimensional spaces are used to build models (vector fields) to emulate the behavior of the fine-scale process, so that explicit simulations become an online benchmark for parameterization. Observations are assimilated during the integration of the low-dimensional built model to provide predictions. We outline how the parameterization schemes developed here and the low-dimensional filtering algorithm can be applied to the Lorenz-96 atmospheric model, which mimics mid-latitude atmospheric dynamics with microscopic convective processes.
【67】 Model-Advantage Optimization for Model-Based Reinforcement Learning
Authors: Nirbhay Modhe, Harish Kamath, Dhruv Batra, Ashwin Kalyan
Affiliations: Georgia Tech, Allen Institute for AI
Link: https://arxiv.org/abs/2106.14080
Abstract: Model-based Reinforcement Learning (MBRL) algorithms have traditionally been designed with the goal of learning accurate dynamics of the environment. This introduces a mismatch between the objective of model learning and the overall learning problem of finding an optimal policy. Value-aware model learning, an alternative model-learning paradigm to maximum likelihood, proposes to inform model learning through the value function of the learnt policy. While this paradigm is theoretically sound, it does not scale beyond toy settings. In this work, we propose a novel value-aware objective that is an upper bound on the absolute performance difference of a policy across two models. Further, we propose a general-purpose algorithm that modifies the standard MBRL pipeline, enabling learning with value-aware objectives. Our proposed objective, in conjunction with this algorithm, is the first successful instantiation of value-aware MBRL on challenging continuous control environments, outperforming previous value-aware objectives and achieving competitive performance w.r.t. MLE-based MBRL approaches.
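To make the value-aware idea concrete, a minimal sketch of a value-weighted model loss follows. The quadratic value function and biased model are assumptions of this illustration; the paper's model-advantage upper bound differs in its details.

```python
import numpy as np

def value_aware_model_loss(pred_next, true_next, V):
    # Penalize model errors by how much they change the value of the
    # predicted next state, rather than by raw state error.
    return np.mean(np.abs(V(pred_next) - V(true_next)))

V = lambda s: -np.sum(s ** 2, axis=-1)                  # toy value function
true_next = np.random.default_rng(2).normal(size=(64, 4))
pred_next = true_next + 0.1                             # a slightly biased model
print(value_aware_model_loss(pred_next, true_next, V))
```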
【68】 The Role of Contextual Information in Best Arm Identification
Authors: Masahiro Kato, Kaito Ariu
Affiliations: CyberAgent Inc., KTH
Link: https://arxiv.org/abs/2106.14077
Abstract: We study the best-arm identification problem with fixed confidence when contextual (covariate) information is available in stochastic bandits. Although we can use contextual information in each round, we are interested in the mean reward marginalized over the contextual distribution. Our goal is to identify the best arm with a minimal number of samplings under a given value of the error rate. We show instance-specific sample complexity lower bounds for the problem. Then, we propose a context-aware version of the "Track-and-Stop" strategy, wherein the proportion of arm draws tracks the set of optimal allocations, and prove that the expected number of arm draws matches the lower bound asymptotically. We demonstrate that contextual information can be used to improve the efficiency of identifying the best marginalized mean reward, compared with the results of Garivier & Kaufmann (2016). We experimentally confirm that contextual information contributes to faster best-arm identification.
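A toy computation of the quantity of interest, the mean reward marginalized over the context distribution, may help fix ideas. The arm means and context probabilities below are made up; the paper's Track-and-Stop sampling and stopping rules are not shown.

```python
import numpy as np

# Per-context mean rewards mu(a, c) for 3 arms and 2 contexts (hypothetical).
mu = np.array([[0.9, 0.1],    # arm 0: good in context 0, bad in context 1
               [0.5, 0.5],
               [0.2, 0.7]])
ctx_probs = np.array([0.3, 0.7])

# The target is mu(a) = E_c[mu(a, c)], not the per-context best arm.
marginal = mu @ ctx_probs
print("marginalized means:", marginal, "best arm:", marginal.argmax())
```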
【69】 Contextual Inverse Optimization: Offline and Online Learning
Authors: Omar Besbes, Yuri Fonseca, Ilan Lobel
Link: https://arxiv.org/abs/2106.14015
Abstract: We study the problems of offline and online contextual optimization with feedback information, where instead of observing the loss, we observe, after the fact, the optimal action that an oracle with full knowledge of the objective function would have taken. We aim to minimize regret, which is defined as the difference between our losses and the ones incurred by an all-knowing oracle. In the offline setting, the decision-maker has information from past periods available and needs to make one decision, while in the online setting, the decision-maker optimizes decisions dynamically over time based on a new set of feasible actions and contextual functions in each period. For the offline setting, we characterize the optimal minimax policy, establishing the performance that can be achieved as a function of the underlying geometry of the information induced by the data. In the online setting, we leverage this geometric characterization to optimize the cumulative regret. We develop an algorithm that yields the first regret bound for this problem that is logarithmic in the time horizon.
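The feedback model can be illustrated with a toy linear objective: we never see losses, only the oracle's optimal action, and regret compares our cumulative loss to the oracle's. The linear loss and the naive policy below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)
theta = rng.normal(size=3)               # hidden objective (unknown to the learner)
regret = 0.0
for t in range(100):
    A = rng.normal(size=(5, 3))          # feasible actions this period
    oracle = A[np.argmin(A @ theta)]     # observed feedback: the oracle's action
    ours = A[0]                          # a naive policy, for illustration only
    regret += ours @ theta - oracle @ theta
print("cumulative regret of the naive policy:", regret)
```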
【70】 Detecting anomalies in heterogeneous population-scale VAT networks
Authors: Angelos Alexopoulos, Petros Dellaportas, Stanley Gyoshev, Christos Kotsogiannis, Sofia C. Olhede, Trifon Pavkov
Note: 14 pages, 5 figures, 3 tables
Link: https://arxiv.org/abs/2106.14005
Abstract: Anomaly detection in network science is the method of determining aberrant edges, nodes, subgraphs, or other network events. Heterogeneous networks typically contain information going beyond the observed network itself. Value Added Tax (VAT, a tax on goods and services) networks, defined from pairwise interactions of VAT-registered taxpayers, are analysed at population scale, requiring scalable algorithms. By adopting a quantitative understanding of the nature of VAT anomalies, we define a method that identifies them utilising information from micro-scale, meso-scale, and global-scale patterns that can be interpreted, and efficiently implemented, as population-scale network analysis. The proposed method is automatable and implementable in real time, enabling revenue authorities to prevent large losses of tax revenues through early identification of fraud within the VAT system.
【71】 Implicit Gradient Alignment in Distributed and Federated Learning
Authors: Yatin Dandi, Luis Barba, Martin Jaggi
Affiliations: IIT Kanpur, India; EPFL, Switzerland
Link: https://arxiv.org/abs/2106.13897
Abstract: A major obstacle to achieving global convergence in distributed and federated learning is the misalignment of gradients across clients or mini-batches, due to the heterogeneity and stochasticity of the distributed data. One way to alleviate this problem is to encourage the alignment of gradients across different clients throughout training. Our analysis reveals that this goal can be accomplished by utilizing the right optimization method, one that replicates the implicit regularization effect of SGD, leading to gradient alignment as well as improvements in test accuracies. Since the existence of this regularization in SGD relies entirely on the sequential use of different mini-batches during training, it is inherently absent when training with large mini-batches. To obtain the generalization benefits of this regularization while increasing parallelism, we propose a novel GradAlign algorithm that induces the same implicit regularization while allowing the use of arbitrarily large batches in each update. We experimentally validate the benefit of our algorithm in different distributed and federated learning settings.
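As a diagnostic for the quantity the paper targets, one can measure the mean pairwise cosine similarity between client gradients. This sketch is not the GradAlign update itself, and the client gradients here are simulated.

```python
import numpy as np

def gradient_alignment(client_grads):
    # Mean pairwise cosine similarity between client gradients; higher
    # means better-aligned updates across clients.
    g = np.asarray(client_grads, dtype=float)
    g = g / np.linalg.norm(g, axis=1, keepdims=True)
    sim = g @ g.T
    n = len(g)
    return (sim.sum() - n) / (n * (n - 1))   # average over off-diagonal pairs

grads = np.random.default_rng(5).normal(size=(4, 10))  # 4 simulated clients
print(gradient_alignment(grads))
```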
【72】 Self-paced Principal Component Analysis
Authors: Zhao Kang, Hongfei Liu, Jiangxin Li, Xiaofeng Zhu, Ling Tian
Affiliations: School of Computer Science and Engineering, University of Electronic Science and Technology of China
Link: https://arxiv.org/abs/2106.13880
Abstract: Principal Component Analysis (PCA) has been widely used for dimensionality reduction and feature extraction. Robust PCA (RPCA), under different robust distance metrics such as the l1-norm and the l2,p-norm, can deal with noise or outliers to some extent. However, real-world data may display structures that cannot be fully captured by these simple functions. In addition, existing methods treat complex and simple samples equally. By contrast, the learning pattern typically adopted by human beings is to learn from simple to complex and from less to more. Based on this principle, we propose a novel method called Self-paced PCA (SPCA) to further reduce the effect of noise and outliers. Notably, the complexity of each sample is calculated at the beginning of each iteration in order to integrate samples into training from simple to more complex. Based on an alternating optimization, SPCA finds an optimal projection matrix and filters out outliers iteratively. Theoretical analysis is presented to show the rationality of SPCA. Extensive experiments on popular data sets demonstrate that the proposed method can improve state-of-the-art results considerably.
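A minimal self-paced PCA loop, written under our own assumptions about the self-paced regularizer (a hard threshold on per-sample reconstruction error that grows each iteration), might look as follows; the paper's exact SPCA formulation and its theory are not reproduced.

```python
import numpy as np

def self_paced_pca(X, k, n_iters=10, growth=1.5):
    # Alternate between (i) PCA on the currently "easy" samples and
    # (ii) re-scoring difficulty by reconstruction error, admitting
    # samples below a threshold that grows each iteration.
    Xc = X - X.mean(axis=0)
    include = np.ones(len(X), dtype=bool)
    lam = None
    for _ in range(n_iters):
        _, _, Vt = np.linalg.svd(Xc[include], full_matrices=False)
        P = Vt[:k]                                    # current projection
        err = np.sum((Xc - Xc @ P.T @ P) ** 2, axis=1)
        lam = np.median(err) if lam is None else lam * growth
        include = err <= lam                          # easy-to-hard selection
    return P

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(size=(95, 5)), 10 * rng.normal(size=(5, 5))])  # 5 outliers
print(self_paced_pca(X, k=2).shape)                   # (2, 5)
```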
【73】 Scene Uncertainty and the Wellington Posterior of Deterministic Image Classifiers
Authors: Stephanie Tsuei, Aditya Golatkar, Stefano Soatto
Affiliations: Department of Computer Science, UCLA, Los Angeles, CA
Link: https://arxiv.org/abs/2106.13870
Abstract: We propose a method to estimate the uncertainty of the outcome of an image classifier on a given input datum. Deep neural networks commonly used for image classification are deterministic maps from an input image to an output class. As such, their outcome on a given datum involves no uncertainty, so we must specify what variability we are referring to when defining, measuring, and interpreting "confidence." To this end, we introduce the Wellington Posterior, which is the distribution of outcomes that would have been obtained in response to data that could have been generated by the same scene that produced the given image. Since there are infinitely many scenes that could have generated the given image, the Wellington Posterior requires induction from scenes other than the one portrayed. We explore alternate methods using data augmentation, ensembling, and model linearization. Additional alternatives include generative adversarial networks, conditional prior networks, and supervised single-view reconstruction. We test these alternatives against the empirical posterior obtained by inferring the class of temporally adjacent frames in a video. These developments are only a small step towards assessing the reliability of deep network classifiers in a manner that is compatible with safety-critical applications.
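One of the proxies the paper explores, estimating the outcome distribution by classifying many augmented variants of the image, can be sketched as follows. The classifier and augmentation below are toy stand-ins, not the paper's networks.

```python
import numpy as np

def augmentation_posterior(classify, image, augment, n=100):
    # Estimate a distribution over class outcomes by classifying many
    # augmented variants of the same image.
    preds = [classify(augment(image)) for _ in range(n)]
    classes, counts = np.unique(preds, return_counts=True)
    return dict(zip(classes.tolist(), (counts / n).tolist()))

# Toy stand-ins: a 1-D "image", jitter augmentation, threshold classifier.
rng = np.random.default_rng(8)
classify = lambda x: int(x.mean() > 0.5)
augment = lambda x: x + rng.normal(scale=0.2, size=x.shape)
print(augmentation_posterior(classify, np.full(16, 0.52), augment))
```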
【74】 Extensions to Multifidelity Monte Carlo Methods for Simulations of Chaotic Systems
Authors: Todd A. Oliver, Christopher S. Simmons, Robert D. Moser
Affiliations: Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin; Department of Mechanical Engineering, The University of Texas at Austin; Office of Information Technology, The University of Texas at Dallas
Note: 28 pages, 12 figures
Link: https://arxiv.org/abs/2106.13844
Abstract: Multifidelity Monte Carlo methods often rely on a preprocessing phase consisting of standard Monte Carlo sampling to estimate correlation coefficients between models of different fidelity, in order to determine the weights and number of samples for each level. For computationally intensive models, as are often encountered in simulations of chaotic systems, this up-front cost can be prohibitive. In this work, a correlation estimation procedure is developed for the case in which the highest- and next-highest-fidelity models are generated by discretizing the same mathematical model at different resolutions. The procedure uses discretization error estimates to estimate the required correlation coefficient without sampling the highest-fidelity model, which can dramatically decrease the cost of the preprocessing phase. The method is extended to chaotic problems by using discretization error estimates that account for the statistical nature of common quantities of interest and the accompanying finite sampling errors that pollute estimates of such quantities. The methodology is then demonstrated on a model problem based on the Kuramoto-Sivashinsky equation.
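For reference, the standard two-fidelity control-variate estimator that such methods build on can be sketched as below. The toy models are assumptions; the paper's contribution, estimating the correlation from discretization-error estimates without sampling the high-fidelity model, is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(9)
z = rng.normal(size=10_000)
f_lo = np.sin(z)                        # cheap model, many samples
f_hi = np.sin(z[:100]) + 0.1 * z[:100]  # expensive model, few (shared) samples

# Control-variate weight alpha = rho * sigma_hi / sigma_lo, with rho
# estimated from the shared inputs.
rho = np.corrcoef(f_hi, f_lo[:100])[0, 1]
alpha = rho * f_hi.std() / f_lo[:100].std()
estimate = f_hi.mean() + alpha * (f_lo.mean() - f_lo[:100].mean())
print("two-fidelity estimate:", estimate)
```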