
Statistics arXiv Digest [7.27]

By: arXiv Daily Academic Digest (WeChat official account)
Published: 2021-07-28 14:48:34

Visit www.arxivdaily.com for digests with abstracts, covering CS, physics, mathematics, economics, statistics, finance, biology, and electrical engineering, with search, bookmarking, and posting features!

stat (Statistics): 45 papers in total

【1】 Inference for Heteroskedastic PCA with Missing Data

Authors: Yuling Yan, Yuxin Chen, Jianqing Fan
Link: https://arxiv.org/abs/2107.12365
Abstract: This paper studies how to construct confidence regions for principal component analysis (PCA) in high dimension, a problem that has been vastly under-explored. While computing measures of uncertainty for nonlinear/nonconvex estimators is in general difficult in high dimension, the challenge is further compounded by the prevalent presence of missing data and heteroskedastic noise. We propose a suite of solutions to perform valid inference on the principal subspace based on two estimators: a vanilla SVD-based approach, and a more refined iterative scheme called $\textsf{HeteroPCA}$ (Zhang et al., 2018). We develop non-asymptotic distributional guarantees for both estimators, and demonstrate how these can be invoked to compute both confidence regions for the principal subspace and entrywise confidence intervals for the spiked covariance matrix. Particularly worth highlighting is the inference procedure built on top of $\textsf{HeteroPCA}$, which is not only valid but also statistically efficient for broader scenarios (e.g., it covers a wider range of missing rates and signal-to-noise ratios). Our solutions are fully data-driven and adaptive to heteroskedastic random noise, without requiring prior knowledge about the noise levels and noise distributions.

【2】 Plugin Estimation of Smooth Optimal Transport Maps

Authors: Tudor Manole, Sivaraman Balakrishnan, Jonathan Niles-Weed, Larry Wasserman
Affiliations: Department of Statistics and Data Science, Carnegie Mellon University; Machine Learning Department, Carnegie Mellon University; Courant Institute of Mathematical Sciences, New York University
Link: https://arxiv.org/abs/2107.12364
Abstract: We analyze a number of natural estimators for the optimal transport map between two distributions and show that they are minimax optimal. We adopt the plugin approach: our estimators are simply optimal couplings between measures derived from our observations, appropriately extended so that they define functions on $\mathbb{R}^d$. When the underlying map is assumed to be Lipschitz, we show that computing the optimal coupling between the empirical measures, and extending it using linear smoothers, already gives a minimax optimal estimator. When the underlying map enjoys higher regularity, we show that the optimal coupling between appropriate nonparametric density estimates yields faster rates. Our work also provides new bounds on the risk of corresponding plugin estimators for the quadratic Wasserstein distance, and we show how this problem relates to that of estimating optimal transport maps using stability arguments for smooth and strongly convex Brenier potentials. As an application of our results, we derive a central limit theorem for a density plugin estimator of the squared Wasserstein distance, which is centered at its population counterpart when the underlying distributions have sufficiently smooth densities. In contrast to known central limit theorems for empirical estimators, this result easily lends itself to statistical inference for Wasserstein distances.
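The one-dimensional special case gives a compact illustration of the plugin idea: for quadratic cost on the line, the optimal coupling between two equal-size empirical measures simply matches order statistics, and a nearest-source extension stands in for the linear smoothers discussed in the abstract. A minimal sketch (the Gaussian shift example and all names are our own, not the paper's):

```python
import random

def empirical_ot_map_1d(source, target):
    """Plugin estimator of the 1D optimal transport map (quadratic cost).

    On the line, the optimal coupling between two equal-size empirical
    measures matches order statistics: the i-th smallest source point is
    sent to the i-th smallest target point. A nearest-source rule extends
    the matching to all of R, a crude stand-in for linear smoothers.
    """
    xs = sorted(source)
    ys = sorted(target)

    def T(x):
        # image of the source sample closest to x
        i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
        return ys[i]

    return T

random.seed(0)
src = [random.gauss(0.0, 1.0) for _ in range(500)]
tgt = [random.gauss(3.0, 1.0) for _ in range(500)]  # true map: x -> x + 3
T_hat = empirical_ot_map_1d(src, tgt)
errors = [abs(T_hat(x) - (x + 3.0)) for x in (-1.0, 0.0, 1.0)]
```

Because the coupling matches sorted samples, the estimated map is automatically monotone, mirroring the monotonicity of the true Brenier map in one dimension.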

【3】 A Comparison of Various Aggregation Functions in Multi-Criteria Decision Analysis for Drug Benefit-Risk Assessment

Authors: Tom Menzies, Gaelle Saint-Hilary, Pavel Mozgunov
Affiliations: Department of Mathematics and Statistics, Lancaster University, UK; Department of Biostatistics
Link: https://arxiv.org/abs/2107.12298
Abstract: Multi-criteria decision analysis (MCDA) is a quantitative approach to drug benefit-risk assessment (BRA) which allows for consistent comparisons by summarising all benefits and risks in a single score. The MCDA consists of several components, one of which is the utility (or loss) score function that defines how benefits and risks are aggregated into a single quantity. While a linear utility score is one of the most widely used approaches in BRA, it is recognised that it can result in counter-intuitive decisions, for example, recommending a treatment with extremely low benefits or high risks. To overcome this problem, alternative approaches to score construction, namely the product, multi-linear and Scale Loss Score models, have been suggested. However, to date, most arguments concerning the differences implied by these models are heuristic. In this work, we consider four models for calculating the aggregated utility/loss scores and compare their performance in an extensive simulation study over many different scenarios, and in a case study. It is found that the product and Scale Loss Score models provide more intuitive treatment recommendation decisions in the majority of scenarios compared to the linear and multi-linear models, and are more robust to correlation in the criteria.
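A toy sketch of why the linear score can behave counter-intuitively while a product score does not: a profile with almost no benefit but excellent safety scores nearly as high as a balanced profile under linear aggregation, while the product model penalises it heavily. The criterion values and weights below are hypothetical, not taken from the paper's simulation study:

```python
def linear_score(values, weights):
    # weighted arithmetic mean of the (0, 1]-scaled criterion values
    return sum(w * v for w, v in zip(weights, values))

def product_score(values, weights):
    # weighted geometric aggregation: a near-zero criterion drags the
    # whole score towards zero, however good the other criteria are
    score = 1.0
    for w, v in zip(weights, values):
        score *= v ** w
    return score

weights = [0.5, 0.5]       # benefit weight, safety weight (hypothetical)
balanced = [0.60, 0.60]    # moderate benefit, moderate safety
extreme = [0.02, 0.99]     # almost no benefit, excellent safety

lin_bal, lin_ext = linear_score(balanced, weights), linear_score(extreme, weights)
prod_bal, prod_ext = product_score(balanced, weights), product_score(extreme, weights)
```

Under the linear model the extreme profile scores 0.505 against 0.60 for the balanced one; under the product model it collapses to about 0.14, which matches the abstract's finding that multiplicative scores give more intuitive recommendations.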

【4】 E-Bayesian Estimation For Some Characteristics Of Weibull Generalized Exponential Progressive Type-II Censored Samples

Authors: Hassan Piriaei, Omid Shojaee
Affiliations: Department of Mathematics, Borujerd Branch, Islamic Azad University, Borujerd, Iran
Comments: 18 pages
Link: https://arxiv.org/abs/2107.12162
Abstract: Estimation of reliability and hazard rate is one of the most important problems arising in many applications, especially in engineering studies as well as human lifetime. In this regard, different methods of estimation have been used. Each method exploits various tools and suffers from problems such as complexity of computation, low precision, and so forth. This study employs the E-Bayesian method to estimate the parameter and survival functions of the Weibull Generalized Exponential distribution. The estimators are obtained under squared error and LINEX loss functions for progressive type-II censored samples. E-Bayesian estimates are derived based on three priors for the hyperparameters, to investigate the influence of different priors on the estimates. The asymptotic behaviours of the E-Bayesian estimates, as well as the relationships among them, are investigated. Finally, a comparison among the maximum likelihood, Bayes, and E-Bayesian estimates is made using real data and Monte Carlo simulation. Results show that the new method is more efficient than previous methods.
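The E-Bayesian idea, averaging a Bayes estimate over a hyper-prior on the hyperparameters, can be sketched in a much simpler setting than the paper's Weibull Generalized Exponential model: an exponential rate with a Gamma(a, b) prior, whose posterior mean under squared error loss is (a + n)/(b + sum of the data). The priors and grid below are illustrative assumptions, not the paper's:

```python
import random

def bayes_rate_estimate(data, a, b):
    # posterior mean of an exponential rate under a Gamma(a, b) prior
    n, s = len(data), sum(data)
    return (a + n) / (b + s)

def e_bayes_rate_estimate(data, a, c, grid=1000):
    # E-Bayesian estimate: average the Bayes estimate over a uniform
    # hyper-prior for b on (0, c), approximated on a midpoint grid
    bs = [(i + 0.5) * c / grid for i in range(grid)]
    return sum(bayes_rate_estimate(data, a, b) for b in bs) / grid

random.seed(1)
true_rate = 2.0
data = [random.expovariate(true_rate) for _ in range(2000)]
est = e_bayes_rate_estimate(data, a=1.0, c=2.0)
```

Averaging over the hyper-prior makes the estimate less sensitive to any single hyperparameter choice, which is the motivation for the E-Bayesian approach.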

【5】 Convergence in quadratic mean of averaged stochastic gradient algorithms without strong convexity nor bounded gradient

Authors: Antoine Godichon-Baggioni
Affiliations: Laboratoire de Probabilités, Statistique et Modélisation, Sorbonne-Université, Paris, France
Link: https://arxiv.org/abs/2107.12058
Abstract: Online averaged stochastic gradient algorithms are more and more studied since (i) they can deal quickly with large samples taking values in high-dimensional spaces, (ii) they allow data to be processed sequentially, and (iii) they are known to be asymptotically efficient. In this paper, we focus on giving explicit bounds on the quadratic mean error of the estimates, under very weak assumptions, i.e., without supposing that the function we would like to minimize is strongly convex or admits a bounded gradient.
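A minimal sketch of the averaged (Polyak-Ruppert) stochastic gradient scheme the paper analyses, on a toy quadratic objective; the step-size schedule and the example are our own choices, not the paper's:

```python
import random

def averaged_sgd(grad_sample, theta0, n_steps, step=lambda t: 0.5 / (t + 1) ** 0.66):
    """Polyak-Ruppert averaging: run SGD and return the average iterate.

    The average of the (noisy) SGD iterates is the quantity whose
    quadratic mean error the paper bounds.
    """
    theta = theta0
    avg = 0.0
    for t in range(n_steps):
        theta -= step(t) * grad_sample(theta)
        avg += (theta - avg) / (t + 1)  # running mean of the iterates
    return avg

# toy problem: minimise E[(theta - X)^2] / 2 with X ~ N(3, 1);
# a stochastic gradient is theta - X, and the minimiser is 3
random.seed(2)
grad = lambda theta: theta - random.gauss(3.0, 1.0)
estimate = averaged_sgd(grad, theta0=0.0, n_steps=20000)
```

The slowly decaying step sizes alone would leave noticeable noise in the last iterate; averaging washes it out, which is what makes the scheme asymptotically efficient.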

【6】 From robust tests to Bayes-like posterior distributions

Authors: Yannick Baraud
Link: https://arxiv.org/abs/2107.12011
Abstract: In the Bayes paradigm and for a given loss function, we propose the construction of a new type of posterior distributions for estimating the law of an $n$-sample. The loss functions we have in mind are based on the total variation distance, the Hellinger distance as well as some $\mathbb{L}_{j}$-distances. We prove that, with a probability close to one, this new posterior distribution concentrates its mass in a neighbourhood of the law of the data, for the chosen loss function, provided that this law belongs to the support of the prior or, at least, lies close enough to it. We therefore establish that the new posterior distribution enjoys some robustness properties with respect to a possible misspecification of the prior, or more precisely, its support. For the total variation and squared Hellinger losses, we also show that the posterior distribution keeps its concentration properties when the data are only independent, hence not necessarily i.i.d., provided that most of their marginals are close enough to some probability distribution around which the prior puts enough mass. The posterior distribution is therefore also stable with respect to the equidistribution assumption. We illustrate these results by several applications. We consider the problems of estimating a location parameter or both the location and the scale of a density in a nonparametric framework. Finally, we also tackle the problem of estimating a density, with the squared Hellinger loss, in a high-dimensional parametric model under some sparsity conditions. The results established in this paper are non-asymptotic and provide, as much as possible, explicit constants.

【7】 A Novel Bivariate Generalized Weibull Distribution with Properties and Applications

Authors: Ashok Kumar Pathak, Mohd. Arshad, Qazi J. Azhad, Mukti Khetan, Arvind Pandey
Affiliations: Department of Mathematics and Statistics, Central University of Punjab, Bathinda, India; Department of Mathematics, Indian Institute of Technology Indore, Simrol, Indore, India
Link: https://arxiv.org/abs/2107.11998
Abstract: The univariate Weibull distribution is a well-known lifetime distribution that has been widely used in reliability and survival analysis. In this paper, we introduce a new family of bivariate generalized Weibull (BGW) distributions, whose univariate marginals are exponentiated Weibull distributions. Different statistical quantities, such as the marginals, conditional distribution, conditional expectation, product moments, correlation, and a measure of component reliability, are derived. Various measures of dependence and statistical properties, along with ageing properties, are examined. Further, the copula associated with the BGW distribution and its various important properties are also considered. The methods of maximum likelihood and Bayesian estimation are employed to estimate the unknown parameters of the model. A Monte Carlo simulation and a real data study are carried out to demonstrate the performance of the estimators, and the results confirm the effectiveness of the distribution in real-life situations.
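The exponentiated Weibull marginal of the BGW family is easy to sample by inverse transform, since its CDF is just a Weibull CDF raised to a power. A quick numerical self-check of that formula (parameter values are arbitrary):

```python
import math
import random

def exp_weibull_cdf(x, alpha, k, scale=1.0):
    # exponentiated Weibull CDF: a Weibull CDF raised to the power alpha
    if x <= 0.0:
        return 0.0
    return (1.0 - math.exp(-((x / scale) ** k))) ** alpha

def exp_weibull_sample(alpha, k, scale=1.0):
    # inverse-transform sampling: solve F(x) = u for x
    u = random.random()
    return scale * (-math.log(1.0 - u ** (1.0 / alpha))) ** (1.0 / k)

random.seed(3)
alpha, k = 2.0, 1.5
draws = [exp_weibull_sample(alpha, k) for _ in range(20000)]
# the empirical CDF at x = 1 should match the analytical CDF there
emp = sum(d <= 1.0 for d in draws) / len(draws)
ref = exp_weibull_cdf(1.0, alpha, k)
```

With alpha = 1 the family reduces to the ordinary Weibull distribution, which is the sense in which the marginals generalize it.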

【8】 A Real Time Monitoring Approach for Bivariate Event Data

Authors: Inez Maria Zwetsloot, Tahir Mahmood, Funmilola Mary Taiwo, Zezhong Wang
Affiliations: Department of Advanced Design and Systems Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong; Department of Technology, School of Science and Technology, The Open University
Link: https://arxiv.org/abs/2107.11971
Abstract: Early detection of changes in the frequency of events is an important task in, for example, disease surveillance, monitoring of high-quality processes, reliability monitoring and public health. In this article, we focus on detecting changes in multivariate event data, by monitoring the time-between-events (TBE). Existing multivariate TBE charts are limited in the sense that they only signal after an event has occurred for each of the individual processes. This results in delays (i.e., long times to signal), especially if it is of interest to detect a change in only one or a few of the processes. We propose a bivariate TBE (BTBE) chart which is able to signal in real time. We derive analytical expressions for the control limits and average time-to-signal performance, conduct a performance evaluation and compare our chart to an existing method. The findings show that our method is a realistic approach to monitoring bivariate time-between-events data, and has better detection ability than existing methods. A large benefit of our method is that it signals in real time and that, owing to the analytical expressions, no simulation is needed. The proposed method is implemented on a real-life dataset related to AIDS.
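A univariate sketch of a time-between-events chart with exponential probability limits, monitoring a simulated rate increase. This is a textbook-style illustration of the TBE idea, not the bivariate BTBE chart of the paper:

```python
import math
import random

def tbe_limits(theta, alpha=0.0027):
    """Probability control limits for an exponential TBE chart.

    With in-control mean time-between-events theta, the limits are the
    alpha/2 and 1 - alpha/2 quantiles of the exponential distribution,
    giving a false-alarm probability of alpha per plotted event.
    """
    lcl = -theta * math.log(1.0 - alpha / 2.0)
    ucl = -theta * math.log(alpha / 2.0)
    return lcl, ucl

def first_signal(tbe_stream, theta, alpha=0.0027):
    # index of the first out-of-control time-between-events, or None
    lcl, ucl = tbe_limits(theta, alpha)
    for i, t in enumerate(tbe_stream):
        if t < lcl or t > ucl:  # events too frequent or too rare
            return i
    return None

random.seed(4)
theta0 = 10.0
stream = [random.expovariate(1.0 / theta0) for _ in range(50)]      # in control
stream += [random.expovariate(100.0 / theta0) for _ in range(200)]  # 100x rate
signal_at = first_signal(stream, theta0)
```

A sharp increase in event frequency produces very short gaps that eventually fall below the lower limit, which is how a TBE chart flags deterioration.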

【9】 Max-Type and Sum-Type Procedures for Online Change-Point Detection in the Mean of High-Dimensional Data

Authors: Jun Li
Affiliations: Department of Mathematical Sciences, Kent State University, Kent, OH
Link: https://arxiv.org/abs/2107.11931
Abstract: We propose two procedures to detect a change in the mean of high-dimensional online data. One is based on a max-type U-statistic and the other is based on a sum-type U-statistic. Theoretical properties of the two procedures are explored in the high-dimensional setting. More precisely, we derive their average run lengths (ARLs) when there is no change point, and expected detection delays (EDDs) when there is a change point. Accuracy of the theoretical results is confirmed by simulation studies. The practical use of the proposed procedures is demonstrated by detecting an abrupt change in PM2.5 concentrations. The current study attempts to extend the results of the CUSUM and Shiryayev-Roberts procedures previously established in the univariate setting.
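For context, the classical univariate CUSUM that the paper seeks to extend can be written in a few lines; the reference value k and threshold h below are common textbook choices, not the paper's tuning:

```python
import random

def cusum_detect(stream, mu0, k=0.5, h=5.0):
    """One-sided CUSUM for an upward mean shift.

    S_t = max(0, S_{t-1} + x_t - mu0 - k); signal when S_t > h.
    Returns the index at which the chart signals, or None.
    """
    s = 0.0
    for i, x in enumerate(stream):
        s = max(0.0, s + x - mu0 - k)
        if s > h:
            return i
    return None

random.seed(5)
pre = [random.gauss(0.0, 1.0) for _ in range(300)]
post = [random.gauss(1.5, 1.0) for _ in range(100)]  # mean shifts at t = 300
alarm = cusum_detect(pre + post, mu0=0.0)
```

The ARL/EDD trade-off studied in the paper is controlled here by h: raising it lengthens the in-control run length but delays detection.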

【10】 Estimation of Stationary Optimal Transport Plans

Authors: Kevin O'Connor, Kevin McGoff, Andrew B. Nobel
Affiliations: UNC-Charlotte; UNC-Chapel Hill
Link: https://arxiv.org/abs/2107.11858
Abstract: We study optimal transport problems in which finite-valued quantities of interest evolve dynamically over time in a stationary fashion. Mathematically, this is a special case of the general optimal transport problem in which the distributions under study represent stationary processes and the cost depends on a finite number of time points. In this setting, we argue that one should restrict attention to stationary couplings, also known as joinings, which have close connections with long run average cost. We introduce estimators of both optimal joinings and the optimal joining cost, and we establish their consistency under mild conditions. Under stronger mixing assumptions we establish finite-sample error rates for the same estimators that extend the best known results in the iid case. Finally, we extend the consistency and rate analysis to an entropy-penalized version of the optimal joining problem.

【11】 A Survey of Monte Carlo Methods for Parameter Estimation

Authors: D. Luengo, L. Martino, M. Bugallo, V. Elvira, S. Särkkä
Affiliations: Universidad Politécnica de Madrid (UPM), Spain; Universidad Rey Juan Carlos (URJC), Spain
Comments: published in EURASIP Journal on Advances in Signal Processing
Link: https://arxiv.org/abs/2107.11820
Abstract: Statistical signal processing applications usually require the estimation of some parameters of interest given a set of observed data. These estimates are typically obtained either by solving a multi-variate optimization problem, as in the maximum likelihood (ML) or maximum a posteriori (MAP) estimators, or by performing a multi-dimensional integration, as in the minimum mean squared error (MMSE) estimators. Unfortunately, analytical expressions for these estimators cannot be found in most real-world applications, and the Monte Carlo (MC) methodology is one feasible approach. MC methods proceed by drawing random samples, either from the desired distribution or from a simpler one, and using them to compute consistent estimators. The most important families of MC algorithms are Markov chain MC (MCMC) and importance sampling (IS). On the one hand, MCMC methods draw samples from a proposal density and then build an ergodic Markov chain, whose stationary distribution is the desired distribution, by accepting or rejecting those candidate samples as the new state of the chain. On the other hand, IS techniques draw samples from a simple proposal density, and then assign them suitable weights that measure their quality in some appropriate way. In this paper, we perform a thorough review of MC methods for the estimation of static parameters in signal processing applications. A historical note on the development of MC schemes is also provided, followed by the basic MC method and a brief description of the rejection sampling (RS) algorithm, as well as three sections describing many of the most relevant MCMC and IS algorithms, and their combined use.
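A minimal self-normalised importance sampling example in the spirit of the IS family reviewed here: only unnormalised log-densities are needed, because the weights are renormalised. The target and proposal below are illustrative choices:

```python
import math
import random

def snis_mean(target_logpdf, proposal_sample, proposal_logpdf, n):
    """Self-normalised importance sampling estimate of the target mean.

    Draw from a simple proposal, weight each draw by the target/proposal
    density ratio, and normalise the weights. Normalising constants cancel,
    so both log-densities may be unnormalised.
    """
    xs = [proposal_sample() for _ in range(n)]
    logw = [target_logpdf(x) - proposal_logpdf(x) for x in xs]
    m = max(logw)
    w = [math.exp(lw - m) for lw in logw]  # subtract max for stability
    z = sum(w)
    return sum(wi * xi for wi, xi in zip(w, xs)) / z

# target: N(2, 1); proposal: a wider N(0, 3) that covers the target
random.seed(6)
tgt_log = lambda x: -0.5 * (x - 2.0) ** 2      # unnormalised N(2, 1)
prop_log = lambda x: -0.5 * (x / 3.0) ** 2     # unnormalised N(0, 3)
est = snis_mean(tgt_log, lambda: random.gauss(0.0, 3.0), prop_log, n=20000)
```

Choosing a proposal with heavier tails than the target, as here, keeps the weights from degenerating, a recurring theme in the IS literature the survey covers.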

【12】 Sensitivity and robustness analysis in Bayesian networks with the bnmonitor R package

Authors: Manuele Leonelli, Ramsiya Ramanathan, Rachel L. Wilkerson
Link: https://arxiv.org/abs/2107.11785
Abstract: Bayesian networks are a class of models that are widely used for risk assessment of complex operational systems. There are now multiple approaches, as well as implemented software, that guide their construction via data learning or expert elicitation. However, a constructed Bayesian network needs to be validated before it can be used for practical risk assessment. Here, we illustrate the usage of the bnmonitor R package: the first comprehensive software for the validation of a Bayesian network. An applied data analysis using bnmonitor is carried out on a medical dataset to illustrate the use of its wide array of functions.

【13】 Conditional Inference for Multivariate Generalised Linear Mixed Models

Authors: Jeanett S. Pelck, Rodrigo Labouriau
Affiliations: Department of Mathematics, Aarhus University, Denmark
Comments: 35 pages, 2 figures and 5 appendices
Link: https://arxiv.org/abs/2107.11765
Abstract: We propose a method for inference in generalised linear mixed models (GLMMs) and several extensions of these models. First, we extend the GLMM by allowing the distribution of the random components to be non-Gaussian, that is, assuming an absolutely continuous distribution with respect to the Lebesgue measure that is symmetric around zero, unimodal and with finite moments up to fourth order. Second, we allow the conditional distribution to follow a dispersion model instead of an exponential dispersion model. Finally, we extend these models to a multivariate framework where multiple responses are combined by imposing a multivariate absolutely continuous distribution on the random components representing common clusters of observations in all the marginal models. Maximum likelihood inference in these models involves evaluating an integral that often cannot be computed in closed form. We suggest an inference method that predicts values of the random components and does not involve the integration of conditional likelihood quantities. The multivariate GLMMs that we studied can be constructed with marginal GLMMs of different statistical natures and, at the same time, represent complex dependence structures, providing a rather flexible tool for applications.

【14】 On matching-adjusted indirect comparison and calibration estimation

Authors: Jixian Wang
Affiliations: Bristol Myers Squibb
Comments: 26 pages, 1 figure
Link: https://arxiv.org/abs/2107.11687
Abstract: Indirect comparisons have been increasingly used to compare data from different sources such as clinical trials and observational data in, e.g., a disease registry. To adjust for population differences between data sources, matching-adjusted indirect comparison (MAIC) has been used in several applications including health technology assessment and drug regulatory submissions. In fact, MAIC can be considered as a special case of a range of methods known as calibration estimation in survey sampling. However, to the best of our knowledge, this connection has not been examined in detail. This paper makes three contributions: 1. We examine this connection by comparing MAIC with a few commonly used calibration estimation methods, including the entropy balancing approach, which is equivalent to MAIC. 2. We consider standard error (SE) estimation for the MAIC estimators, propose a model-independent SE estimator, and examine its performance by simulation. 3. We conduct a simulation to compare these commonly used approaches and evaluate their performance in indirect comparison scenarios.
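In one dimension, the MAIC / entropy-balancing weights have an exponential-tilting form, and matching a single covariate mean reduces to a monotone root-finding problem. A sketch under that simplification (the age data, registry mean, and bisection bracket are hypothetical assumptions; the bracket presumes the target mean is attainable with a moderate tilt):

```python
import math
import random

def maic_weights(x, target_mean, lo=-1.0, hi=1.0, iters=100):
    """Entropy-balancing / MAIC-style weights matching one covariate mean.

    Weights take the form w_i = exp(lam * (x_i - xbar)); lam is chosen so
    that the weighted covariate mean equals the target population mean.
    The tilted mean is monotone in lam, so bisection suffices in 1D.
    """
    xbar = sum(x) / len(x)
    xc = [xi - xbar for xi in x]  # centre for numerical stability

    def tilted_mean(lam):
        w = [math.exp(lam * xi) for xi in xc]
        return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if tilted_mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2.0
    w = [math.exp(lam * xi) for xi in xc]
    s = sum(w)
    return [wi / s for wi in w]

# hypothetical example: reweight trial ages (mean near 55) to a registry mean of 60
random.seed(7)
trial_age = [random.gauss(55.0, 8.0) for _ in range(500)]
w = maic_weights(trial_age, target_mean=60.0)
reweighted_mean = sum(wi * xi for wi, xi in zip(w, trial_age))
```

With several covariates the same moment-matching conditions define a multivariate tilt, which is the calibration-estimation view of MAIC discussed in the paper.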

【15】 Inference of collective Gaussian hidden Markov models

Authors: Rahul Singh, Yongxin Chen
Affiliations: School of Aerospace Engineering, Georgia Institute of Technology
Link: https://arxiv.org/abs/2107.11662
Abstract: We consider inference problems for a class of continuous-state collective hidden Markov models, where the data are recorded in aggregate (collective) form generated by a large population of individuals following the same dynamics. We propose an aggregate inference algorithm called the collective Gaussian forward-backward algorithm, extending the recently proposed Sinkhorn belief propagation algorithm to models characterized by Gaussian densities. Our algorithm enjoys a convergence guarantee. In addition, it reduces to the standard Kalman filter when the observations are generated by a single individual. The efficacy of the proposed algorithm is demonstrated through multiple experiments.
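The single-individual reduction mentioned in the abstract is the ordinary Kalman filter; a scalar version makes the point concrete (the model parameters below are arbitrary):

```python
import random

def kalman_1d(observations, a, q, c, r, m0, p0):
    """Scalar Kalman filter for x_t = a x_{t-1} + N(0, q), y_t = c x_t + N(0, r).

    Returns the sequence of filtered means; the collective forward-backward
    algorithm of the paper collapses to this recursion for one individual.
    """
    m, p = m0, p0
    means = []
    for y in observations:
        # predict
        m_pred = a * m
        p_pred = a * a * p + q
        # update
        k = p_pred * c / (c * c * p_pred + r)  # Kalman gain
        m = m_pred + k * (y - c * m_pred)
        p = (1.0 - k * c) * p_pred
        means.append(m)
    return means

# simulate a stationary AR(1) state observed in noise
random.seed(8)
a, q, c, r = 0.9, 0.5, 1.0, 1.0
x, xs, ys = 0.0, [], []
for _ in range(500):
    x = a * x + random.gauss(0.0, q ** 0.5)
    xs.append(x)
    ys.append(c * x + random.gauss(0.0, r ** 0.5))
filtered = kalman_1d(ys, a, q, c, r, m0=0.0, p0=1.0)
# filtering should reduce the error relative to the raw observations
mse_filter = sum((m - xt) ** 2 for m, xt in zip(filtered, xs)) / len(xs)
mse_raw = sum((y - xt) ** 2 for y, xt in zip(ys, xs)) / len(xs)
```

The filtered mean squared error settles near the steady-state Riccati solution (about 0.47 for these parameters), roughly half the raw observation noise.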

【16】 Automatic tempered posterior distributions for Bayesian inversion problems

Authors: L. Martino, F. Llorente, E. Curbelo, J. Lopez-Santiago, J. Miguez
Affiliations: Universidad Rey Juan Carlos (URJC), Madrid, Spain; Universidad Carlos III de Madrid (UC3M), Madrid, Spain
Link: https://arxiv.org/abs/2107.11614
Abstract: We propose a novel adaptive importance sampling scheme for Bayesian inversion problems where the inference of the variables of interest and the power of the data noise are split. More specifically, we consider a Bayesian analysis for the variables of interest (i.e., the parameters of the model to invert), whereas we employ a maximum likelihood approach for the estimation of the noise power. The whole technique is implemented by means of an iterative procedure, alternating sampling and optimization steps. Moreover, the noise power is also used as a tempering parameter for the posterior distribution of the variables of interest. Therefore, a sequence of tempered posterior densities is generated, where the tempering parameter is automatically selected according to the actual estimate of the noise power. A complete Bayesian study over the model parameters and the scale parameter can also be performed. Numerical experiments show the benefits of the proposed approach.
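Tempering itself is simple to illustrate on a grid: raising a posterior to a power beta < 1 flattens it, which is what makes the tempered sequence easier for a sampler to explore. The bimodal toy posterior below is our own, not the paper's inversion model:

```python
import math

def temper(posterior, beta):
    """Raise an (unnormalised) grid posterior to the power beta, renormalise.

    beta < 1 flattens the density; beta = 1 recovers the original posterior.
    """
    tempered = [p ** beta for p in posterior]
    z = sum(tempered)
    return [t / z for t in tempered]

# a sharply bimodal toy posterior on a grid of 101 points in [0, 1]
grid = [i / 100.0 for i in range(101)]
post = [math.exp(-200.0 * (g - 0.2) ** 2) + math.exp(-200.0 * (g - 0.8) ** 2)
        for g in grid]
z = sum(post)
post = [p / z for p in post]

flat = temper(post, beta=0.1)
```

In the paper's scheme the tempering exponent is tied to the current noise-power estimate, so the sequence of such flattened posteriors sharpens automatically as the noise estimate improves.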

【17】 Effect of the COVID-19 pandemic on bike-sharing demand and hire time: Evidence from Santander Cycles in London

Authors: Shahram Heydari, Garyfallos Konstantinoudis, Abdul Wahid Behsoodi
Affiliations: Transportation Research Group, Department of Civil, Maritime and Environmental Engineering, University of Southampton, Southampton, UK; MRC Centre for Environment and Health, Department of Epidemiology and Biostatistics, School of Public Health
Link: https://arxiv.org/abs/2107.11589
Abstract: The COVID-19 pandemic has been influencing travel behaviour in many urban areas around the world since the beginning of 2020. As a consequence, bike-sharing schemes have been affected, partly due to the change in travel demand and behaviour as well as a shift from public transit. This study estimates the varying effect of the COVID-19 pandemic on the London bike-sharing system (Santander Cycles) over the period March-December 2020. We employed a Bayesian second-order random walk time-series model to account for temporal correlation in the data. We compared the observed number of cycle hires and hire time with their respective counterfactuals (what would have been if the pandemic had not happened) to estimate the magnitude of the change caused by the pandemic. The results indicated that following a reduction in cycle hires in March and April 2020, the demand rebounded from May 2020, remaining in the expected range of what would have been if the pandemic had not occurred. This could indicate the resiliency of Santander Cycles. With respect to hire time, an important increase occurred in April, May, and June 2020, indicating that bikes were hired for longer trips, perhaps partly due to a shift from public transit.

【18】 On the Le Cam distance between multivariate hypergeometric and multivariate normal experiments

Authors: Frédéric Ouimet
Affiliations: McGill University
Comments: 6 pages, 0 figures
Link: https://arxiv.org/abs/2107.11565
Abstract: In this short note, we develop a local approximation for the log-ratio of the multivariate hypergeometric probability mass function over the corresponding multinomial probability mass function. In conjunction with the bounds from Carter (2002) and Ouimet (2021) on the total variation between the law of a multinomial vector jittered by a uniform on $(-1/2,1/2)^d$ and the law of the corresponding multivariate normal distribution, the local expansion for the log-ratio is then used to obtain a total variation bound between the law of a multivariate hypergeometric random vector jittered by a uniform on $(-1/2,1/2)^d$ and the law of the corresponding multivariate normal distribution. As a corollary, we find an upper bound on the Le Cam distance between multivariate hypergeometric and multivariate normal experiments.
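The univariate analogue of the hypergeometric-vs-multinomial log-ratio is easy to check numerically: as the population grows with the success fraction fixed, sampling without replacement approaches sampling with replacement and the log-ratio shrinks. A small sketch (the population sizes are arbitrary):

```python
import math

def hypergeom_pmf(k, N, K, n):
    # P(k successes in n draws without replacement; N items, K successes)
    return math.comb(K, k) * math.comb(N - K, n - k) / math.comb(N, n)

def binom_pmf(k, n, p):
    # the with-replacement counterpart (binomial = univariate multinomial)
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def max_abs_log_ratio(N, frac=0.5, n=10):
    # worst-case |log(hypergeometric / binomial)| over the support
    K = int(N * frac)
    return max(abs(math.log(hypergeom_pmf(k, N, K, n) / binom_pmf(k, n, frac)))
               for k in range(n + 1))

small_pop = max_abs_log_ratio(N=100)
large_pop = max_abs_log_ratio(N=100000)
```

For fixed sample size n the worst-case log-ratio is of order n^2 / N, which is the kind of local behaviour the note's expansion quantifies in the multivariate case.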

【19】 Improved inference for vaccine-induced immune responses via shape-constrained methods

Authors: Nilanjana Laha, Zoe Moodie, Ying Huang, Alex Luedtke
Link: https://arxiv.org/abs/2107.11546
Abstract: We study the performance of shape-constrained methods for evaluating immune response profiles from early-phase vaccine trials. The motivating problem for this work involves quantifying and comparing the IgG binding immune responses to the first and second variable loops (V1V2 region) arising in the HVTN 097 and HVTN 100 HIV vaccine trials. We consider unimodal and log-concave shape-constrained methods to compare the immune profiles of the two vaccines, which is reasonable because the data support that the underlying densities of the immune responses could have these shapes. To this end, we develop novel shape-constrained tests of stochastic dominance and shape-constrained plug-in estimators of the Hellinger distance between two densities. Our techniques are either tuning-parameter free, or rely on only one tuning parameter, but their performance is either better (the tests of stochastic dominance) or comparable with that of nonparametric methods (the estimators of Hellinger distance). The minimal dependence on tuning parameters is especially desirable in clinical contexts where analyses must be prespecified and reproducible. Our methods are supported by theoretical results and simulation studies.
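A crude histogram-based plug-in estimator of the Hellinger distance illustrates the quantity being estimated; the paper's estimators are shape-constrained, not histograms, and the samples below are simulated:

```python
import math
import random

def hellinger(p, q):
    # Hellinger distance between two discrete distributions on a shared grid
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))

def histogram(samples, lo, hi, bins):
    # simple histogram plug-in estimate of a density's bin probabilities
    counts = [0] * bins
    for s in samples:
        if lo <= s < hi:
            counts[int((s - lo) / (hi - lo) * bins)] += 1
    total = sum(counts)
    return [c / total for c in counts]

random.seed(9)
a = [random.gauss(0.0, 1.0) for _ in range(5000)]
b = [random.gauss(0.0, 1.0) for _ in range(5000)]
c = [random.gauss(2.0, 1.0) for _ in range(5000)]
h_same = hellinger(histogram(a, -5, 5, 25), histogram(b, -5, 5, 25))
h_diff = hellinger(histogram(a, -5, 5, 25), histogram(c, -5, 5, 25))
```

The distance lies in [0, 1], near 0 for two samples from the same law and clearly positive for a shifted one; replacing the histograms with unimodal or log-concave density estimates gives the shape-constrained plug-in the paper studies.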

【20】 A Robust Partial Correlation-based Screening Approach 标题:一种稳健的基于偏相关的筛选方法

作者:Xiaochao Xia 机构:Chongqing University 链接:https://arxiv.org/abs/2107.11538 摘要:sure独立筛选作为一种计算速度快、效率高的工具,在解决超高维问题中受到了广泛的关注。本文从无模型的角度出发,提出了两种同时考虑异方差、离群值、重尾分布、连续或离散响应和混杂效应的稳健sure筛选方法。首先,我们定义一个只使用两个随机指标的稳健相关度量,并引入一个使用这种相关性的筛选器。其次,当暴露变量可用时,我们提出了一种稳健的基于偏相关的筛选方法。为了消除暴露变量对响应和每个协变量的混杂影响,我们使用了具有特定损失函数的非参数回归。更具体地说,形成了基于稳健相关的筛选方法(RC-SIS)和基于稳健偏相关的筛选框架(RPC-SIS),包括两个具体的筛选器:RPC-SIS(L2)和RPC-SIS(L1)。第三,我们建立了响应变量可以是连续变量也可以是离散变量的RC-SIS的筛选性质,以及RPC-SIS(L2)和RPC-SIS(L1)在一定正则性条件下的筛选性质。我们的方法本质上是非参数的,对响应和协变量都有很好的表现。最后,通过大量的仿真研究和两个实际应用验证了本文方法的优越性。 摘要:As a computationally fast and working efficient tool, sure independence screening has received much attention in solving ultrahigh dimensional problems. This paper contributes two robust sure screening approaches that simultaneously take into account heteroscedasticity, outliers, heavy-tailed distribution, continuous or discrete response, and confounding effect, from the perspective of model-free. First, we define a robust correlation measure only using two random indicators, and introduce a screener using that correlation. Second, we propose a robust partial correlation-based screening approach when an exposure variable is available. To remove the confounding effect of the exposure on both response and each covariate, we use a nonparametric regression with some specified loss function. More specifically, a robust correlation-based screening method (RC-SIS) and a robust partial correlation-based screening framework (RPC-SIS) including two concrete screeners: RPC-SIS(L2) and RPC-SIS(L1), are formed. Third, we establish sure screening properties of RC-SIS for which the response variable can be either continuous or discrete, as well as those of RPC-SIS(L2) and RPC-SIS(L1) under some regularity conditions. Our approaches are essentially nonparametric, and perform robustly for both the response and the covariates. Finally, extensive simulation studies and two applications are carried out to demonstrate the superiority of our proposed approaches.
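The general screening recipe (rank covariates by a robust dependence measure, keep the top few) can be sketched in plain Python. Here a Spearman-type rank correlation stands in for the paper's robust correlation measure, which is defined differently; the toy data are an assumption for illustration.

```python
import random

def ranks(v):
    """Ranks of the entries of v (no tie handling; fine for continuous data)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman rank correlation: robust to outliers and monotone transforms."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    m = (n - 1) / 2
    num = sum((a - m) * (b - m) for a, b in zip(rx, ry))
    den = (sum((a - m) ** 2 for a in rx) * sum((b - m) ** 2 for b in ry)) ** 0.5
    return num / den

def screen(X, y, d):
    """Keep indices of the d covariates with the largest |rank correlation| to y."""
    scores = [abs(spearman(col, y)) for col in X]
    return sorted(range(len(X)), key=lambda j: -scores[j])[:d]

random.seed(1)
n = 300
x0 = [random.gauss(0, 1) for _ in range(n)]                         # active covariate
noise = [[random.gauss(0, 1) for _ in range(n)] for _ in range(9)]  # inactive covariates
y = [v ** 3 + random.gauss(0, 0.1) for v in x0]                     # monotone, heavy-tailed signal
kept = screen([x0] + noise, y, d=2)
```

The cubic signal is heavy-tailed, yet the rank-based screener still places the active covariate first, which is the kind of robustness the paper targets.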

【21】 Combining Online Learning and Offline Learning for Contextual Bandits with Deficient Support 标题:支持不足的情境土匪的线上学习和线下学习相结合

作者:Hung Tran-The,Sunil Gupta,Thanh Nguyen-Tang,Santu Rana,Svetha Venkatesh 机构:Applied Artificial Intelligence Institute, Deakin University, Australia 链接:https://arxiv.org/abs/2107.11533 摘要:我们研究上下文bandit中基于日志数据的策略学习问题。当前的离线策略学习算法大多基于逆倾向得分(IPS)加权,要求日志策略具有完全支持,即对评估策略的任何上下文/操作具有非零概率。然而,现实世界中的许多系统并不能保证这样的日志策略,特别是当操作空间很大,并且许多操作的回报很差或缺失时。由于这种支持不足,离线学习无法找到最优策略。我们提出了一种将离线学习与在线探索相结合的新方法。在线探索用于探索记录数据中不支持的操作,而离线学习用于利用记录数据中支持的操作,以避免不必要的探索。我们的方法使用最少的在线探索次数来确定具有理论保证的最优策略。我们在不同的数据集上用经验证明了我们的算法的有效性。 摘要:We address policy learning with logged data in contextual bandits. Current offline-policy learning algorithms are mostly based on inverse propensity score (IPS) weighting requiring the logging policy to have full support, i.e. a non-zero probability for any context/action of the evaluation policy. However, many real-world systems do not guarantee such logging policies, especially when the action space is large and many actions have poor or missing rewards. With such support deficiency, the offline learning fails to find optimal policies. We propose a novel approach that uses a hybrid of offline learning with online exploration. The online exploration is used to explore unsupported actions in the logged data whilst offline learning is used to exploit supported actions from the logged data avoiding unnecessary explorations. Our approach determines an optimal policy with theoretical guarantees using the minimal number of online explorations. We demonstrate our algorithms' effectiveness empirically on a diverse collection of datasets.
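The IPS weighting and the full-support requirement that the abstract discusses can be made concrete in a few lines. This is a minimal sketch (the tuple format and toy log are illustrative assumptions): it returns the IPS value estimate, or signals that the estimate is not identified when the evaluation policy puts mass on an action the logging policy never plays.

```python
def ips_value(logged, eval_policy, n_actions):
    """Inverse-propensity-score estimate of an evaluation policy's value from
    logged (context, action, reward, logging_prob) tuples. Returns None when
    the logging policy has deficient support for the evaluation policy."""
    total = 0.0
    for context, action, reward, log_prob in logged:
        pi_e = eval_policy(context, action)
        if pi_e > 0 and log_prob == 0:
            return None  # unsupported action: the offline estimate is not identified
        if log_prob > 0:
            total += reward * pi_e / log_prob
    return total / len(logged)

# Toy log: one context, two actions, uniform logging policy (prob 0.5 each)
logged = [(0, a, 1.0 if a == 1 else 0.0, 0.5) for a in (0, 1) for _ in range(50)]
greedy = lambda context, action: 1.0 if action == 1 else 0.0  # evaluation policy
v = ips_value(logged, greedy, n_actions=2)
```

With full support the estimator recovers the greedy policy's value of 1.0; with a zero logging probability on a supported action it falls back to `None`, which is exactly the gap the paper fills with online exploration.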

【22】 Efficient nonparametric estimation of the covariate-adjusted threshold-response function, a support-restricted stochastic intervention 标题:支持受限随机干预的协变量调整阈值-响应函数的有效非参数估计

作者:Lars van der Laan,Wenbo Zhang,Peter B. Gilbert 机构:Divisions of Environmental Health Sciences and Biostatistics, School of Public Health, University of California, Berkeley, California, U.S.A., Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A. 备注:51 pages including supplement, 7 figures, 1 table 链接:https://arxiv.org/abs/2107.11459 摘要:确定一个生物标志物或治疗剂量阈值,标志着一个特定的风险水平是一个重要的问题,特别是在临床试验中。这种风险,视为阈值的函数,并可能调整为协变量,我们称之为阈值响应函数。扩展了Donovan、Hudgens和Gilbert(2019)的工作,我们提出了一种协变量调整阈值响应函数的非参数有效估计方法,该方法利用机器学习和目标最小损失估计(TMLE)。我们还提出了一个更一般的估计,基于序贯回归,也适用于当有结果丢失。我们表明,在随机干预下,给定阈值的阈值反应可被视为预期结果,所有参与者都被给予高于阈值的治疗剂量。证明了该估计是有效的,并刻画了它的渐近分布。给出了同时构造阈值响应函数及其逆函数95%置信带的方法。此外,我们还讨论了当治疗或生物标志物随机缺失时,如何调整我们的估计器,例如在有偏抽样设计的临床试验中,使用逆概率加权。这些方法在一组不同的模拟环境中进行评估,结果罕见,且采用累积病例对照抽样。在CYD14和CYD15登革热疫苗试验中,这些方法用于估计经病毒学证实的登革热风险的中和抗体阈值。 摘要:Identifying a biomarker or treatment-dose threshold that marks a specified level of risk is an important problem, especially in clinical trials. This risk, viewed as a function of thresholds and possibly adjusted for covariates, we call the threshold-response function. Extending the work of Donovan, Hudgens and Gilbert (2019), we propose a nonparametric efficient estimator for the covariate-adjusted threshold-response function, which utilizes machine learning and Targeted Minimum-Loss Estimation (TMLE). We additionally propose a more general estimator, based on sequential regression, that also applies when there is outcome missingness. We show that the threshold-response for a given threshold may be viewed as the expected outcome under a stochastic intervention where all participants are given a treatment dose above the threshold. We prove the estimator is efficient and characterize its asymptotic distribution. A method to construct simultaneous 95% confidence bands for the threshold-response function and its inverse is given. 
Furthermore, we discuss how to adjust our estimator when the treatment or biomarker is missing-at-random, as is the case in clinical trials with biased sampling designs, using inverse-probability-weighting. The methods are assessed in a diverse set of simulation settings with rare outcomes and cumulative case-control sampling. The methods are employed to estimate neutralizing antibody thresholds for virologically confirmed dengue risk in the CYD14 and CYD15 dengue vaccine trials.
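Stripped of the covariate adjustment and TMLE machinery, the threshold-response at a threshold v is just the mean outcome among participants whose dose (or biomarker) is at least v. The sketch below shows only this unadjusted empirical version, with made-up data; the paper's estimator is far more involved.

```python
def threshold_response(doses, outcomes, v):
    """Unadjusted empirical threshold-response: mean outcome among
    participants with dose >= v. Returns None if nobody qualifies."""
    selected = [y for s, y in zip(doses, outcomes) if s >= v]
    return sum(selected) / len(selected) if selected else None

# Hypothetical antibody titers and binary disease outcomes (1 = event)
doses = [0.1, 0.4, 0.5, 0.8, 0.9, 1.2]
outcomes = [1, 1, 0, 0, 0, 0]
risk_all = threshold_response(doses, outcomes, 0.0)    # overall risk
risk_high = threshold_response(doses, outcomes, 0.6)   # risk above threshold 0.6
```

Sweeping v traces out the (unadjusted) threshold-response curve; inverting it gives the threshold marking a specified risk level, which is the clinical quantity of interest.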

【23】 Multipartition model for multiple change point identification 标题:多变点识别的多重划分模型

作者:Ricardo C. Pedroso,Rosangela H. Loschi,Fernando Andrés Quintana 机构:Departamento de Estatística, Universidade Federal de Minas Gerais; Departamento de Estadística, Pontificia Universidad Católica de Chile; Millennium Nucleus Center for the Discovery of Structures in Complex Data 备注:60 pages, 33 figures 链接:https://arxiv.org/abs/2107.11456 摘要:多变化点问题的主要目标之一是估计变化点的数目和位置,以及由这些变化引起的集群中的制度结构。产品划分模型(PPM)是一种广泛应用的多变化点检测方法。传统的PPM假设变化点将时间点集分割成随机的簇,这些簇定义了时间轴的分区。然后通常假设这些块中的每个块内的采样模型参数值是相同的。由于观测模型的不同参数可能在不同时间发生变化,因此PPM无法识别经历这些变化的参数。当检测多变量时间序列的变化时,可能会出现类似的问题。为了解决这个重要的限制,我们引入了一个多部分模型来检测多个参数在可能不同的时间发生的多个变化点。该模型假设每个参数所经历的变化会产生不同的随机时间轴划分,这有助于识别哪些参数发生了变化以及何时发生了变化。我们讨论了一个部分折叠的Gibbs采样器方案来实现后验模拟。我们应用所提出的模型来识别多个正态均值和方差的变化点,并通过montecarlo模拟和数据说明来评估所提出的模型的性能。它的性能比较了一些以前提出的方法的变化点问题。研究结果表明,该模型具有较强的竞争性,丰富了变点问题的分析。 摘要:Among the main goals in multiple change point problems are the estimation of the number and positions of the change points, as well as the regime structure in the clusters induced by those changes. The product partition model (PPM) is a widely used approach for the detection of multiple change points. The traditional PPM assumes that change points split the set of time points in random clusters that define a partition of the time axis. It is then typically assumed that sampling model parameter values within each of these blocks are identical. Because changes in different parameters of the observational model may occur at different times, the PPM thus fails to identify the parameters that experienced those changes. A similar problem may occur when detecting changes in multivariate time series. To solve this important limitation, we introduce a multipartition model to detect multiple change points occurring in several parameters at possibly different times.
The proposed model assumes that the changes experienced by each parameter generate a different random partition of the time axis, which facilitates identifying which parameters have changed and when they do so. We discuss a partially collapsed Gibbs sampler scheme to implement posterior simulation under the proposed model. We apply the proposed model to identify multiple change points in Normal means and variances and evaluate the performance of the proposed model through Monte Carlo simulations and data illustrations. Its performance is compared with some previously proposed approaches for change point problems. These studies show that the proposed model is competitive and enriches the analysis of change point problems.
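The basic single-change-point-in-the-mean problem underlying this work can be illustrated with a CUSUM-style scan (this frequentist sketch is only a stand-in for the paper's Bayesian multipartition model; the simulated series is an assumption).

```python
import random

def best_split(x):
    """Locate a single change point in the mean by maximizing a CUSUM-type
    statistic over all split positions t."""
    n = len(x)
    total = sum(x)
    best_t, best_stat = None, 0.0
    left = 0.0
    for t in range(1, n):
        left += x[t - 1]
        m1, m2 = left / t, (total - left) / (n - t)
        stat = (t * (n - t) / n) * (m1 - m2) ** 2
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat

random.seed(2)
# Mean shifts from 0 to 4 at time 100; variance stays constant
x = [random.gauss(0, 1) for _ in range(100)] + [random.gauss(4, 1) for _ in range(100)]
t_hat, _ = best_split(x)
```

The paper's point is that the mean and the variance may change at *different* times; a multipartition model would run one such partition per parameter rather than forcing a single shared one.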

【24】 Finite-time Analysis of Globally Nonstationary Multi-Armed Bandits 标题:全局非平稳多臂土匪的有限时间分析

作者:Junpei Komiyama,Edouard Fouché,Junya Honda 链接:https://arxiv.org/abs/2107.11419 摘要:我们考虑非平稳多臂bandit问题,其中臂的模型参数随时间变化。本文介绍了自适应重置bandit(ADR-bandit)算法,它是一类利用数据流社区自适应窗口技术的bandit算法。我们首先对自适应加窗技术产生的估计器的质量提供了新的保证,这些结果在数据挖掘领域本身也具有独立的意义。此外,我们对ADR-bandit在两种典型环境下进行了有限时间分析:变化瞬间发生的突变环境和变化逐渐发生的渐变环境。我们证明了当突变或渐变以一种我们称之为全局变化的协调方式发生时,ADR-bandit具有接近最优的性能。我们证明,当我们把兴趣局限于全局变化时,强制探索是不必要的。与现有的非平稳bandit算法不同,ADR-bandit在平稳环境和具有全局变化的非平稳环境中都具有最优的性能。实验结果表明,该算法在合成环境和真实环境中的性能均优于现有算法。 摘要:We consider nonstationary multi-armed bandit problems where the model parameters of the arms change over time. We introduce the adaptive resetting bandit (ADR-bandit), which is a class of bandit algorithms that leverages adaptive windowing techniques from the data stream community. We first provide new guarantees on the quality of estimators resulting from adaptive windowing techniques, which are of independent interest in the data mining community. Furthermore, we conduct a finite-time analysis of ADR-bandit in two typical environments: an abrupt environment where changes occur instantaneously and a gradual environment where changes occur progressively. We demonstrate that ADR-bandit has nearly optimal performance when the abrupt or global changes occur in a coordinated manner that we call global changes. We demonstrate that forced exploration is unnecessary when we restrict the interest to the global changes. Unlike the existing nonstationary bandit algorithms, ADR-bandit has optimal performance in stationary environments as well as nonstationary environments with global changes. Our experiments show that the proposed algorithms outperform the existing approaches in synthetic and real-world environments.
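The adaptive-windowing idea borrowed from the data stream community can be sketched in an ADWIN-flavoured way: grow a window of recent rewards and reset it when its two halves disagree. This is a heavily simplified stand-in for ADR-bandit (the threshold, half-size, and simulated reward stream are all illustrative assumptions).

```python
import random

def adaptive_window(stream, threshold=1.0, min_half=20):
    """Keep a growing window of recent rewards; when the means of its two
    halves differ by more than `threshold`, drop the stale half (a reset)."""
    window, resets = [], []
    for i, r in enumerate(stream):
        window.append(r)
        h = len(window) // 2
        if h >= min_half:
            old, new = window[:h], window[h:]
            if abs(sum(old) / len(old) - sum(new) / len(new)) > threshold:
                window = new          # the arm's mean changed: forget old data
                resets.append(i)
    return resets

random.seed(3)
# Reward mean jumps from 0 to 3 at time 200 (an abrupt, global change)
stream = [random.gauss(0, 0.5) for _ in range(200)] + [random.gauss(3, 0.5) for _ in range(200)]
resets = adaptive_window(stream)
```

A bandit algorithm wrapped around such a detector restarts its estimates after each reset, which is the mechanism that lets ADR-bandit track abrupt global changes without permanent forced exploration.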

【25】 A principled (and practical) test for network comparison 标题:网络比较的原则性(和实践性)测试

作者:Gecia Bravo Hermsdorff,Lee M. Gunderson,Pierre-André Maugis,Carey E. Priebe 机构:Gatsby Computational Neuroscience Unit, University College London, London, W,T ,JG, United Kingdom, Pierre-Andr´e Maugis, Google Research, Z¨urich, Switzerland, Department of Applied Mathematics and Statistics, Whiting School of Engineering, Johns Hopkins University 链接:https://arxiv.org/abs/2107.11403 摘要:人们如何检验从同一分布中取样的图形的假设?在这里,我们比较两个统计测试,解决这个问题。第一种方法使用观察到的子图密度本身作为基础分布密度的估计。第二个测试使用一种新的方法,将这些子图密度转换为分布的图累积量的估计。我们通过理论、模拟和实际数据的应用证明了使用图累积量的优越统计能力。 摘要:How might one test the hypothesis that graphs were sampled from the same distribution? Here, we compare two statistical tests that address this question. The first uses the observed subgraph densities themselves as estimates of those of the underlying distribution. The second test uses a new approach that converts these subgraph densities into estimates of the graph cumulants of the distribution. We demonstrate -- via theory, simulation, and application to real data -- the superior statistical power of using graph cumulants.
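To make the two ingredients concrete, the sketch below computes subgraph densities (edges and wedges) from an adjacency matrix and a cumulant-style statistic that follows the usual moments-minus-products pattern: wedge density minus squared edge density, which vanishes under independent-edge models. This is a hedged illustration; the paper's graph cumulants are defined more carefully.

```python
from itertools import combinations

def edge_density(A):
    """Fraction of vertex pairs joined by an edge."""
    n = len(A)
    return sum(A[i][j] for i, j in combinations(range(n), 2)) / (n * (n - 1) // 2)

def wedge_density(A):
    """Fraction of realized 2-paths (wedges) out of the 3*C(n,3) possible ones."""
    n = len(A)
    deg = [sum(row) for row in A]
    wedges = sum(d * (d - 1) // 2 for d in deg)
    return wedges / (3 * (n * (n - 1) * (n - 2) // 6))

def wedge_cumulant(A):
    """Cumulant-style statistic: wedge density minus squared edge density
    (zero when edges are placed independently, Erdos-Renyi style)."""
    return wedge_density(A) - edge_density(A) ** 2

# Path graph 0-1-2-3: fewer wedges than an independent-edge model would predict
path4 = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
kappa = wedge_cumulant(path4)
```

For the 4-vertex path the statistic is negative (2 wedges where edge-independence would predict 3), illustrating how cumulant-type quantities isolate structure beyond raw subgraph counts.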

【26】 Uncertainty-Aware Time-to-Event Prediction using Deep Kernel Accelerated Failure Time Models 标题:基于深度核加速故障时间模型的不确定性感知事件间隔时间预测

作者:Zhiliang Wu,Yinchong Yang,Peter A. Fasching,Volker Tresp 机构:Ludwig Maximilians University Munich, Siemens AG, Technology, Munich, Department of Gynecology and Obstetrics, University Hospital Erlangen, Erlangen 链接:https://arxiv.org/abs/2107.12250 摘要:基于递归神经网络的解决方案正越来越多地用于纵向电子病历数据的分析。然而,目前的研究大多集中在预测精度上,而忽略了预测的不确定性。我们提出了事件时间预测任务的深核加速失效时间模型,通过递归神经网络和稀疏高斯过程的流水线实现了预测的不确定性感知。此外,采用基于深度度量学习的预训练步骤对模型进行了改进。在两个真实数据集上的实验表明,我们的模型比基于递归神经网络的基线具有更好的点估计性能。更重要的是,我们模型的预测方差可以用来量化对事件时间预测的不确定性估计:当我们的模型对其预测更有信心时,它可以提供更好的性能。与蒙特卡罗法等相关方法相比,我们的模型利用解析解提供了更好的不确定性估计,计算效率更高。 摘要:Recurrent neural network based solutions are increasingly being used in the analysis of longitudinal Electronic Health Record data. However, most works focus on prediction accuracy and neglect prediction uncertainty. We propose Deep Kernel Accelerated Failure Time models for the time-to-event prediction task, enabling uncertainty-awareness of the prediction by a pipeline of a recurrent neural network and a sparse Gaussian Process. Furthermore, a deep metric learning based pre-training step is adapted to enhance the proposed model. Our model shows better point estimate performance than recurrent neural network based baselines in experiments on two real-world datasets. More importantly, the predictive variance from our model can be used to quantify the uncertainty estimates of the time-to-event prediction: Our model delivers better performance when it is more confident in its prediction. Compared to related methods, such as Monte Carlo Dropout, our model offers better uncertainty estimates by leveraging an analytical solution and is more computationally efficient.

【27】 Are Bayesian neural networks intrinsically good at out-of-distribution detection? 标题:贝叶斯神经网络本质上擅长非分布检测吗?

作者:Christian Henning,Francesco D'Angelo,Benjamin F. Grewe 机构:Institute of Neuroinformatics, University of Zürich and ETH Zürich 备注:Published at UDL Workshop, ICML 2021 链接:https://arxiv.org/abs/2107.12248 摘要:避免对不熟悉的数据进行自信预测的需求引发了对分布外(OOD)检测的兴趣。人们普遍认为,贝叶斯神经网络(BNN)非常适合这项任务,因为被赋予的认知不确定性会导致对异常值的预测不一致。在本文中,我们质疑这一假设,并提供经验证据表明,适当的贝叶斯推理与常见的神经网络结构不一定导致良好的OOD检测。为了避免使用近似推理,我们首先研究了无限宽情况,其中贝叶斯推理可以精确地考虑相应的高斯过程。引人注目的是,在通用架构选择下产生的内核导致了不确定性,这些不确定性不能反映底层数据生成过程,因此不适合用于OOD检测。最后,我们利用HMC研究了有限宽度网络,观察到了与无限宽度情况一致的OOD行为。总的来说,我们的研究揭示了单纯使用BNNs进行OOD检测时的基本问题,并为将来的研究开辟了有趣的途径。 摘要:The need to avoid confident predictions on unfamiliar data has sparked interest in out-of-distribution (OOD) detection. It is widely assumed that Bayesian neural networks (BNN) are well suited for this task, as the endowed epistemic uncertainty should lead to disagreement in predictions on outliers. In this paper, we question this assumption and provide empirical evidence that proper Bayesian inference with common neural network architectures does not necessarily lead to good OOD detection. To circumvent the use of approximate inference, we start by studying the infinite-width case, where Bayesian inference can be exact considering the corresponding Gaussian process. Strikingly, the kernels induced under common architectural choices lead to uncertainties that do not reflect the underlying data generating process and are therefore unsuited for OOD detection. Finally, we study finite-width networks using HMC, and observe OOD behavior that is consistent with the infinite-width case. Overall, our study discloses fundamental problems when naively using BNNs for OOD detection and opens interesting avenues for future research.

【28】 Journal subject classification: intra- and inter-system discrepancies in Web Of Science and Scopus 标题:期刊主题分类:Web of Science和Scopus的系统内和系统间差异

作者:Shir Aviv-Reuven,Ariel Rosenfeld 机构:Department of Information Sciences, Bar-Ilan University, Israel 备注:25 pages, 20 figures, 3 tables 链接:https://arxiv.org/abs/2107.12222 摘要:期刊的学科分类是学术研究评价和文献计量分析的一个重要方面。期刊分类系统使用各种(部分)重叠和非详尽的主题类别,这导致许多期刊被分为多个单一的主题类别。因此,在任何给定系统内和不同系统之间都可能遇到差异。在这项研究中,我们将检验两种最广泛使用的索引系统——科学网和Scopus中的差异。我们使用已知的距离度量,以及逻辑集理论来检验和比较这些系统定义的分类方案。我们的结果表明,在每个系统中,分类类别的数量越多,其范围越大,差异越大,并且发现了冗余类别。我们的结果也显示了两个系统之间的显著差异。具体来说,一个系统中很少有类别与第二个系统中的类别“相似”,其中“相似性”是通过子集和有趣类别以及最小覆盖类别来衡量的。综合来看,我们的研究结果表明,这两种类型的差异是系统性的,在依赖这些学科分类系统时不能轻易忽视。 摘要:Journal classification into subject categories is an important aspect in scholarly research evaluation as well as in bibliometric analysis. Journal classification systems use a variety of (partially) overlapping and non-exhaustive subject categories which results in many journals being classified into more than a single subject category. As such, discrepancies are likely to be encountered within any given system and between different systems. In this study, we set to examine both types of discrepancies in the two most widely used indexing systems - Web Of Science and Scopus. We use known distance measures, as well as logical set theory to examine and compare the category schemes defined by these systems. Our results demonstrate significant discrepancies within each system where a higher number of classified categories correlates with increased range and variance of rankings within them, and where redundant categories are found. Our results also show significant discrepancies between the two system. Specifically, very few categories in one system are "similar" to categories in the second system, where "similarity" is measured by subset & interesting categories and minimally covering categories. Taken jointly, our findings suggest that both types of discrepancies are systematic and cannot be easily disregarded when relying on these subject classification systems.
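A standard distance for comparing a journal's category assignments across two indexing systems is the Jaccard similarity of the two category sets; the sketch below shows this with hypothetical category labels (the paper's exact measures, e.g. subset and minimally-covering relations, go further).

```python
def jaccard(a, b):
    """Set similarity: |intersection| / |union|; 1.0 for two empty sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a or b else 1.0

# Hypothetical subject categories for one journal in two indexing systems
wos_cats = {"Statistics & Probability", "Mathematics, Interdisciplinary"}
scopus_cats = {"Statistics & Probability", "Applied Mathematics", "Modeling"}
sim = jaccard(wos_cats, scopus_cats)
```

A value as low as 0.25 for the same journal illustrates the inter-system discrepancy the study quantifies at scale.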

【29】 Predicting Influential Higher-Order Patterns in Temporal Network Data 标题:时态网络数据中有影响力的高阶模式预测

作者:Christoph Gote,Vincenzo Perri,Ingo Scholtes 机构:Chair of Systems Design, ETH Zurich, Zurich, Switzerland 备注:28 pages, 7 figures, 2 tables 链接:https://arxiv.org/abs/2107.12100 摘要:网络经常被用来模拟由相互作用的元素组成的复杂系统。虽然链路捕捉到直接交互的拓扑结构,但许多系统的真正复杂性源于路径中的高阶模式,通过这些模式节点可以间接地相互影响。路径数据表示连续直接交互的有序序列,可以用来建模这些模式。然而,为了避免过度拟合,这样的模型应该只考虑数据提供足够的统计证据的那些高阶模式。另一方面,我们假设仅捕捉直接交互的网络模型会欠拟合数据中存在的高阶模式。因此,这两种方法都可能误判复杂网络中有影响的节点。我们在MOGen模型的基础上提出了八个中心性度量,这是一个多阶生成模型,它可以计算到最大距离的所有路径,但忽略了更大距离的路径。在一个预测实验中,我们将基于MOGen的中心性与网络模型和路径数据的等效度量进行了比较,目的是在样本外数据中识别出有影响的节点。 摘要:Networks are frequently used to model complex systems comprised of interacting elements. While links capture the topology of direct interactions, the true complexity of many systems originates from higher-order patterns in paths by which nodes can indirectly influence each other. Path data, representing ordered sequences of consecutive direct interactions, can be used to model these patterns. However, to avoid overfitting, such models should only consider those higher-order patterns for which the data provide sufficient statistical evidence. On the other hand, we hypothesise that network models, which capture only direct interactions, underfit higher-order patterns present in data. Consequently, both approaches are likely to misidentify influential nodes in complex networks. We contribute to this issue by proposing eight centrality measures based on MOGen, a multi-order generative model that accounts for all paths up to a maximum distance but disregards paths at higher distances.
Our results show strong evidence supporting our hypothesis. MOGen consistently outperforms both the network model and path-based prediction. We further show that the performance difference between MOGen and the path-based approach disappears if we have sufficient observations, confirming that the error is due to overfitting.
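The difference between a network model and a higher-order path model can be seen by counting transitions of different orders from path data (a minimal sketch with made-up paths; MOGen itself combines several such orders in one generative model).

```python
from collections import Counter

# Observed paths: walkers from "a" always continue to "c", walkers from "x" to "y"
paths = [("a", "b", "c"), ("a", "b", "c"), ("x", "b", "y"), ("x", "b", "y")]

# First order: P(next | current) -- the memoryless network model
first = Counter((p[i], p[i + 1]) for p in paths for i in range(len(p) - 1))

# Second order: P(next | previous, current) -- remembers where a walker came from
second = Counter(((p[i], p[i + 1]), p[i + 2]) for p in paths for i in range(len(p) - 2))
```

The first-order model sees b -> c and b -> y as equally likely and would route traffic from "a" to "y" half the time; the second-order counts reveal that this transition never occurs, which is exactly the higher-order pattern a network model underfits.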

【30】 Workpiece Image-based Tool Wear Classification in Blanking Processes Using Deep Convolutional Neural Networks 标题:基于深度卷积神经网络的冲裁过程刀具磨损图像分类

作者:Dirk Alexander Molitor,Christian Kubik,Ruben Helmut Hetfleisch,Peter Groche 机构:Institute for Production Engineering, and Forming Machines, Technical University of Darmstadt, Germany, Darmstadt 链接:https://arxiv.org/abs/2107.12034 摘要:冲裁工艺由于其经济性,属于应用最广泛的制造工艺。它们的经济可行性在很大程度上取决于最终产品质量和相关的客户满意度以及可能的停机时间。特别是,刀具磨损增加会降低产品质量并导致停机,这就是近年来在磨损检测方面进行大量研究的原因。基于力和加速度信号的过程监测已得到广泛应用,本文提出了一种新的方法。对16种不同磨损状态的冲床冲裁件进行了拍照,并将其作为深度卷积神经网络的输入对磨损状态进行分类。结果表明,该方法能准确地预测刀具的磨损状态,为刀具磨损监测开辟了新的可能性和研究机会。 摘要:Blanking processes belong to the most widely used manufacturing techniques due to their economic efficiency. Their economic viability depends to a large extent on the resulting product quality and the associated customer satisfaction as well as on possible downtimes. In particular, the occurrence of increased tool wear reduces the product quality and leads to downtimes, which is why considerable research has been carried out in recent years with regard to wear detection. While processes have widely been monitored based on force and acceleration signals, a new approach is pursued in this paper. Blanked workpieces manufactured by punches with 16 different wear states are photographed and then used as inputs for Deep Convolutional Neural Networks to classify wear states. The results show that wear states can be predicted with surprisingly high accuracy, opening up new possibilities and research opportunities for tool wear monitoring of blanking processes.

【31】 Tsallis and Rényi deformations linked via a new λ-duality 标题:Tsallis和Rényi变形通过一个新的λ-对偶联系在一起

作者:Ting-Kam Leonard Wong,Jun Zhang 备注:38 pages, 7 figures 链接:https://arxiv.org/abs/2107.11925 摘要:Tsallis熵和Rényi熵互为单调变换,它们将经典的Shannon熵和概率分布的指数族推广到非广延统计物理、信息论和统计学。尽管如此,$q$-指数族作为一个具有减法正规化的变形指数族,仍然反映了凸函数的经典勒让德对偶性以及相关的Bregman散度概念。本文证明了广义$\lambda$-对偶(其中$\lambda=1-q$是常数信息几何曲率)导出了一个具有除法正规化的变形指数族,并与Rényi熵和最优传输相联系。我们的$\lambda$-对偶将这两个仅相差一个重新参数化的变形模型统一起来,并为研究其背后的数学结构提供了一个优雅而深刻的框架。利用这种对偶性(在其下Rényi熵和Rényi散度自然出现),$\lambda$-指数族满足与指数族相平行并加以推广的性质。特别地,我们给出了$q$-指数族的Tsallis熵最大化性质的一个新证明。我们还引入了一个$\lambda$-混合族,它可以看作$\lambda$-指数族的对偶。最后,我们讨论了$\lambda$-指数族和对数散度之间的对偶关系,并研究其统计推论。 摘要:Tsallis and Rényi entropies, which are monotone transformations of each other, generalize the classical Shannon entropy and the exponential family of probability distributions to non-extensive statistical physics, information theory, and statistics. The $q$-exponential family, as a deformed exponential family with subtractive normalization, nevertheless reflects the classical Legendre duality of convex functions as well as the associated concept of Bregman divergence. In this paper we show that a generalized $\lambda$-duality, where $\lambda = 1 - q$ is the constant information-geometric curvature, induces a deformed exponential family with divisive normalization and links to Rényi entropy and optimal transport. Our $\lambda$-duality unifies the two deformation models, which differ by a mere reparameterization, and provides an elegant and deep framework to study the underlying mathematical structure. Using this duality, under which the Rényi entropy and divergence appear naturally, the $\lambda$-exponential family satisfies properties that parallel and generalize those of the exponential family. In particular, we give a new proof of the Tsallis entropy maximizing property of the $q$-exponential family. We also introduce a $\lambda$-mixture family which may be regarded as the dual of the $\lambda$-exponential family.
Finally, we discuss a duality between the $\lambda$-exponential family and the logarithmic divergence, and study its statistical consequences.
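The deformed exponential and logarithm behind the $q$-exponential family, and the Tsallis entropy itself, have simple closed forms; the sketch below implements them and the standard fact that the $q \to 1$ limit recovers the ordinary exp/log and the Shannon entropy.

```python
from math import log, exp

def log_q(x, q):
    """q-logarithm: (x^(1-q) - 1)/(1-q); recovers log(x) as q -> 1."""
    return log(x) if abs(q - 1) < 1e-12 else (x**(1 - q) - 1) / (1 - q)

def exp_q(x, q):
    """q-exponential, the inverse of log_q on its domain."""
    if abs(q - 1) < 1e-12:
        return exp(x)
    base = 1 + (1 - q) * x
    return base ** (1 / (1 - q)) if base > 0 else 0.0

def tsallis_entropy(p, q):
    """S_q(p) = (1 - sum p_i^q)/(q - 1); tends to Shannon entropy as q -> 1."""
    if abs(q - 1) < 1e-12:
        return -sum(pi * log(pi) for pi in p if pi > 0)
    return (1 - sum(pi**q for pi in p)) / (q - 1)

p = [0.5, 0.25, 0.25]  # a toy probability vector
```

With $\lambda = 1 - q$, `exp_q` is the building block of the paper's $\lambda$-exponential family (there with divisive rather than subtractive normalization).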

【32】 A brief note on understanding neural networks as Gaussian processes 标题:关于将神经网络理解为高斯过程的一点注记

作者:Mengwu Guo 机构:Applied Analysis, Department of Applied Mathematics, University of Twente 链接:https://arxiv.org/abs/2107.11892 摘要:作为[Lee et al.,2017]工作的推广,本文简要讨论了神经网络输出的先验何时遵循高斯过程,以及神经网络诱导的高斯过程是如何形成的。这种高斯过程回归的后验平均函数位于神经网络诱导核定义的再生核Hilbert空间。在两层神经网络的情况下,诱导高斯过程提供了再生核Hilbert空间的解释,其并集形成Barron空间。 摘要:As a generalization of the work in [Lee et al., 2017], this note briefly discusses when the prior of a neural network output follows a Gaussian process, and how a neural-network-induced Gaussian process is formulated. The posterior mean functions of such a Gaussian process regression lie in the reproducing kernel Hilbert space defined by the neural-network-induced kernel. In the case of two-layer neural networks, the induced Gaussian processes provide an interpretation of the reproducing kernel Hilbert spaces whose union forms a Barron space.
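For a one-hidden-layer ReLU network with standard-normal weights, the induced kernel has a known closed form, the degree-1 arc-cosine kernel; the sketch below implements it (a minimal version: the weight and bias variance scalings are fixed to 1 and 0 here).

```python
from math import acos, sin, cos, pi, sqrt

def nngp_relu_kernel(x, y):
    """Kernel of the Gaussian process induced by an infinitely wide
    one-hidden-layer ReLU network: the degree-1 arc-cosine kernel
    k(x, y) = ||x|| ||y|| (sin t + (pi - t) cos t) / (2 pi),  t = angle(x, y)."""
    nx = sqrt(sum(v * v for v in x))
    ny = sqrt(sum(v * v for v in y))
    ctheta = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, y)) / (nx * ny)))
    theta = acos(ctheta)
    return nx * ny * (sin(theta) + (pi - theta) * cos(theta)) / (2 * pi)

x = [1.0, 0.0]
k_xx = nngp_relu_kernel(x, x)   # equals ||x||^2 / 2 for ReLU units
```

Using this kernel in ordinary GP regression gives posterior means that live in the reproducing kernel Hilbert space discussed in the note.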

【33】 Adaptive Estimation and Uniform Confidence Bands for Nonparametric IV 标题:非参数IV的自适应估计和一致置信带

作者:Xiaohong Chen,Timothy Christensen,Sid Kankanala 备注:The data-driven choice of sieve dimension in this paper is based on and supersedes Section 3 of the preprint arXiv:1508.03365v1 链接:https://arxiv.org/abs/2107.11869 摘要:我们介绍了一个简单的计算,数据驱动的程序估计和推断的结构函数$h_0$及其导数在非参数模型使用工具变量。我们的第一个程序是基于bootstrap的,数据驱动的筛子非参数工具变量(NPIV)估计的筛子维数选择。当采用这种数据驱动的选择时,$h_0$及其导数的筛子NPIV估计量是自适应的:它们收敛于尽可能好的(即极小极大)超范数率,而不必知道$h_0$的平滑度、回归系数的内生性程度或工具强度。我们的第二个过程是数据驱动的方法,用于为$h_0$及其导数构造诚实的自适应一致置信带(ucb)。我们的数据驱动UCB保证了$h_0$及其衍生产品在一类通用的数据生成过程(诚实)上的覆盖率,并在minimax sup norm rate(自适应)上或在minimax sup norm rate(自适应)的对数因子内收缩。因此,我们的数据驱动的ucb相对于通常的欠光滑方法构造的ucb提供渐近效率增益。另外,这两种方法都作为特例适用于非参数回归。我们使用我们的程序来估计和推断企业出口密集边际的非参数引力方程,并找到反对未观察到的企业生产率分布的常见参数化的证据。 摘要:We introduce computationally simple, data-driven procedures for estimation and inference on a structural function $h_0$ and its derivatives in nonparametric models using instrumental variables. Our first procedure is a bootstrap-based, data-driven choice of sieve dimension for sieve nonparametric instrumental variables (NPIV) estimators. When implemented with this data-driven choice, sieve NPIV estimators of $h_0$ and its derivatives are adaptive: they converge at the best possible (i.e., minimax) sup-norm rate, without having to know the smoothness of $h_0$, degree of endogeneity of the regressors, or instrument strength. Our second procedure is a data-driven approach for constructing honest and adaptive uniform confidence bands (UCBs) for $h_0$ and its derivatives. Our data-driven UCBs guarantee coverage for $h_0$ and its derivatives uniformly over a generic class of data-generating processes (honesty) and contract at, or within a logarithmic factor of, the minimax sup-norm rate (adaptivity). As such, our data-driven UCBs deliver asymptotic efficiency gains relative to UCBs constructed via the usual approach of undersmoothing. In addition, both our procedures apply to nonparametric regression as a special case. 
We use our procedures to estimate and perform inference on a nonparametric gravity equation for the intensive margin of firm exports and find evidence against common parameterizations of the distribution of unobserved firm productivity.
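In the degenerate sieve with a single basis function for both the structural function and the instrument, the NPIV estimator collapses to the classical IV ratio Cov(z, y)/Cov(z, x); the sketch below shows this simplest case on simulated data with an unobserved confounder (the data-generating process is an illustrative assumption, not from the paper).

```python
import random

def iv_estimate(y, x, z):
    """Simplest one-dimensional sieve: the classical IV slope Cov(z,y)/Cov(z,x)."""
    n = len(y)
    my, mx, mz = sum(y) / n, sum(x) / n, sum(z) / n
    czy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
    czx = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    return czy / czx

random.seed(4)
n = 5000
u = [random.gauss(0, 1) for _ in range(n)]                    # unobserved confounder
z = [random.gauss(0, 1) for _ in range(n)]                    # instrument
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]    # endogenous regressor
y = [2 * xi + ui for xi, ui in zip(x, u)]                     # true structural slope is 2
beta = iv_estimate(y, x, z)
```

OLS would be biased upward here because of the confounder; the instrument restores consistency. The paper's contribution is choosing the sieve dimension (how many basis functions replace this single one) in a data-driven, adaptive way.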

【34】 Graph Representation Learning on Tissue-Specific Multi-Omics 标题:面向特定组织的多元有机体的图形表示学习

作者:Amine Amor,Pietro Lio',Vikash Singh,Ramon Viñas Torné,Helena Andres Terre 机构:Department of Computer Science and Technology, University of Cambridge 备注:This paper was accepted at the 2021 ICML Workshop on Computational Biology 链接:https://arxiv.org/abs/2107.11856 摘要:结合不同形式的人体组织数据对于推进生物医学研究和个性化医疗保健至关重要。在这项研究中,我们利用图嵌入模型(即VGAE)对组织特异性基因-基因相互作用(GGI)网络进行链接预测。通过消融实验,我们证明了多种生物学模式(即多组学)的结合可以产生强大的嵌入和更好的链接预测性能。我们的评估表明,整合基因甲基化图谱和RNA测序数据显著提高了链接预测性能。总的来说,RNA测序和基因甲基化数据的结合使得GGI网络的链接预测准确率达到71%。通过利用多组学数据的图表示学习,我们的工作为生物信息学中多组学集成的当前文献带来了新的见解。 摘要:Combining different modalities of data from human tissues has been critical in advancing biomedical research and personalised medical care. In this study, we leverage a graph embedding model (i.e VGAE) to perform link prediction on tissue-specific Gene-Gene Interaction (GGI) networks. Through ablation experiments, we prove that the combination of multiple biological modalities (i.e multi-omics) leads to powerful embeddings and better link prediction performances. Our evaluation shows that the integration of gene methylation profiles and RNA-sequencing data significantly improves the link prediction performance. Overall, the combination of RNA-sequencing and gene methylation data leads to a link prediction accuracy of 71% on GGI networks. By harnessing graph representation learning on multi-omics data, our work brings novel insights to the current literature on multi-omics integration in bioinformatics.

【35】 SGD May Never Escape Saddle Points 标题:SGD可能永远不会逃脱鞍点

作者:Liu Ziyin,Botao Li,Masahito Ueda 机构:Department of Physics, University of Tokyo; Laboratoire de Physique de l'École normale supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris-Diderot, Sorbonne Paris Cité; Institute for Physics of Intelligence, University of Tokyo 链接:https://arxiv.org/abs/2107.11774 摘要:随机梯度下降法(SGD)被用来求解高度非线性和非凸的机器学习问题,例如深度神经网络的训练。然而,以往关于SGD的研究往往依赖于对SGD中噪声性质的高度限制性和不切实际的假设。在这项工作中,我们通过数学构造给出了违背以往对SGD认识的例子。例如,我们的构造表明:(1) SGD可能收敛到局部极大值;(2) SGD可能以任意慢的速度逃离鞍点;(3) SGD可能更偏好尖锐的极小值而非平坦的极小值;(4) AMSGrad可能收敛到局部极大值。我们的结果表明,在神经网络训练中,SGD的噪声结构可能比损失地形更为重要,未来的研究应当致力于推导深度学习中实际的噪声结构。 摘要:Stochastic gradient descent (SGD) has been deployed to solve highly non-linear and non-convex machine learning problems such as the training of deep neural networks. However, previous works on SGD often rely on highly restrictive and unrealistic assumptions about the nature of noise in SGD. In this work, we mathematically construct examples that defy previous understandings of SGD. For example, our constructions show that: (1) SGD may converge to a local maximum; (2) SGD may escape a saddle point arbitrarily slowly; (3) SGD may prefer sharp minima over the flat ones; and (4) AMSGrad may converge to a local maximum. Our result suggests that the noise structure of SGD might be more important than the loss landscape in neural network training and that future research should focus on deriving the actual noise structure in deep learning.

【36】 Federated Causal Inference in Heterogeneous Observational Data 标题:异质观测数据中的联合因果推理

作者:Ruoxuan Xiong,Allison Koenecke,Michael Powell,Zhu Shen,Joshua T. Vogelstein,Susan Athey 链接:https://arxiv.org/abs/2107.11732 摘要:分析来自多个来源的观测数据有助于提高检测治疗效果的统计能力;然而,隐私考虑等实际约束可能会限制跨数据集的个人级信息共享。本文提出了只利用异构数据集摘要级信息的联邦方法。我们的联合方法提供了治疗效果的双鲁棒点估计以及方差估计。我们得到了我们的联邦估计的渐近分布,它被证明是渐近等价于相应的估计从组合,个人水平的数据。我们表明,为了实现这些属性,联邦方法应该根据诸如模型是否正确指定以及跨异构数据集的稳定性等条件进行调整。 摘要:Analyzing observational data from multiple sources can be useful for increasing statistical power to detect a treatment effect; however, practical constraints such as privacy considerations may restrict individual-level information sharing across data sets. This paper develops federated methods that only utilize summary-level information from heterogeneous data sets. Our federated methods provide doubly-robust point estimates of treatment effects as well as variance estimates. We derive the asymptotic distributions of our federated estimators, which are shown to be asymptotically equivalent to the corresponding estimators from the combined, individual-level data. We show that to achieve these properties, federated methods should be adjusted based on conditions such as whether models are correctly specified and stable across heterogeneous data sets.
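The core mechanic of summary-level federation can be illustrated with inverse-variance weighting of site-level (estimate, variance) pairs, as in a fixed-effect meta-analysis; this is only a stand-in for the paper's doubly-robust construction, and the site summaries below are made up.

```python
def federated_estimate(points, variances):
    """Combine summary-level (estimate, variance) pairs from K sites by
    inverse-variance weighting; returns the pooled estimate and its variance."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    est = sum(w * p for w, p in zip(weights, points)) / total
    return est, 1 / total

# Hypothetical site-level treatment-effect summaries (no individual-level data shared)
site_effects = [1.2, 0.8, 1.0]
site_vars = [0.04, 0.08, 0.02]
pooled, pooled_var = federated_estimate(site_effects, site_vars)
```

The pooled variance is smaller than any single site's, which is the statistical-power gain from combining sources; the paper's point is that with the right summaries this pooled estimator can match the one computed from the combined individual-level data.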

【37】 Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning 标题:基于不变性的等变学习潜在空间嵌入多聚类

作者:Chandrajit Bajaj,Avik Roy,Haoran Zhang 机构: We areinterested in fleshing out a canonical reconstruction that isinvariant under such group transformations while indepen- 1Department of Computer Science, The University of Texas atAustin, TX 787 1 2 2Department of Physics 备注:The codebase for MCEVAE is available at this https URL 链接:https://arxiv.org/abs/2107.11717 摘要:变分自动编码器(VAE)在恢复计算机视觉任务中的模型潜空间方面具有显著的效果。然而,由于种种原因,目前训练的vae似乎在潜伏期空间的学习不变性和等变聚类方面存在不足。我们的工作集中在提供解决这个问题的方法,并提出了一种方法来解开等变特征映射在一个李群流形通过执行深入,组不变的学习。同时实现了潜在空间表示的语义变量和等变变量的分离,我们通过使用一个混合模型pdf-like高斯混合变量进行不变聚类嵌入,形成了一个改进的证据下界(ELBO),该模型允许更好的无监督变分聚类。我们的实验表明,与目前最好的深度学习模型相比,该模型有效地学习分离不变量和等变量表示,显著提高了学习率,显著地提高了图像识别和规范状态重建的效率。 摘要:Variational Autoencoders (VAEs) have been shown to be remarkably effective in recovering model latent spaces for several computer vision tasks. However, currently trained VAEs, for a number of reasons, seem to fall short in learning invariant and equivariant clusters in latent space. Our work focuses on providing solutions to this problem and presents an approach to disentangle equivariance feature maps in a Lie group manifold by enforcing deep, group-invariant learning. Simultaneously implementing a novel separation of semantic and equivariant variables of the latent space representation, we formulate a modified Evidence Lower BOund (ELBO) by using a mixture model pdf like Gaussian mixtures for invariant cluster embeddings that allows superior unsupervised variational clustering. Our experiments show that this model effectively learns to disentangle the invariant and equivariant representations with significant improvements in the learning rate and an observably superior image recognition and canonical state reconstruction compared to the currently best deep learning models.

【38】 Efficient inference of interventional distributions

Authors: Arnab Bhattacharyya, Sutanu Gayen, Saravanan Kandasamy, Vedant Raval, N. V. Vinodchandran. Affiliations: National University of Singapore; Cornell University; Indian Institute of Technology Delhi; University of Nebraska-Lincoln. Comments: 16 pages, 2 figures. Link: https://arxiv.org/abs/2107.11712 Abstract: We consider the problem of efficiently inferring interventional distributions in a causal Bayesian network from a finite number of observations. Let $\mathcal{P}$ be a causal model on a set $\mathbf{V}$ of observable variables on a given causal graph $G$. For sets $\mathbf{X},\mathbf{Y}\subseteq \mathbf{V}$, and setting ${\bf x}$ to $\mathbf{X}$, let $P_{\bf x}(\mathbf{Y})$ denote the interventional distribution on $\mathbf{Y}$ with respect to an intervention ${\bf x}$ to variables $\mathbf{X}$. Shpitser and Pearl (AAAI 2006), building on the work of Tian and Pearl (AAAI 2001), gave an exact characterization of the class of causal graphs for which the interventional distribution $P_{\bf x}({\mathbf{Y}})$ can be uniquely determined. We give the first efficient version of the Shpitser-Pearl algorithm. In particular, under natural assumptions, we give a polynomial-time algorithm that, on input a causal graph $G$ on observable variables $\mathbf{V}$ and a setting ${\bf x}$ of a set $\mathbf{X} \subseteq \mathbf{V}$ of bounded size, outputs succinct descriptions of both an evaluator and a generator for a distribution $\hat{P}$ that is $\varepsilon$-close (in total variation distance) to $P_{\bf x}({\mathbf{Y}})$, where $\mathbf{Y}=\mathbf{V}\setminus \mathbf{X}$, if $P_{\bf x}(\mathbf{Y})$ is identifiable. We also show that when $\mathbf{Y}$ is an arbitrary set, there is no efficient algorithm that outputs an evaluator of a distribution that is $\varepsilon$-close to $P_{\bf x}({\mathbf{Y}})$ unless all problems that have statistical zero-knowledge proofs, including the Graph Isomorphism problem, have efficient randomized algorithms.
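For the simplest identifiable setting — all confounders $Z$ observed, so the backdoor adjustment $P_{x}(Y) = \sum_z P(Y \mid X{=}x, Z{=}z)\,P(Z{=}z)$ applies — the interventional distribution can be estimated from finite samples by plugging in empirical frequencies. The sketch below illustrates only that special case, not the general Shpitser-Pearl identification algorithm; names and data are hypothetical:

```python
from collections import Counter

def backdoor_adjust(samples, x_val, y_vals, z_vals):
    """Estimate P_x(Y) = sum_z P(Y | X=x, Z=z) P(Z=z) by plugging
    empirical frequencies from (x, y, z) sample tuples into the
    backdoor adjustment formula."""
    n = len(samples)
    pz = Counter(z for _, _, z in samples)
    dist = {}
    for y in y_vals:
        total = 0.0
        for z in z_vals:
            n_xz = sum(1 for xs, _, zs in samples if xs == x_val and zs == z)
            n_xyz = sum(1 for xs, ys, zs in samples
                        if xs == x_val and ys == y and zs == z)
            if n_xz > 0:
                total += (n_xyz / n_xz) * (pz[z] / n)
        dist[y] = total
    return dist

# Toy data: tuples (x, y, z) with confounder z observed.
samples = [(0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1), (0, 0, 1)]
dist = backdoor_adjust(samples, x_val=1, y_vals=[0, 1], z_vals=[0, 1])
```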

【39】 Tail of Distribution GAN (TailGAN): Generative-Adversarial-Network-Based Boundary Formation

Authors: Nikolaos Dionelis. Affiliations: The University of Edinburgh, Edinburgh, UK. Link: https://arxiv.org/abs/2107.11658 Abstract: Generative Adversarial Networks (GANs) are a powerful methodology and can be used for unsupervised anomaly detection, where current techniques have limitations such as the accurate detection of anomalies near the tail of a distribution. GANs generally do not guarantee the existence of a probability density and are susceptible to mode collapse, while few GANs use likelihood to reduce mode collapse. In this paper, we create a GAN-based tail formation model for anomaly detection, the Tail of distribution GAN (TailGAN), to generate samples on the tail of the data distribution and detect anomalies near the support boundary. Using TailGAN, we leverage GANs for anomaly detection and use maximum entropy regularization. Using GANs that learn the probability of the underlying distribution has advantages in improving the anomaly detection methodology, as it allows us to devise a generator for boundary samples and use this model to characterize anomalies. TailGAN addresses supports with disjoint components and achieves competitive performance on images. We evaluate TailGAN for identifying Out-of-Distribution (OoD) data; its performance on MNIST, CIFAR-10, Baggage X-Ray, and OoD data shows competitiveness compared to methods from the literature.

【40】 Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them

Authors: Florian Tramèr. Affiliations: Stanford University. Comments: ICML 2021 Workshop on the Prospects and Perils of Adversarial Machine Learning. Link: https://arxiv.org/abs/2107.11630 Abstract: Making classifiers robust to adversarial examples is hard. Thus, many defenses tackle the seemingly easier task of detecting perturbed inputs. We show a barrier towards this goal. We prove a general hardness reduction between detection and classification of adversarial examples: given a robust detector for attacks at distance $\epsilon$ (in some metric), we can build a similarly robust (but inefficient) classifier for attacks at distance $\epsilon/2$. Our reduction is computationally inefficient, and thus cannot be used to build practical classifiers. Instead, it is a useful sanity check to test whether empirical detection results imply something much stronger than the authors presumably anticipated. To illustrate, we revisit 13 detector defenses. For 11/13 cases, we show that the claimed detection results would imply an inefficient classifier with robustness far beyond the state-of-the-art.

【41】 A Model-Agnostic Algorithm for Bayes Error Determination in Binary Classification

Authors: Umberto Michelucci, Michela Sperti, Dario Piga, Francesca Venturini, Marco A. Deriu. Affiliations: TOELT llc, Birchlenstr., Dübendorf, Switzerland; PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin, Italy; Institute of Applied Mathematics and Physics, Zurich University of Applied Sciences. Comments: 21 pages. Link: https://arxiv.org/abs/2107.11609 Abstract: This paper presents the intrinsic limit determination algorithm (ILD Algorithm), a novel technique to determine the best possible performance, measured in terms of the AUC (area under the ROC curve) and accuracy, that can be obtained from a specific dataset in a binary classification problem with categorical features, regardless of the model used. This limit, namely the Bayes error, is completely independent of any model used and describes an intrinsic property of the dataset. The ILD algorithm thus provides important information regarding the prediction limits of any binary classification algorithm when applied to the considered dataset. In this paper the algorithm is described in detail, its entire mathematical framework is presented, and the pseudocode is given to facilitate its implementation. Finally, an example with a real dataset is given.
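On a finite dataset with categorical features, the quantity the ILD algorithm targets has a simple empirical counterpart: within each observed feature pattern $x$, no classifier can beat the majority label, so the irreducible error is $\sum_x P(x)\,(1 - \max_y P(y \mid x))$. Below is a naive plug-in sketch of that quantity, not the ILD algorithm's own (more refined) estimator:

```python
from collections import Counter

def bayes_error_categorical(features, labels):
    """Empirical Bayes error for categorical features: for each observed
    feature pattern, the best prediction is the majority label, so the
    residual error is sum_x P(x) * (1 - max_y P(y | x))."""
    n = len(labels)
    counts = Counter(zip(map(tuple, features), labels))   # (pattern, label) counts
    per_x = Counter(tuple(f) for f in features)           # pattern counts
    err = 0.0
    for x, n_x in per_x.items():
        best = max(counts[(x, y)] for y in set(labels))
        err += (n_x - best) / n
    return err

# Pattern (0,) appears 3 times with labels {0, 0, 1}: one sample is irreducible.
feats = [(0,), (0,), (0,), (1,)]
labs = [0, 0, 1, 1]
err = bayes_error_categorical(feats, labs)
```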

【42】 Inferring Economic Condition Uncertainty from Electricity Big Data

Authors: Zhengyu Shi, Libo Wu, Haoqi Qian, Yingjie Tian. Affiliations: School of Data Science, Fudan University, Yangpu District, Shanghai, China; School of Economics, Fudan University, Yangpu District, Shanghai, China; Institute for Big Data, Fudan University, Yangpu District, Shanghai, China. Link: https://arxiv.org/abs/2107.11593 Abstract: Inferring the uncertainties in economic conditions is of significant importance for both decision makers and market players. In this paper, we propose a novel method based on a Hidden Markov Model (HMM) to construct the Economic Condition Uncertainty (ECU) index, which can be used to infer economic condition uncertainties. The ECU index is a dimensionless index that ranges between zero and one, which makes it comparable across sectors, regions, and periods. We use the daily electricity consumption data of nearly 20 thousand firms in Shanghai from 2018 to 2020 to construct the ECU indexes. Results show that all ECU indexes, whether at the sectoral or the regional level, successfully captured the negative impacts of COVID-19 on Shanghai's economic conditions. Besides, the ECU indexes also revealed heterogeneities across districts and across sectors, reflecting the fact that changes in the uncertainty of economic conditions are mainly related to regional economic structures and the targeted regulation policies faced by sectors. The ECU index can also be easily extended to measure uncertainties of economic conditions in other fields, where it has great potential.
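The abstract does not spell out the HMM construction, but the filtering step such an index relies on can be sketched with a two-state Gaussian HMM whose forward pass yields a [0,1]-valued state probability per day. All parameters below (transition matrix, emission means, states) are illustrative assumptions, not the paper's calibration:

```python
import numpy as np

def hmm_filter(obs, A, means, sigma, pi):
    """Forward (filtering) pass of a Gaussian HMM: returns the matrix of
    P(state_t = k | obs_1..t), each row normalized to sum to one."""
    alpha = np.zeros((len(obs), len(pi)))
    lik = lambda o: (np.exp(-0.5 * ((o - means) / sigma) ** 2)
                     / (sigma * np.sqrt(2 * np.pi)))
    alpha[0] = pi * lik(obs[0])
    alpha[0] /= alpha[0].sum()
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * lik(obs[t])   # predict, then update
        alpha[t] /= alpha[t].sum()
    return alpha

# State 0: "normal" consumption; state 1: "depressed" consumption.
A = np.array([[0.95, 0.05], [0.10, 0.90]])
means, sigma, pi = np.array([1.0, -1.0]), 0.5, np.array([0.5, 0.5])
obs = np.array([1.1, 0.9, -0.8, -1.2, -1.0])          # standardized series
ecu = hmm_filter(obs, A, means, sigma, pi)[:, 1]      # a [0,1]-valued index
```

The filtered probability of the depressed state is dimensionless and bounded in [0,1], mirroring the comparability property the abstract highlights.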

【43】 Nonreversible Markov chain Monte Carlo algorithm for efficient generation of Self-Avoiding Walks

Authors: Hanqing Zhao, Marija Vucelja. Affiliations: Department of Physics, University of Virginia, Charlottesville, VA, USA. Link: https://arxiv.org/abs/2107.11542 Abstract: We introduce an efficient nonreversible Markov chain Monte Carlo algorithm to generate self-avoiding walks with a variable endpoint. In two dimensions, the new algorithm slightly outperforms the two-move nonreversible Berretti-Sokal algorithm introduced by H. Hu, X. Chen, and Y. Deng, while for three-dimensional walks, it is 3-5 times faster. The new algorithm introduces nonreversible Markov chains that obey global balance and allows for three types of elementary moves on the existing self-avoiding walk: shorten, extend, or alter conformation without changing the walk's length.
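The three elementary moves can be sketched on the 2D square lattice as follows. This shows only the move set with self-avoidance checks — not the nonreversible lifting and skewed proposal probabilities that give the algorithm its speedup:

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # unit steps on the square lattice

def try_move(walk):
    """One elementary update on a self-avoiding walk (a list of lattice
    sites starting at the origin): shorten it by one step, extend it by
    one step, or alter its last step without changing the length.
    Proposals that would violate self-avoidance are rejected."""
    move = random.choice(["shorten", "extend", "alter"])
    if move == "shorten" and len(walk) > 1:
        return walk[:-1]
    if move == "extend":
        dx, dy = random.choice(STEPS)
        new = (walk[-1][0] + dx, walk[-1][1] + dy)
        if new not in set(walk):
            return walk + [new]
    if move == "alter" and len(walk) > 1:
        dx, dy = random.choice(STEPS)
        new = (walk[-2][0] + dx, walk[-2][1] + dy)
        if new not in set(walk[:-1]):
            return walk[:-1] + [new]
    return walk  # rejected proposal: keep the current walk

random.seed(0)
walk = [(0, 0)]
for _ in range(500):
    walk = try_move(walk)
```

Every accepted update preserves both the self-avoidance constraint and nearest-neighbor connectivity, which the invariants below verify.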

【44】 Applying Inter-rater Reliability and Agreement in Grounded Theory Studies in Software Engineering

Authors: Jessica Díaz, Jorge Pérez, Carolina Gallardo, Ángel González-Prieto. Affiliations: Universidad Politécnica de Madrid, Departamento de Sistemas Informáticos, ETSI Sistemas Informáticos, C. Alan Turing sn (Carretera de Valencia Km ,), Madrid, Spain; Universidad Complutense de Madrid, Departamento de Álgebra, Geometría y Topología. Comments: 20 pages, 5 figures, 8 tables. Link: https://arxiv.org/abs/2107.11449 Abstract: In recent years, qualitative research on empirical software engineering that applies Grounded Theory has been increasing. Grounded Theory (GT) is a technique for developing theory inductively and iteratively from qualitative data, with theoretical sampling, coding, constant comparison, memoing, and saturation as its main characteristics. Large or controversial GT studies may involve multiple researchers in collaborative coding, which requires a kind of rigor and consensus that an individual coder does not. Although many qualitative researchers reject quantitative measures in favor of other qualitative criteria, many others are committed to measuring consensus through Inter-Rater Reliability (IRR) and/or Inter-Rater Agreement (IRA) techniques to develop a shared understanding of the phenomenon being studied. However, there are no specific guidelines about how and when to apply IRR/IRA during the iterative process of GT, so researchers have been using ad hoc methods for years. This paper presents a process for systematically applying IRR/IRA in GT studies that respects the iterative nature of this qualitative research method, supported by a previous systematic literature review on applying IRR/IRA in GT studies in software engineering. This process allows researchers to incrementally generate a theory while ensuring consensus on the constructs that support it, thus improving the rigor of qualitative research. This formalization helps researchers apply IRR/IRA to GT studies when various raters are involved in coding. Measuring consensus among raters promotes communicability, transparency, reflexivity, replicability, and trustworthiness of the research.
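For pairwise coding comparisons, a standard IRR measure is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch (the GT codes shown are hypothetical):

```python
from collections import Counter

def cohens_kappa(r1, r2):
    """Cohen's kappa between two raters' code assignments:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(r1)
    po = sum(a == b for a, b in zip(r1, r2)) / n          # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    pe = sum(c1[k] * c2[k] for k in set(r1) | set(r2)) / (n * n)  # by chance
    return (po - pe) / (1 - pe)

# Two raters code six text segments with GT coding stages.
r1 = ["open", "axial", "open", "open", "axial", "open"]
r2 = ["open", "axial", "axial", "open", "axial", "open"]
kappa = cohens_kappa(r1, r2)
```

Here the raters agree on 5/6 segments, but after chance correction kappa is 2/3, which is why kappa (or an IRA statistic) is preferred over raw percent agreement.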

【45】 A general sample complexity analysis of vanilla policy gradient

Authors: Rui Yuan, Robert M. Gower, Alessandro Lazaric. Comments: ICML 2021 Workshop on "Reinforcement learning theory". Link: https://arxiv.org/abs/2107.11433 Abstract: The policy gradient (PG) is one of the most popular methods for solving reinforcement learning (RL) problems. However, a solid theoretical understanding of even the "vanilla" PG has remained elusive for a long time. In this paper, we apply recent tools developed for the analysis of SGD in non-convex optimization to obtain convergence guarantees for both REINFORCE and GPOMDP under a smoothness assumption on the objective function and weak conditions on the second moment of the norm of the estimated gradient. When instantiated under common assumptions on the policy space, our general result immediately recovers existing $\widetilde{\mathcal{O}}(\epsilon^{-4})$ sample complexity guarantees, but for wider ranges of parameters (e.g., step size and batch size $m$) with respect to previous literature. Notably, our result includes the single-trajectory case (i.e., $m=1$) and provides a more accurate analysis of the dependency on problem-specific parameters by fixing previous results available in the literature. We believe that the integration of state-of-the-art tools from non-convex optimization may lead to identifying a much broader range of problems where PG methods enjoy strong theoretical guarantees.
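A minimal REINFORCE instance makes the roles of the two parameters in the bounds — step size and batch size $m$ — concrete. The toy softmax bandit below is an illustration under assumed reward means, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(theta):
    e = np.exp(theta - theta.max())
    return e / e.sum()

# REINFORCE on a 3-armed bandit: the gradient estimate averages
# grad log pi(a) * R(a) over a batch of m sampled actions.
true_means = np.array([1.0, 0.2, 0.5])   # illustrative reward means
theta = np.zeros(3)
step, m = 0.2, 8                          # step size and batch size

for _ in range(300):
    pi = softmax(theta)
    grad = np.zeros(3)
    for _ in range(m):
        a = rng.choice(3, p=pi)
        r = true_means[a] + 0.1 * rng.standard_normal()
        g = -pi                           # -pi is a fresh array; safe to modify
        g[a] += 1.0                       # gradient of log softmax at action a
        grad += g * r
    theta += step * grad / m

pi_final = softmax(theta)                 # should concentrate on the best arm
```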

This article is part of the Tencent Cloud self-media syndication program and was shared from a WeChat official account.
Originally published: 2021-07-27. For copyright concerns, contact cloudcommunity@tencent.com for removal.

This article is shared from the WeChat official account arXiv每日学术速递.


