
arXiv Statistics Daily Digest [11.15]

Author: 公众号-arXiv每日学术速递 (WeChat public account)
Published: 2021-11-17 11:08:14
Column: arXiv每日学术速递

stat (Statistics): 30 papers in total

【1】 Sampling from multimodal distributions using tempered Hamiltonian transitions
Link: https://arxiv.org/abs/2111.06871

Authors: Joonha Park
Affiliation: University of Kansas
Note: 32 pages, 10 figures
Abstract: Hamiltonian Monte Carlo (HMC) methods are widely used to draw samples from unnormalized target densities due to their high efficiency and favorable scalability with respect to increasing space dimensions. However, HMC struggles when the target distribution is multimodal, because the maximum increase in the potential energy function (i.e., the negative log density function) along the simulated path is bounded by the initial kinetic energy, which follows one half of the $\chi_d^2$ distribution, where $d$ is the space dimension. In this paper, we develop a Hamiltonian Monte Carlo method where the constructed paths can travel across high potential energy barriers. This method does not require the modes of the target distribution to be known in advance. Our approach enables frequent jumps between the isolated modes of the target density by continuously varying the mass of the simulated particle while the Hamiltonian path is constructed. Thus, this method can be considered as a combination of HMC and the tempered transitions method. Compared to other tempering methods, our method has a distinctive advantage in the Gibbs sampler settings, where the target distribution changes at each step. We develop a practical tuning strategy for our method and demonstrate that it can construct globally mixing Markov chains targeting high-dimensional, multimodal distributions, using mixtures of normals and a sensor network localization problem.
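
The mass-varying construction is related in spirit to Neal's classic "tempered trajectory" variant of HMC, in which momentum is inflated during the first half of the leapfrog path and deflated during the second half. Below is a minimal NumPy sketch of that variant on a toy two-mode target; the target density, tuning constants, and function names are our illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def log_prob(x):
    # Toy target: equal mixture of N(-4, 1) and N(4, 1) in one dimension.
    return np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

def grad_log_prob(x, eps=1e-5):
    # Central finite difference keeps the sketch self-contained.
    return (log_prob(x + eps) - log_prob(x - eps)) / (2.0 * eps)

def tempered_hmc_step(x, rng, step=0.15, n_steps=60, alpha=1.12):
    """One HMC update with a tempered trajectory: momentum is multiplied by
    sqrt(alpha) before each of the first n_steps/2 leapfrog steps and divided
    by sqrt(alpha) after each of the last n_steps/2, letting the particle
    climb potential-energy barriers. The scalings' Jacobians cancel, so the
    usual Metropolis test still leaves the target invariant."""
    p = rng.normal()
    x_new, p_new = x, p
    for i in range(n_steps):
        if i < n_steps // 2:
            p_new *= np.sqrt(alpha)                  # heat up
        p_new += 0.5 * step * grad_log_prob(x_new)   # leapfrog: half momentum step
        x_new += step * p_new                        # full position step
        p_new += 0.5 * step * grad_log_prob(x_new)   # half momentum step
        if i >= n_steps // 2:
            p_new /= np.sqrt(alpha)                  # cool down
    log_accept = (log_prob(x_new) - 0.5 * p_new ** 2) - (log_prob(x) - 0.5 * p ** 2)
    return x_new if np.log(rng.uniform()) < log_accept else x

rng = np.random.default_rng(0)
x, samples = 4.0, []
for _ in range(4000):
    x = tempered_hmc_step(x, rng)
    samples.append(x)
# With these settings the chain should visit both modes; plain HMC rarely does.
print("fraction of samples in the left mode:", np.mean(np.array(samples) < 0.0))
```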

【2】 Higher-Order Coverage Errors of Batching Methods via Edgeworth Expansions on t-Statistics
Link: https://arxiv.org/abs/2111.06859

Authors: Shengyi He, Henry Lam
Affiliation: Department of Industrial Engineering and Operations Research, Columbia University
Abstract: While batching methods have been widely used in simulation and statistics, their higher-order coverage behaviors, and whether one variant is better than the others in this regard, remain open questions. We develop techniques to obtain higher-order coverage errors for batching methods by building Edgeworth-type expansions on $t$-statistics. The coefficients in these expansions are intricate analytically, but we provide algorithms to estimate the coefficients of the $n^{-1}$ error term via Monte Carlo simulation. We provide insights on the effect of the number of batches on the coverage error, where we demonstrate generally non-monotonic relations. We also compare different batching methods both theoretically and numerically, and argue that none of the methods is uniformly better than the others in terms of coverage. However, when the number of batches is large, sectioned jackknife has the best coverage among all.
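
As a concrete reference point, here is a toy Monte Carlo estimate of the coverage of the classic batch-means $t$-interval, one of the batching variants whose coverage error the paper studies. The skewed population, sample size, and batch counts are our illustrative choices.

```python
import numpy as np
from scipy import stats

def batch_means_ci(x, n_batches, level=0.95):
    """Batch-means interval for the mean: split the data into n_batches
    roughly equal batches and apply a Student-t interval to the batch means."""
    means = np.array([b.mean() for b in np.array_split(x, n_batches)])
    half = (stats.t.ppf((1 + level) / 2, df=n_batches - 1)
            * means.std(ddof=1) / np.sqrt(n_batches))
    return means.mean() - half, means.mean() + half

# Monte Carlo coverage under a skewed population (Exponential(1), true mean 1):
rng = np.random.default_rng(0)
for b in (2, 5, 10, 50):
    hits = sum(lo <= 1.0 <= hi
               for lo, hi in (batch_means_ci(rng.exponential(size=100), b)
                              for _ in range(5000)))
    print(f"batches={b:3d}  empirical coverage={hits / 5000:.3f}")
```

The empirical coverage deviates from the nominal 95% level in a way that depends non-trivially on the number of batches, which is exactly the effect the paper's $n^{-1}$ Edgeworth coefficients quantify.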

【3】 Wasserstein convergence in Bayesian deconvolution models
Link: https://arxiv.org/abs/2111.06846

Authors: Judith Rousseau, Catia Scricciolo
Affiliation: Department of Statistics, University of Oxford; Department of Economics, Università di Verona
Abstract: We study the renowned deconvolution problem of recovering a distribution function from independent replicates (signal) additively contaminated with random errors (noise), whose distribution is known. We investigate whether a Bayesian nonparametric approach for modelling the latent distribution of the signal can yield inferences with asymptotic frequentist validity under the $L^1$-Wasserstein metric. When the error density is ordinary smooth, we develop two inversion inequalities relating either the $L^1$ or the $L^1$-Wasserstein distance between two mixture densities (of the observations) to the $L^1$-Wasserstein distance between the corresponding distributions of the signal. These smoothing inequalities improve on those in the literature. We apply this general result to a Bayesian approach based on a Dirichlet process mixture of normal distributions as a prior on the mixing distribution (or distribution of the signal), with a Laplace or Linnik noise. In particular, we construct an adaptive approximation of the density of the observations by the convolution of a Laplace (or Linnik) with a well-chosen mixture of normal densities and show that the posterior concentrates at the minimax rate up to a logarithmic factor. The same prior law is shown to also adapt to the Sobolev regularity level of the mixing density, thus leading to a new Bayesian estimation method, relative to the Wasserstein distance, for distributions with smooth densities.

【4】 Convergence Rates for the MAP of an Exponential Family and Stochastic Mirror Descent -- an Open Problem
Link: https://arxiv.org/abs/2111.06826

Authors: Rémi Le Priol, Frederik Kunstner, Damien Scieur, Simon Lacoste-Julien
Affiliation: Mila, Université de Montréal; University of British Columbia; Samsung SAIT AI Lab Montreal; Canada CIFAR AI Chair
Note: 9 pages and 3 figures + appendix
Abstract: We consider the problem of upper bounding the expected log-likelihood sub-optimality of the maximum likelihood estimate (MLE), or a conjugate maximum a posteriori (MAP) for an exponential family, in a non-asymptotic way. Surprisingly, we found no general solution to this problem in the literature. In particular, current theories do not hold for a Gaussian or in the interesting few-samples regime. After exhibiting various facets of the problem, we show we can interpret the MAP as running stochastic mirror descent (SMD) on the log-likelihood. However, modern convergence results do not apply for standard examples of the exponential family, highlighting holes in the convergence literature. We believe solving this very fundamental problem may bring progress to both the statistics and optimization communities.

【5】 Histograms lie about distribution shapes and Pearson's coefficient of variation lies about variability
Link: https://arxiv.org/abs/2111.06822

Authors: Paulo S. P. Silveira, Jose O. Siqueira
Note: 24 pages, 7 figures, R scripts included in the text and appendices. This manuscript has been under consideration at TQMP (The Quantitative Methods for Psychology, this https URL) since 28 Oct 2021
Abstract: Background and Objective: Histograms and Pearson's coefficient of variation are among the most popular summary statistics. Researchers use them to judge the shape of a quantitative data distribution by visual inspection of histograms, and the coefficient of variation is taken as an estimator of the relative variability of these data. We explore the properties of histograms and the coefficient of variation through examples in R, thus offering better alternatives: density plots and Eisenhauer's relative dispersion coefficient. Methods: Hypothetical examples developed in R are used to create histograms and density plots and to compute the coefficient of variation and the relative dispersion coefficient. Results: These hypothetical examples clearly show that the two traditional approaches are flawed. Histograms are incapable of reflecting the distribution of probabilities, and the coefficient of variation has issues with negative and positive values in the same dataset, is sensitive to outliers, and is severely affected by the mean value of a distribution. Potential replacements are explained and applied for contrast. Conclusions: With the use of modern computers and the R language it is easy to replace histograms with density plots, which are able to approximate the theoretical probability distribution. In addition, Eisenhauer's relative dispersion coefficient is suggested as a suitable estimator of relative variability, including corrections for lower and upper bounds.
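
The paper's examples are in R; the short Python sketch below reproduces two of its headline complaints about the coefficient of variation (location dependence and outlier sensitivity). Eisenhauer's corrected coefficient is deliberately not reproduced here, since its exact form is defined in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=10.0, scale=2.0, size=1000)

def cv(v):
    # Pearson's coefficient of variation: sample SD over sample mean.
    return v.std(ddof=1) / v.mean()

# 1) The CV depends on the (often arbitrary) location of the data:
for shift in (0.0, 90.0, -9.8):     # same spread, different means
    print(f"shift={shift:6.1f}  CV={cv(x + shift):9.3f}")

# 2) The CV is sensitive to a single outlier:
x_out = np.append(x, 200.0)
print(f"CV without outlier: {cv(x):.3f}   with one outlier: {cv(x_out):.3f}")
```

Shifting the data toward a mean near zero makes the CV explode even though the variability is unchanged, which is the paper's central objection.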

【6】 Dynamic treatment effects: high-dimensional inference under model misspecification
Link: https://arxiv.org/abs/2111.06818

Authors: Yuqian Zhang, Jelena Bradic, Weijie Ji
Affiliation: Department of Mathematics and Halicioglu Data Science Institute, University of California San Diego
Abstract: This paper considers the inference for heterogeneous treatment effects in dynamic settings where covariates and treatments are longitudinal. We focus on high-dimensional cases where the covariate vector's dimension, $d$, is potentially much larger than the sample size, $N$. Marginal structural mean models are considered. We propose a "sequential model doubly robust" estimator constructed from "moment targeted" nuisance estimators. Such nuisance estimators are carefully designed through non-standard loss functions, reducing the bias resulting from potential model misspecifications. We achieve $\sqrt N$-inference even when model misspecification occurs. We only require one nuisance model to be correctly specified at each time point. Such model correctness conditions are weaker than those in all the existing work, even including the literature on low dimensions.

【7】 High-Dimensional Functional Mixed-effect Model for Bilevel Repeated Measurements
Link: https://arxiv.org/abs/2111.06796

Authors: Xiaotian Dai, Guifang Fu
Affiliation: Department of Mathematical Sciences, SUNY Binghamton University, Vestal, NY
Abstract: The bilevel functional data under consideration has two sources of repeated measurements. One is to densely and repeatedly measure a variable from each subject at a series of regular time/spatial points, which is named functional data. The other is to repeatedly collect one set of functional data at each of multiple visits. Compared to the well-established single-level functional data analysis approaches, those related to high-dimensional bilevel functional data are limited. In this article, we propose a high-dimensional functional mixed-effect model (HDFMM) to analyze the association between the bilevel functional response and a large number of scalar predictors. We utilize B-splines to smooth and estimate the infinite-dimensional functional coefficient, a sandwich smoother to estimate the covariance function, and integrate the estimation of covariance-related parameters together with all regression parameters into one framework through a fast updating MCMC procedure. We demonstrate that the performance of the HDFMM method is promising under various simulation studies and a real data analysis. As an extension of the well-established linear mixed model, the HDFMM model extends the response from repeatedly measured scalars to repeatedly measured functional data/curves, while maintaining the ability to account for the relatedness among samples and to control for confounding factors.

【8】 Epistasis Detection Via the Joint Cumulant
Link: https://arxiv.org/abs/2111.06795

Authors: Randall Reese, Guifang Fu, Geran Zhao, Xiaotian Dai, Xiaotian Li, Kenneth Chiu
Abstract: Selecting influential nonlinear interactive features from ultrahigh-dimensional data has been an important task in various fields. However, statistical accuracy and computational feasibility are the two biggest concerns when more than half a million features are collected in practice. Many extant feature screening approaches either focus only on main effects or rely heavily on heredity structure, hence rendering them ineffective in a scenario presenting strong interactive but weak main effects. In this article, we propose a new interaction screening procedure based on the joint cumulant (named JCI-SIS). We show that the proposed procedure has strong sure screening consistency and is theoretically sound to support its performance. Simulation studies designed for both continuous and categorical predictors are performed to demonstrate the versatility and practicability of our JCI-SIS method. We further illustrate the power of JCI-SIS by applying it to screen 27,554,602,881 interaction pairs involving 234,754 single nucleotide polymorphisms (SNPs) for each of the 4,000 subjects collected from polycystic ovary syndrome (PCOS) patients and healthy controls.
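
For intuition, the trivariate joint cumulant at the heart of such a screen can be computed directly from its moment expansion. The toy data and scoring below are our illustration of why this statistic picks up strong-interaction/weak-main-effect pairs, not the paper's exact standardized statistic.

```python
import numpy as np

def joint_cumulant3(x, y, z):
    """Third-order joint cumulant kappa(X, Y, Z):
    E[XYZ] - E[X]E[YZ] - E[Y]E[XZ] - E[Z]E[XY] + 2 E[X]E[Y]E[Z].
    For centered variables it reduces to E[XYZ]."""
    ex, ey, ez = x.mean(), y.mean(), z.mean()
    return (np.mean(x * y * z) - ex * np.mean(y * z) - ey * np.mean(x * z)
            - ez * np.mean(x * y) + 2 * ex * ey * ez)

rng = np.random.default_rng(0)
n = 20000
x1, x2, x3 = rng.normal(size=(3, n))
y = x1 * x2 + 0.5 * rng.normal(size=n)   # strong interaction, weak main effects
print("interacting pair (x1, x2):", round(joint_cumulant3(x1, x2, y), 3))  # ~1
print("null pair        (x1, x3):", round(joint_cumulant3(x1, x3, y), 3))  # ~0
```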

【9】 Simplifying approach to Node Classification in Graph Neural Networks
Link: https://arxiv.org/abs/2111.06748

Authors: Sunil Kumar Maurya, Xin Liu, Tsuyoshi Murata
Affiliation: Department of Computer Science, Tokyo Institute of Technology; Artificial Intelligence Research Center, AIST, Tokyo, Japan
Note: arXiv admin note: substantial text overlap with arXiv:2105.07634
Abstract: Graph Neural Networks have become one of the indispensable tools for learning from graph-structured data, and their usefulness has been shown in a wide variety of tasks. In recent years, there have been tremendous improvements in architecture design, resulting in better performance on various prediction tasks. In general, these neural architectures combine node feature aggregation and feature transformation using a learnable weight matrix in the same layer. This makes it challenging to analyze the importance of node features aggregated from various hops and the expressiveness of the neural network layers. As different graph datasets show varying levels of homophily and heterophily in features and class label distribution, it becomes essential to understand which features are important for the prediction tasks without any prior information. In this work, we decouple the node feature aggregation step and the depth of the graph neural network, and empirically analyze how different aggregated features play a role in prediction performance. We show that not all features generated via aggregation steps are useful, and that using these less informative features can often be detrimental to the performance of the GNN model. Through our experiments, we show that learning certain subsets of these features can lead to better performance on a wide variety of datasets. We propose to use softmax as a regularizer and "soft-selector" of features aggregated from neighbors at different hop distances, together with L2-normalization over GNN layers. Combining these techniques, we present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN), and show empirically that the proposed model achieves comparable or even higher accuracy than state-of-the-art GNN models on nine benchmark datasets for the node classification task, with remarkable improvements of up to 51.1%.
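
A minimal NumPy sketch of the two ingredients named above: softmax soft-selection over hop-aggregated features, plus L2 normalization of each hop block. The toy graph, row-normalized aggregation, and variable names are our assumptions; in the real FSGNN the hop logits are learned by gradient descent rather than sampled.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def hop_features(adj, feats, n_hops):
    """Precompute features aggregated over 0..n_hops neighborhoods."""
    deg = adj.sum(1, keepdims=True).clip(min=1)
    a_norm = adj / deg                       # simple row-normalized adjacency
    out, h = [feats], feats
    for _ in range(n_hops):
        h = a_norm @ h
        out.append(h)
    return out

rng = np.random.default_rng(0)
adj = (rng.random((5, 5)) < 0.4).astype(float)
np.fill_diagonal(adj, 0.0)
feats = rng.normal(size=(5, 3))
hops = hop_features(adj, feats, n_hops=2)

logits = rng.normal(size=len(hops))          # learnable in the real model
w = softmax(logits)                          # "soft-selector" over hops
blocks = [wk * h / np.linalg.norm(h, axis=1, keepdims=True).clip(min=1e-12)
          for wk, h in zip(w, hops)]         # L2-normalize, then weight
z = np.concatenate(blocks, axis=1)           # input to a shallow classifier
print(z.shape, "hop weights:", np.round(w, 3))
```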

【10】 Using Bayesian Network Analysis to Reveal Complex Natures of Relationships
Link: https://arxiv.org/abs/2111.06640

Authors: Panchika Lortaraprasert, Pongpak Manoret, Chanati Jantrachotechatchawan, Kobchai Duangrattanalert
Affiliation: Triam Udom Suksa School; Mahidol University; Chulalongkorn University Technology Center (UTC), Chulalongkorn University, Thailand
Note: 37 pages, 13 figures
Abstract: Relationships are vital for mankind in many aspects. According to Maslow's hierarchy of needs, it is suggested that while a healthy relationship is an essential part of a human life that fundamentally determines our goals and purposes, an unsuccessful relationship can lead to suicide and other major psychological problems. However, a complete understanding of this topic still remains a challenge, and the divorce rate is rising more than ever before, to almost 50 percent. The objective of this research is to explore the association between each group of behaviors by performing Bayesian network analysis on the large publicly available Experiences in Close Relationships Scale (ECR) attachment-style survey data from the openpsychometrics database. The resulting directed acyclic graph has 2 root nodes (Q02 from avoidant and Q05 from anxious attachment) and 5 end nodes (Q16, Q34, and Q36 from anxious attachment). The network can be divided into 5 clusters, 2 avoidance and 3 anxiety clusters. Furthermore, our list of items in the clusters is consistent with the findings of previous factor analysis studies, and our estimated coefficients are significantly correlated with those of one partial correlation network study.

【11】 Employment in Tourism Industries: Are there Subsectors with a Potentially Higher Level of Income?
Link: https://arxiv.org/abs/2111.06633

Authors: Pablo Dorta-González, Sara M. González-Betancor
Affiliation: Institute of Tourism and Sustainable Economic Development (TIDES), University of Las Palmas de Gran Canaria
Abstract: This work analyzes the tourist sector, the employment generated by the tourism industries, and its relationship with tourism receipts. The hypothesis is that there are tourist subsectors with a potentially higher level of income. The article studies the impact of the distribution of the employed population across the different subsectors of the tourism industry, controlling for the most important economic variables, on the level of income per arrival in 24 OECD countries, using panel data for the period 2008 to 2018. As its main result, the model indicates that the labor force that increases receipts per arrival the most is 'travel agencies and other reservation services', followed by the 'sports and recreation industry' labor force, while a large labor force in the 'food and beverage' or 'cultural' industries operates in the opposite direction.

【12】 G-optimal grid designs for kriging models
Link: https://arxiv.org/abs/2111.06632

Authors: Subhadra Dasgupta, Siuli Mukhopadhyay, Jonathan Keith
Affiliation: IITB-Monash Research Academy, India; Department of Mathematics, Indian Institute of Technology Bombay, India; School of Mathematics, Monash University, Australia
Note: 36 pages, 4 sets of figures, 5 algorithms
Abstract: This work is focused on finding G-optimal designs theoretically for kriging models with two-dimensional inputs and separable exponential covariance structures. For design comparison, the notion of evenness of two-dimensional grid designs is developed. The mathematical relationship between the design and the supremum of the mean squared prediction error (SMSPE) function is studied, and then optimal designs are explored for both prospective and retrospective design scenarios. In the case of prospective designs, the new design is developed before the experiment is conducted, and the regularly spaced grid is shown to be the G-optimal design. Retrospective designs are constructed by adding or deleting points from an already existing design. Deterministic algorithms are developed to find the best possible retrospective designs (which minimize the SMSPE). It is found that a more evenly spread design leads to the best possible retrospective design. For all the cases of finding the optimal prospective designs and the best possible retrospective designs, both frequentist and Bayesian frameworks have been considered. The proposed methodology for finding retrospective designs is illustrated on a methane flux monitoring design.
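
The SMSPE criterion itself is straightforward to evaluate numerically. The sketch below scores two candidate 9-point designs under a unit-variance separable exponential kernel with simple kriging; the kernel parameters, grid resolution, and the two designs are our illustrative assumptions, not the paper's constructions.

```python
import numpy as np

def sep_exp_cov(a, b, theta=(1.0, 1.0)):
    """Separable exponential covariance on 2-D inputs:
    K(a, b) = exp(-theta1*|a1-b1|) * exp(-theta2*|a2-b2|)."""
    d1 = np.abs(a[:, 0][:, None] - b[:, 0][None, :])
    d2 = np.abs(a[:, 1][:, None] - b[:, 1][None, :])
    return np.exp(-theta[0] * d1) * np.exp(-theta[1] * d2)

def smspe(design, grid):
    """Supremum over the grid of the simple-kriging mean squared prediction
    error 1 - k' K^{-1} k (unit process variance)."""
    K = sep_exp_cov(design, design) + 1e-10 * np.eye(len(design))
    k = sep_exp_cov(design, grid)
    mspe = 1.0 - np.einsum('ij,ij->j', k, np.linalg.solve(K, k))
    return mspe.max()

g = np.linspace(0.0, 1.0, 41)
grid = np.array([(u, v) for u in g for v in g])      # dense prediction grid
even = np.array([(u, v) for u in np.linspace(0.1, 0.9, 3)
                        for v in np.linspace(0.1, 0.9, 3)])
clumped = even * 0.4                                  # same size, poorly spread
print("SMSPE, evenly spread grid:", round(smspe(even, grid), 3))
print("SMSPE, clumped design    :", round(smspe(clumped, grid), 3))
```

The evenly spread design attains the smaller worst-case prediction error, in line with the paper's G-optimality result for regularly spaced grids.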

【13】 Joint Models for Cause-of-Death Mortality in Multiple Populations
Link: https://arxiv.org/abs/2111.06631

Authors: Nhan Huynh, Mike Ludkovski
Note: 27 pages, 14 figures
Abstract: We investigate jointly modeling age-specific rates of various causes of death in a multinational setting. We apply Multi-Output Gaussian Processes (MOGP), a spatial machine learning method, to smooth and extrapolate multiple cause-of-death mortality rates across several countries and both genders. To maintain flexibility and scalability, we investigate MOGPs with Kronecker-structured kernels and latent factors. In particular, we develop a custom multi-level MOGP that leverages the gridded structure of mortality tables to efficiently capture heterogeneity and dependence across different factor inputs. Results are illustrated with datasets from the Human Cause-of-Death Database (HCD). We discuss a case study involving cancer variations in three European nations, and a US-based study that considers eight top-level causes and includes a comparison to all-cause analysis. Our models provide insights into the commonality of cause-specific mortality trends and demonstrate the opportunities for respective data fusion.

【14】 Multi-task Learning for Compositional Data via Sparse Network Lasso
Link: https://arxiv.org/abs/2111.06617

Authors: Akira Okazaki, Shuichi Kawano
Affiliation: Graduate School of Informatics and Engineering, The University of Electro-Communications, Chofu, Tokyo, Japan
Note: 21 pages, 4 figures
Abstract: A network lasso enables us to construct a model for each sample, which is known as multi-task learning. Existing methods for multi-task learning cannot be applied to compositional data due to their intrinsic properties. In this paper, we propose a multi-task learning method for compositional data using a sparse network lasso. We focus on a symmetric form of the log-contrast model, which is a regression model with compositional covariates. The effectiveness of the proposed method is shown through simulation studies and an application to gut microbiome data.

【15】 Differential privacy and robust statistics in high dimensions
Link: https://arxiv.org/abs/2111.06578

Authors: Xiyang Liu, Weihao Kong, Sewoong Oh
Affiliation: Allen School of Computer Science & Engineering, University of Washington
Abstract: We introduce a universal framework for characterizing the statistical efficiency of a statistical estimation problem with differential privacy guarantees. Our framework, which we call High-dimensional Propose-Test-Release (HPTR), builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism. Gluing all these together is the concept of resilience, which is central to robust statistical estimation. Resilience guides the design of the algorithm, the sensitivity analysis, and the success probability analysis of the test step in Propose-Test-Release. The key insight is that if we design an exponential mechanism that accesses the data only via one-dimensional robust statistics, then the resulting local sensitivity can be dramatically reduced. Using resilience, we can provide tight local sensitivity bounds. These tight bounds readily translate into near-optimal utility guarantees in several cases. We give a general recipe for applying HPTR to a given instance of a statistical estimation problem and demonstrate it on canonical problems of mean estimation, linear regression, covariance estimation, and principal component analysis. We introduce a general utility analysis technique that proves that HPTR nearly achieves the optimal sample complexity under several scenarios studied in the literature.

【16】 The expectation-maximization algorithm for autoregressive models with normal inverse Gaussian innovations
Link: https://arxiv.org/abs/2111.06565

Authors: Monika S. Dhull, Arun Kumar, Agnieszka Wylomanska
Affiliation: Department of Mathematics, Indian Institute of Technology Ropar, Rupnagar, Punjab, India; Hugo Steinhaus Center, Wroclaw University of Science and Technology, Wroclaw, Poland
Abstract: Autoregressive (AR) models are used to represent time-varying random processes in which the output depends linearly on previous terms and a stochastic term (the innovation). In the classical version, AR models are based on the normal distribution. However, this distribution does not allow describing data with outliers and asymmetric behavior. In this paper, we study AR models with normal inverse Gaussian (NIG) innovations. The NIG distribution belongs to the class of semi-heavy-tailed distributions with a wide range of shapes, and thus allows for describing real-life data with possible jumps. The expectation-maximization (EM) algorithm is used to estimate the parameters of the considered model. The efficacy of the estimation procedure is shown on simulated data. A comparative study is presented in which the classical estimation algorithms, namely the Yule-Walker and conditional least squares methods, are also incorporated alongside the EM method for model parameter estimation. The applications of the introduced model are demonstrated on real-life financial data.
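
To make the setup concrete, here is a small simulation of an AR(1) process with NIG innovations, fit with conditional least squares (one of the paper's baseline estimators). The NIG sampler uses the standard normal variance-mean mixture representation with an inverse Gaussian mixing variable; the parameter values are our illustrative assumptions, and a full EM fit is beyond this sketch.

```python
import numpy as np

rng = np.random.default_rng(42)

def rnig(n, alpha=2.0, beta=0.5, mu=0.0, delta=1.0):
    """Sample NIG innovations via the mixture X = mu + beta*V + sqrt(V)*Z,
    with V ~ InverseGaussian(delta/gamma, delta^2) and Z ~ N(0, 1)."""
    gamma = np.sqrt(alpha ** 2 - beta ** 2)
    v = rng.wald(delta / gamma, delta ** 2, size=n)   # IG mixing variable
    return mu + beta * v + np.sqrt(v) * rng.normal(size=n)

# Simulate an AR(1) with NIG innovations: x_t = phi * x_{t-1} + eps_t.
phi_true, n = 0.7, 5000
eps = rnig(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi_true * x[t - 1] + eps[t]

# Conditional least squares: regress x_t on x_{t-1} (with centering, since
# asymmetric NIG innovations have a nonzero mean).
xc, yc = x[:-1] - x[:-1].mean(), x[1:] - x[1:].mean()
phi_cls = np.dot(yc, xc) / np.dot(xc, xc)
print(f"true phi = {phi_true}, CLS estimate = {phi_cls:.3f}")
```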

【17】 Modelling stochastic time delay for regression analysis
Link: https://arxiv.org/abs/2111.06403

Authors: Juan Camilo Orduz, Aaron Pickering
Note: GitHub repository this https URL
Abstract: Systems with a stochastic time delay between the input and output present a number of unique challenges. Time-domain noise leads to irregular alignments, obfuscates relationships, and attenuates inferred coefficients. To handle these challenges, we introduce a maximum likelihood regression model that regards stochastic time delay as an "error" in the time domain. For a certain subset of problems, by modelling both prediction and time errors it is possible to outperform traditional models. Through a simulated experiment on a univariate problem, we demonstrate results that significantly improve upon Ordinary Least Squares (OLS) regression.
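
The attenuation effect mentioned above is easy to reproduce. In the toy simulation below, the regressor reaches the response after a random lag of 0-2 steps, and naive OLS that ignores the lag recovers only a fraction of the true coefficient. All numbers are illustrative; this is a demonstration of the problem, not the authors' likelihood model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 10000, 2.0
x = rng.normal(size=n)

delay = rng.integers(0, 3, size=n)            # stochastic lag of 0, 1, or 2 steps
idx = np.clip(np.arange(n) - delay, 0, None)  # index of the input that drives y_t
y = beta * x[idx] + 0.1 * rng.normal(size=n)

# Naive OLS aligned at lag zero: since x is iid, x[idx] correlates with x_t
# only when delay == 0 (probability 1/3), so the slope shrinks to ~beta/3.
beta_ols = np.dot(x, y) / np.dot(x, x)
print(f"true beta = {beta}, naive OLS = {beta_ols:.2f}")
```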

【18】 Cooperative multi-agent reinforcement learning for high-dimensional nonequilibrium control
Link: https://arxiv.org/abs/2111.06875

Authors: Shriram Chennakesavalu, Grant M. Rotskoff
Affiliation: Department of Chemistry, Stanford University
Note: To appear in the Fourth Workshop on Machine Learning and the Physical Sciences (NeurIPS 2021)
Abstract: Experimental advances enabling high-resolution external control create new opportunities to produce materials with exotic properties. In this work, we investigate how a multi-agent reinforcement learning approach can be used to design external control protocols for self-assembly. We find that a fully decentralized approach performs remarkably well even with a "coarse" level of external control. More importantly, we see that a partially decentralized approach, where we include information about the local environment, allows us to better control our system towards some target distribution. We explain this by analyzing our approach as a partially observed Markov decision process. With a partially decentralized approach, the agent is able to act more presciently, both by preventing the formation of undesirable structures and by better stabilizing target structures as compared to a fully decentralized approach.

【19】 Generalized active information: extensions to unbounded domains
Link: https://arxiv.org/abs/2111.06865

Authors: Daniel Andrés Díaz-Pachón, Robert J. Marks II
Affiliation: Division of Biostatistics, University of Miami, Miami, FL; Department of Engineering, Baylor University, Waco, TX
Abstract: In the last three decades, several measures of complexity have been proposed. Up to this point, most such measures have only been developed for finite spaces. In these scenarios the baseline distribution is uniform. This makes sense because, among other things, the uniform distribution is the measure of maximum entropy over the relevant space. Active information traditionally assumes a finite-interval universe of discourse but can be extended to other cases where maximum entropy is defined. Illustrating this is the purpose of this paper. Disequilibrium from maximum entropy, measured as active information, can be evaluated from baselines with unbounded support.

【20】 Rigid Motion Invariant Statistical Shape Modeling based on Discrete Fundamental Forms
Link: https://arxiv.org/abs/2111.06850

Authors: Felix Ambellan, Stefan Zachow, Christoph von Tycowicz
Affiliation: Zuse Institute Berlin, Germany; 1000shapes GmbH, Berlin, Germany; Freie Universität Berlin, Germany
Note: To be published in Medical Image Analysis 73
Abstract: We present a novel approach for nonlinear statistical shape modeling that is invariant under Euclidean motion and thus alignment-free. By analyzing metric distortion and curvature of shapes as elements of Lie groups in a consistent Riemannian setting, we construct a framework that reliably handles large deformations. Due to the explicit character of Lie group operations, our non-Euclidean method is very efficient, allowing for fast and numerically robust processing. This facilitates Riemannian analysis of large shape populations accessible through longitudinal and multi-site imaging studies, providing increased statistical power. Additionally, as planar configurations form a submanifold in shape space, our representation allows for effective estimation of quasi-isometric surface flattenings. We evaluate the performance of our model w.r.t. shape-based classification of hippocampus and femur malformations due to Alzheimer's disease and osteoarthritis, respectively. In particular, we outperform state-of-the-art classifiers based on geometric deep learning as well as statistical shape modeling, especially in the presence of sparse training data. To provide insight into the model's ability to capture biological shape variability, we carry out an analysis of specificity and generalization ability.

【21】 ADCB: An Alzheimer's disease benchmark for evaluating observational estimators of causal effects
Link: https://arxiv.org/abs/2111.06811

Authors: Newton Mwai Kinyanjui, Fredrik D. Johansson
Affiliation: Chalmers University of Technology, Sweden
Note: Machine Learning for Health (ML4H) - Extended Abstract
Abstract: Simulators make unique benchmarks for causal effect estimation since they do not rely on unverifiable assumptions or the ability to intervene on real-world systems, but are often too simple to capture important aspects of real applications. We propose a simulator of Alzheimer's disease aimed at modeling the intricacies of healthcare data while enabling benchmarking of causal effect and policy estimators. We fit the system to the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and ground hand-crafted components in results from comparative treatment trials and observational treatment patterns. The simulator includes parameters which alter the nature and difficulty of the causal inference tasks, such as latent variables, effect heterogeneity, length of observed history, behavior policy, and sample size. We use the simulator to compare estimators of average and conditional treatment effects.

【22】 A Minimax Learning Approach to Off-Policy Evaluation in Partially Observable Markov Decision Processes
Link: https://arxiv.org/abs/2111.06784

Authors: Chengchun Shi, Masatoshi Uehara, Nan Jiang
Affiliation: Department of Statistics, London School of Economics and Political Science; Department of Computer Science, Cornell University; Department of Computer Science, University of Illinois Urbana-Champaign
Abstract: We consider off-policy evaluation (OPE) in Partially Observable Markov Decision Processes (POMDPs), where the evaluation policy depends only on observable variables and the behavior policy depends on unobservable latent variables. Existing works either assume no unmeasured confounders, or focus on settings where both the observation and the state spaces are tabular. As such, these methods suffer from either a large bias in the presence of unmeasured confounders, or a large variance in settings with continuous or large observation/state spaces. In this work, we first propose novel identification methods for OPE in POMDPs with latent confounders, by introducing bridge functions that link the target policy's value and the observed data distribution. In fully observable MDPs, these bridge functions reduce to the familiar value functions and marginal density ratios between the evaluation and the behavior policies. We next propose minimax estimation methods for learning these bridge functions. Our proposal permits general function approximation and is thus applicable to settings with continuous or large observation/state spaces. Finally, we construct three estimators based on these estimated bridge functions, corresponding to a value function-based estimator, a marginalized importance sampling estimator, and a doubly robust estimator. Their non-asymptotic and asymptotic properties are investigated in detail.

【23】 Review of Pedestrian Trajectory Prediction Methods: Comparing Deep Learning and Knowledge-based Approaches
Link: https://arxiv.org/abs/2111.06740

Authors: Raphael Korbmacher, Antoine Tordeux
Affiliation: University of Wuppertal
Note: 20 pages, 7 tables, 4 figures
Abstract: In crowd scenarios, predicting the trajectories of pedestrians is a complex and challenging task that depends on many external factors. The topology of the scene and the interactions between the pedestrians are just some of them. Due to advancements in data science and data collection technologies, deep learning methods have recently become a research hotspot in numerous domains. Therefore, it is not surprising that more and more researchers apply these methods to predict the trajectories of pedestrians. This paper compares these relatively new deep learning algorithms with classical knowledge-based models that are widely used to simulate pedestrian dynamics. It provides a comprehensive literature review of both approaches, explores technical and application-oriented differences, and addresses open questions as well as future development directions. Our investigations point out that the pertinence of knowledge-based models for predicting local trajectories is nowadays questionable because of the high accuracy of the deep learning algorithms. Nevertheless, the ability of deep learning algorithms to handle large-scale simulation and the description of collective dynamics remains to be demonstrated. Furthermore, the comparison shows that a combination of both approaches (a hybrid approach) seems promising for overcoming disadvantages such as the missing explainability of the deep learning approach.

【24】 Causal Multi-Agent Reinforcement Learning: Review and Open Problems
Link: https://arxiv.org/abs/2111.06721

Authors: St John Grimbly, Jonathan Shock, Arnu Pretorius
Affiliation: University of Cape Town; InstaDeep
Note: Accepted at the CoopAI NeurIPS Workshop 2021
Abstract: This paper serves to introduce the reader to the field of multi-agent reinforcement learning (MARL) and its intersection with methods from the study of causality. We highlight key challenges in MARL and discuss these in the context of how causal methods may assist in tackling them. We promote moving toward a 'causality first' perspective on MARL. Specifically, we argue that causality can offer improved safety, interpretability, and robustness, while also providing strong theoretical guarantees for emergent behaviour. We discuss potential solutions for common challenges, and use this context to motivate future research directions.

【25】 Learning Quantile Functions without Quantile Crossing for Distribution-free Time Series Forecasting
Link: https://arxiv.org/abs/2111.06581

Authors: Youngsuk Park, Danielle Maddix, François-Xavier Aubet, Kelvin Kan, Jan Gasthaus, Yuyang Wang
Affiliation: AWS AI Labs
Note: 24 pages
Abstract: Quantile regression is an effective technique to quantify uncertainty, fit challenging underlying distributions, and often provide full probabilistic predictions through joint learning over multiple quantile levels. A common drawback of these joint quantile regressions, however, is "quantile crossing", which violates the desirable monotone property of the conditional quantile function. In this work, we propose the Incremental (Spline) Quantile Function, I(S)QF, a flexible and efficient distribution-free quantile estimation framework that resolves quantile crossing with a simple neural network layer. Moreover, I(S)QF interpolates/extrapolates to predict arbitrary quantile levels that differ from the underlying training ones. Equipped with the analytical evaluation of the continuous ranked probability score of I(S)QF representations, we apply our methods to NN-based time series forecasting cases, where the savings of the expensive re-training costs for non-trained quantile levels are particularly significant. We also provide a generalization error analysis of our proposed approaches under the sequence-to-sequence setting. Lastly, extensive experiments demonstrate the improvement of consistency and accuracy errors over other baselines.
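
The crossing-free construction can be sketched in a few lines: predict a base quantile plus strictly positive increments, then interpolate between knot levels to read off arbitrary quantiles. This is a simplified stand-in for the paper's spline parameterization; the function names and uniform knot placement are our assumptions.

```python
import numpy as np

def softplus(z):
    # softplus(z) = log(1 + exp(z)) > 0, so every increment is positive.
    return np.logaddexp(0.0, z)

def incremental_quantile_fn(raw_params, q_levels):
    """Crossing-free quantile curve: the first parameter sets the lowest
    quantile value, and softplus-transformed increments are accumulated so
    the predicted quantiles are monotone in the level by construction."""
    base, raw_inc = raw_params[0], raw_params[1:]
    knots = base + np.concatenate([[0.0], np.cumsum(softplus(raw_inc))])
    knot_levels = np.linspace(0.0, 1.0, len(knots))
    # Linear interpolation reads off *any* quantile level, trained or not.
    return np.interp(q_levels, knot_levels, knots)

raw = np.random.default_rng(0).normal(size=10)   # stand-in for network outputs
print(incremental_quantile_fn(raw, [0.1, 0.5, 0.9, 0.99]))  # non-decreasing
```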

【26】 Multi-Step Budgeted Bayesian Optimization with Unknown Evaluation Costs
Link: https://arxiv.org/abs/2111.06537

Authors: Raul Astudillo, Daniel R. Jiang, Maximilian Balandat, Eytan Bakshy, Peter I. Frazier
Affiliation: Cornell University; Facebook
Note: In Advances in Neural Information Processing Systems, 2021
Abstract: Bayesian optimization (BO) is a sample-efficient approach to optimizing costly-to-evaluate black-box functions. Most BO methods ignore how evaluation costs may vary over the optimization domain. However, these costs can be highly heterogeneous and are often unknown in advance. This occurs in many practical settings, such as hyperparameter tuning of machine learning algorithms or physics-based simulation optimization. Moreover, those few existing methods that acknowledge cost heterogeneity do not naturally accommodate a budget constraint on the total evaluation cost. This combination of unknown costs and a budget constraint introduces a new dimension to the exploration-exploitation trade-off, where learning about the cost incurs the cost itself. Existing methods do not reason about the various trade-offs of this problem in a principled way, leading often to poor performance. We formalize this claim by proving that the expected improvement and the expected improvement per unit of cost, arguably the two most widely used acquisition functions in practice, can be arbitrarily inferior with respect to the optimal non-myopic policy. To overcome the shortcomings of existing approaches, we propose the budgeted multi-step expected improvement, a non-myopic acquisition function that generalizes classical expected improvement to the setting of heterogeneous and unknown evaluation costs. Finally, we show that our acquisition function outperforms existing methods in a variety of synthetic and real problems.
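
For reference, the two acquisition functions the paper proves can be arbitrarily suboptimal are one-liners under a Gaussian posterior. The toy posterior means, standard deviations, and costs below are our illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best):
    """Closed-form EI for minimization: E[max(best - f(x), 0)] under a
    Gaussian posterior f(x) ~ N(mu, sigma^2)."""
    sigma = np.maximum(sigma, 1e-12)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# EI per unit cost (the common cost-aware heuristic): dividing by a known or
# estimated cost can flip the choice of the next evaluation point.
mu    = np.array([0.2, 0.0, -0.1])
sigma = np.array([0.3, 0.5,  0.1])
cost  = np.array([1.0, 10.0, 2.0])
best  = 0.1
ei = expected_improvement(mu, sigma, best)
print("EI picks point", np.argmax(ei), "| EI-per-cost picks point", np.argmax(ei / cost))
```

Both rules are myopic: neither reasons about how spending budget now constrains future evaluations, which is the gap the budgeted multi-step acquisition is designed to close.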

【27】 A Time-Series Scale Mixture Model of EEG with a Hidden Markov Structure for Epileptic Seizure Detection
Link: https://arxiv.org/abs/2111.06526

Authors: Akira Furui, Tomoyuki Akiyama, Toshio Tsuji
Affiliation: Graduate School of Advanced Science and Engineering, Hiroshima University; Department of Child Neurology, Okayama University Hospital
Note: Accepted at EMBC 2021
Abstract: In this paper, we propose a time-series stochastic model based on a scale mixture distribution with Markov transitions to detect epileptic seizures in electroencephalography (EEG). In the proposed model, an EEG signal at each time point is assumed to be a random variable following a Gaussian distribution. The covariance matrix of the Gaussian distribution is weighted with a latent scale parameter, which is also a random variable, resulting in stochastic fluctuations of covariances. By introducing a latent state variable with a Markov chain in the background of this stochastic relationship, time-series changes in the distribution of latent scale parameters can be represented according to the state of epileptic seizures. In an experiment, we evaluated the performance of the proposed model for seizure detection using EEGs with multiple frequency bands decomposed from a clinical dataset. The results demonstrated that the proposed model can detect seizures with high sensitivity and outperformed several baselines.

【28】 An Enhanced Adaptive Bi-clustering Algorithm through Building a Shielding Complex Sub-Matrix
Link: https://arxiv.org/abs/2111.06524

Authors: Kaijie Xu
Affiliation: School of Electronic Engineering, Xidian University, Xi'an, China
Abstract: Bi-clustering refers to the task of finding sub-matrices (indexed by a group of columns and a group of rows) within a matrix of data such that the elements of each sub-matrix (data and features) are related in a particular way, for instance, that they are similar with respect to some metric. In this paper, after analyzing the well-known Cheng and Church (CC) bi-clustering algorithm, which has been proved to be an effective tool for mining co-expressed genes, and summarizing its limitations (such as the interference of random numbers in the greedy strategy, and the neglect of overlapping bi-clusters), we propose a novel enhancement of the adaptive bi-clustering algorithm, in which a shielding complex sub-matrix is constructed to shield the bi-clusters that have been obtained and to discover overlapping bi-clusters. In the shielding complex sub-matrix, the imaginary and the real parts are used to shield and extend the new bi-clusters, respectively, and to form a series of optimal bi-clusters. To ensure that the obtained bi-clusters have no effect on the bi-clusters already produced, a unit impulse signal is introduced to adaptively detect and shield the constructed bi-clusters. Meanwhile, to effectively shield the null data (zero-size data), another unit impulse signal is set for adaptive detection and shielding. In addition, we add a shielding factor to adjust the mean squared residue score of the rows (or columns) that contain the shielded data of the sub-matrix, to decide whether to retain them or not. We offer a thorough analysis of the developed scheme. The experimental results are in agreement with the theoretical analysis. The results obtained on a publicly available real microarray dataset show the enhancement of the bi-clustering performance thanks to the proposed method.
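
The mean squared residue (MSR) score that both the CC algorithm and this enhancement manipulate is compact enough to show directly; the planted-bicluster demo below is our own illustration, not the paper's shielding scheme.

```python
import numpy as np

def mean_squared_residue(A, rows, cols):
    """Cheng-Church mean squared residue of the sub-matrix A[rows][:, cols]:
    H = mean_{i,j} (a_ij - a_iJ - a_Ij + a_IJ)^2, where a_iJ, a_Ij, a_IJ are
    the row, column, and overall means. H = 0 for a perfectly coherent
    (additive) bicluster."""
    S = A[np.ix_(rows, cols)]
    row_mean = S.mean(axis=1, keepdims=True)
    col_mean = S.mean(axis=0, keepdims=True)
    return np.mean((S - row_mean - col_mean + S.mean()) ** 2)

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 15))
# Plant an additive bicluster (row effect + column effect) in rows 0-4, cols 0-3.
A[:5, :4] = rng.normal(size=(5, 1)) + rng.normal(size=(1, 4))
print("planted block MSR:", round(mean_squared_residue(A, range(5), range(4)), 4))
print("random  block MSR:", round(mean_squared_residue(A, range(5, 10), range(4, 8)), 4))
```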

【29】 Variational Auto-Encoder Architectures that Excel at Causal Inference
Link: https://arxiv.org/abs/2111.06486

Authors: Negar Hassanpour, Russell Greiner
Affiliation: Department of Computing Science, University of Alberta; Amii, Edmonton, Canada
Abstract: Estimating causal effects from observational data (at either an individual or a population level) is critical for making many types of decisions. One approach to address this task is to learn decomposed representations of the underlying factors of data; this becomes significantly more challenging when there are confounding factors (which influence both the cause and the effect). In this paper, we take a generative approach that builds on the recent advances in Variational Auto-Encoders to simultaneously learn those underlying factors as well as the causal effects. We propose a progressive sequence of models, where each improves over the previous one, culminating in the Hybrid model. Our empirical results demonstrate that the performance of all three proposed models is superior to both state-of-the-art discriminative as well as other generative approaches in the literature.

【30】 Unique Bispectrum Inversion for Signals with Finite Spectral/Temporal Support
Link: https://arxiv.org/abs/2111.06479

Authors: Samuel Pinilla, Kumar Vijay Mishra, Brian M. Sadler
Abstract: Retrieving a signal from the Fourier transform of its third-order statistics, or bispectrum, arises in a wide range of signal processing problems. Conventional methods do not provide a unique inversion of the bispectrum. In this paper, we present an approach that uniquely recovers signals with finite spectral support (band-limited signals) from at least $3B$ measurements of its bispectrum function (BF), where $B$ is the signal's bandwidth. Our approach also extends to time-limited signals. We propose a two-step trust-region algorithm that minimizes a non-convex objective function. First, we approximate the signal by a spectral algorithm. Then, we refine the attained initialization based upon a sequence of gradient iterations. Numerical experiments suggest that our proposed algorithm is able to estimate band-/time-limited signals from their BF for both complete and undersampled observations.
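
For orientation, the bispectrum function itself is a third-order frequency-domain statistic that can be estimated directly with an FFT. The single-realization "biperiodogram" below is a standard textbook estimator, shown only to make the measurements concrete; it is not the paper's recovery algorithm.

```python
import numpy as np

def bispectrum(x):
    """Direct estimate B(f1, f2) = X(f1) X(f2) conj(X(f1 + f2)) from one
    realization; in practice one averages over many segments to reduce
    variance. Frequencies wrap modulo the signal length."""
    n = len(x)
    X = np.fft.fft(x)
    f = np.arange(n)
    return (X[f][:, None] * X[f][None, :]
            * np.conj(X[(f[:, None] + f[None, :]) % n]))

x = np.random.default_rng(0).normal(size=64)
B = bispectrum(x)
print(B.shape, B[1, 2])   # (64, 64) complex-valued bispectrum samples
```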

Originally published 2021-11-15. Shared from the WeChat public account arXiv每日学术速递.
