
arXiv Daily Digest: Statistics [9.1]

By: 公众号-arXiv每日学术速递
Published 2021-09-16 14:54:02


stat (Statistics): 21 papers in total

【1】 Uniform Consistency in Nonparametric Mixture Models Link: https://arxiv.org/abs/2108.14003

Authors: Bryon Aragam, Ruiyi Yang Affiliation: University of Chicago Abstract: We study uniform consistency in nonparametric mixture models as well as closely related mixture of regression (also known as mixed regression) models, where the regression functions are allowed to be nonparametric and the error distributions are assumed to be convolutions of a Gaussian density. We construct uniformly consistent estimators under general conditions while simultaneously highlighting several pain points in extending existing pointwise consistency results to uniform results. The resulting analysis turns out to be nontrivial, and several novel technical tools are developed along the way. In the case of mixed regression, we prove $L^1$ convergence of the regression functions while allowing for the component regression functions to intersect arbitrarily often, which presents additional technical challenges. We also consider generalizations to general (i.e. non-convolutional) nonparametric mixtures.

【2】 Bayesian learning of forest and tree graphical models Link: https://arxiv.org/abs/2108.13992

Authors: Edmund Jones Affiliation: University of Bristol, School of Mathematics, Statistics Group (a compact version of the PhD thesis, not the official version submitted to the university) Note: PhD thesis, 2013, University of Bristol; 148 pages, 24 figures Abstract: In Bayesian learning of Gaussian graphical model structure, it is common to restrict attention to certain classes of graphs and approximate the posterior distribution by repeatedly moving from one graph to another, using MCMC or methods such as stochastic shotgun search (SSS). I give two corrected versions of an algorithm for non-decomposable graphs and discuss random graph distributions, in particular as prior distributions. The main topic of the thesis is Bayesian structure-learning with forests or trees. Restricting attention to these graphs can be justified using theorems on random graphs. I describe how to use the Chow–Liu algorithm and the Matrix Tree Theorem to find the MAP forest and certain quantities in the posterior distribution on trees. I give adapted versions of MCMC and SSS for approximating the posterior distribution for forests and trees, and systems for storing these graphs so that it is easy to choose moves to neighbouring graphs. Experiments show that SSS with trees does well when the true graph is a tree or sparse graph. SSS with trees or forests does better than SSS with decomposable graphs in certain cases. Graph priors improve detection of hubs but need large ranges of probabilities. MCMC on forests fails to mix well and MCMC on trees is slower than SSS. (For a longer abstract see the thesis.)
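The Chow–Liu step mentioned in the abstract reduces finding the best tree to a maximum-weight spanning tree over pairwise mutual information estimates. A minimal sketch of that spanning-tree step (Kruskal's algorithm in pure Python; the `weights` values in the usage example are illustrative, not from the thesis):

```python
def max_spanning_tree(n, weights):
    """Kruskal's algorithm for a maximum-weight spanning tree.

    weights: dict mapping pairs (i, j) with i < j to an edge weight,
    e.g. pairwise mutual information estimates between the n variables.
    Returns the list of edges in the tree.
    """
    parent = list(range(n))

    def find(x):
        # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # greedily take the heaviest edges that do not create a cycle
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree


# Illustrative mutual-information weights for 4 variables
weights = {(0, 1): 0.9, (0, 2): 0.1, (1, 2): 0.5, (2, 3): 0.7, (1, 3): 0.2}
tree = max_spanning_tree(4, weights)
```

Thresholding weak edges instead of forcing a full tree yields a forest, which is the thesis's other model class.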

【3】 A Subsampling Based Method for Causal Discovery on Discrete Data Link: https://arxiv.org/abs/2108.13984

Authors: Austin Goddard, Yu Xiang Affiliation: University of Utah, Salt Lake City, UT, USA Note: Accepted to the 2021 IEEE Statistical Signal Processing Workshop Abstract: Inferring causal directions on discrete and categorical data is an important yet challenging problem. Even though the additive noise models (ANMs) approach can be adapted to discrete data, its functional structure assumptions make it not applicable to categorical data. Inspired by the principle that the cause and mechanism are independent, various methods have been developed, leveraging independence tests such as the distance correlation measure. In this work, we take an alternative perspective and propose a subsampling-based method to test the independence between the generating scheme of the cause and that of the mechanism. Our methodology works for both discrete and categorical data and does not impose any functional model on the data, making it a more flexible approach. To demonstrate the efficacy of our methodology, we compare it with existing baselines over various synthetic and real data experiments.

【4】 Comparison of cause specific rate functions of panel count data with multiple modes of recurrence Link: https://arxiv.org/abs/2108.13967

Authors: Sankaran P. G., Ashlin Mathew P. M., Sreedevi E. P. Note: arXiv admin note: substantial text overlap with arXiv:2106.01636 Abstract: Panel count data refer to data arising from studies concerning recurrent events where study subjects are observed only at distinct time points. If these study subjects are exposed to recurrent events of several types, we obtain panel count data with multiple modes of recurrence. In the present paper, we propose a nonparametric test for comparing cause specific rate functions of panel count data with more than one mode of recurrence. The test can also be employed to assess whether the competing modes of recurrence affect the recurrence times identically. We carry out simulation studies to evaluate the performance of the test statistic in a finite sample setup. The proposed test is illustrated using two real life panel count data sets, one arising from a medical follow-up study on a skin cancer chemoprevention trial and the other from a warranty database for a fleet of automobiles.

【5】 Decision Tree-Based Predictive Models for Academic Achievement Using College Students' Support Networks Link: https://arxiv.org/abs/2108.13947

Authors: Anthony Frazier, Joethi Silva, Rachel Meilak, Indranil Sahoo, David Chan, Michael Broda Affiliations: Weber State University, George Mason University, Loyola Marymount University, Virginia Commonwealth University Abstract: In this study, we examine a set of primary data collected from 484 students enrolled in a large public university in the Mid-Atlantic United States region during the early stages of the COVID-19 pandemic. The data, called Ties data, included students' demographic and support network information. The support network data comprised information that highlighted the type of support (i.e. emotional or educational; routine or intense). Using this data set, models for predicting students' academic achievement, quantified by their self-reported GPA, were created using Chi-Square Automatic Interaction Detection (CHAID), a decision tree algorithm, and cforest, a random forest algorithm that uses conditional inference trees. We compare the methods' accuracy and variation in the set of important variables suggested by each algorithm. Each algorithm found different variables important for different student demographics, with some overlap. For White students, different types of educational support were important in predicting academic achievement, while for non-White students, different types of emotional support were important. The presence of differing types of routine support was important in predicting academic achievement for cisgender women, while differing types of intense support were important for cisgender men.
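CHAID grows its tree by choosing, at each node, the candidate split whose contingency table shows the strongest chi-square association with the outcome. A minimal sketch of that splitting criterion (pure Python; the example table is hypothetical):

```python
def chi_square(table):
    """Pearson chi-square statistic for a 2D contingency table,
    the splitting criterion at the heart of CHAID."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    stat = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            expected = r * c / total  # expected count under independence
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical split: rows = support type, columns = high/low GPA
stat = chi_square([[10, 20], [20, 10]])
```

In CHAID the statistic is converted to a (Bonferroni-adjusted) p-value and the most significant split wins; cforest replaces this with permutation-based conditional inference tests.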

【6】 On Proximal Causal Inference With Synthetic Controls Link: https://arxiv.org/abs/2108.13935

Authors: Xu Shi, Wang Miao, Mengtong Hu, Eric Tchetgen Tchetgen Affiliations: Department of Biostatistics, University of Michigan; Department of Probability and Statistics, Peking University; Statistics Department, The Wharton School, University of Pennsylvania Abstract: We aim to evaluate the causal impact of an intervention when time series data on a single treated unit and multiple untreated units are available, in pre- and post-treatment periods. In their seminal work, Abadie and Gardeazabal (2003) and Abadie et al. (2010) proposed a synthetic control (SC) method as an approach to relax the parallel trend assumption on which difference-in-differences methods typically rely. The term "synthetic control" refers to a weighted average of control units built to match the treated unit's pre-treatment outcome trajectory, such that the SC's post-treatment outcome predicts the treated unit's unobserved potential outcome under no treatment. The treatment effect is then estimated as the difference in post-treatment outcomes between the treated unit and the SC. A common practice to estimate the weights is to regress the pre-treatment outcome process of the treated unit on that of the control units using ordinary or weighted (constrained) least squares. However, it has been established that these estimators can fail to be consistent under standard time series asymptotic regimes. Furthermore, inferences with synthetic controls are typically performed via placebo tests, which lack formal justification. In this paper, we introduce a proximal causal inference framework for the synthetic control approach and formalize identification and inference for both the synthetic control weights and the treatment effect on the treated unit. We further extend the traditional linear interactive fixed effects model to accommodate more general nonlinear models allowing for binary and count outcomes, which are currently under-studied in the synthetic control literature. We illustrate our proposed methods with simulation studies and an application to the evaluation of the 1990 German Reunification.
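For intuition on the weight-estimation step the abstract describes, here is the special case of two control units, where the constrained least-squares weight has a closed form (a simplified sketch under strong assumptions, not the proximal method proposed in the paper):

```python
def sc_weights_two_controls(treated, c1, c2):
    """Closed-form synthetic-control weight for two control units:
    minimize ||treated - w*c1 - (1-w)*c2||^2 over pre-treatment
    periods, subject to 0 <= w <= 1 (weights sum to one)."""
    num = sum((t - b) * (a - b) for t, a, b in zip(treated, c1, c2))
    den = sum((a - b) ** 2 for a, b in zip(c1, c2))
    w = num / den if den > 0 else 0.5
    w = min(1.0, max(0.0, w))  # project onto the unit interval
    return w, 1.0 - w


# Hypothetical pre-treatment outcome series
w1, w2 = sc_weights_two_controls([2, 3, 4], [1, 2, 3], [3, 4, 5])
```

With more than two controls this becomes a quadratic program over the simplex; the paper's point is that such least-squares weights can be inconsistent, motivating the proximal framework.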

【7】 On the interpretation of black-box default prediction models: an Italian Small and Medium Enterprises case Link: https://arxiv.org/abs/2108.13914

Authors: Lisa Crosato, Caterina Liberati, Marco Repetto Abstract: Academic research and the financial industry have recently paid great attention to Machine Learning algorithms due to their power to solve complex learning tasks. In the field of firms' default prediction, however, the lack of interpretability has prevented the extensive adoption of the black-box type of models. To overcome this drawback and maintain the high performance of black-boxes, this paper relies on a model-agnostic approach. Accumulated Local Effects and Shapley values are used to shape the predictors' impact on the likelihood of default and rank them according to their contribution to the model outcome. Prediction is achieved by two Machine Learning algorithms (eXtreme Gradient Boosting and FeedForward Neural Network) compared with three standard discriminant models. Results show that our analysis of the Italian Small and Medium Enterprises manufacturing industry benefits from the overall highest classification power of the eXtreme Gradient Boosting algorithm without giving up a rich interpretation framework.
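Shapley values, one of the two model-agnostic tools used here, can be computed exactly for small feature sets by enumerating coalitions. A brute-force sketch (pure Python; the `value` function stands in for a model's output restricted to a feature subset, which in practice is approximated, e.g. via SHAP, rather than enumerated):

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values by enumerating all coalitions.
    `value` maps a frozenset of feature names to the model output
    (e.g. predicted default probability) using only those features."""
    n = len(features)
    phi = {}
    for f in features:
        rest = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for s in combinations(rest, k):
                s = frozenset(s)
                # coalition weight |S|!(n-|S|-1)!/n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Toy additive "model": each feature contributes a fixed amount
contrib = {'a': 1.0, 'b': 2.0}
phi = shapley_values(['a', 'b'], lambda s: sum(contrib[f] for f in s))
```

For an additive game like the toy example, each feature's Shapley value equals its own contribution, which is a handy sanity check.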

【8】 Regional estimates of reproduction numbers with application to COVID-19 Link: https://arxiv.org/abs/2108.13842

Authors: Jan Pablo Burgard, Stefan Heyder, Thomas Hotz, Tyll Krueger Note: 8 pages, 2 figures Abstract: In the last year many public health decisions were based on real-time monitoring of the spread of the ongoing COVID-19 pandemic. For this one often considers the reproduction number, which measures the number of secondary cases produced by a single infectious individual. While estimates of this quantity are readily available on the national level, subnational estimates, e.g. on the county level, pose more difficulties since only few incidences occur there. However, as countermeasures to the pandemic are usually enforced on the subnational level, such estimates are of great interest for assessing the efficacy of the measures taken and for guiding future policy. We present a novel extension of the well established estimator of the country level reproduction number to the county level by applying techniques from small-area estimation. This new estimator yields sensible estimates of reproduction numbers both on the country and county level. It can handle low and highly variable case counts on the county level, and may be used to distinguish local outbreaks from more widespread ones. We demonstrate the capabilities of our novel estimator by a simulation study and by applying the estimator to German case data.
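For background, a crude version of the country-level reproduction-number estimate that such analyses build on follows the renewal equation: divide today's incidence by a generation-interval-weighted sum of past incidence (a sketch only; the paper's small-area estimator for counties is considerably more involved):

```python
def reproduction_number(cases, gen_interval):
    """Crude instantaneous reproduction number via the renewal equation:
    R_t = I_t / sum_s w_s * I_{t-s}, where w is the generation-interval
    pmf (w_1 = probability of a one-day lag between infections, etc.)."""
    estimates = []
    for t in range(len(gen_interval), len(cases)):
        denom = sum(w * cases[t - s] for s, w in enumerate(gen_interval, start=1))
        estimates.append(cases[t] / denom if denom > 0 else float("nan"))
    return estimates


# Toy series: cases doubling per step with all generation mass at lag 1
r = reproduction_number([1, 2, 4, 8], [1.0])
```

On county-level data the denominator is often near zero, which is exactly the instability that motivates borrowing strength across regions via small-area estimation.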

【9】 Spatial Blind Source Separation in the Presence of a Drift Link: https://arxiv.org/abs/2108.13813

Authors: Christoph Muehlmann, Peter Filzmoser, Klaus Nordhausen Affiliations: Institute of Statistics & Mathematical Methods in Economics, Vienna University of Technology, Austria; Department of Mathematics and Statistics, University of Jyväskylä, Finland Abstract: Multivariate measurements taken at different spatial locations occur frequently in practice. Proper analysis of such data needs to consider not only dependencies on-site but also dependencies in and in-between variables as a function of spatial separation. Spatial Blind Source Separation (SBSS) is a recently developed unsupervised statistical tool that deals with such data by assuming that the observable data is formed by a linear latent variable model. In SBSS the latent variable is assumed to be constituted by weakly stationary random fields which are uncorrelated. Such a model is appealing as further analysis can be carried out on the marginal distributions of the latent variables, interpretations are straightforward as the model is assumed to be linear, and not all components of the latent field might be of interest, which acts as a form of dimension reduction. The weak stationarity assumption of SBSS implies that the mean of the data is constant for all sample locations, which might be too restrictive in practical applications. Therefore, an adaptation of SBSS that uses scatter matrices based on differences was recently suggested in the literature. In our contribution we formalize these ideas, suggest an adapted SBSS method and show its usefulness on synthetic and real data.

【10】 Variable Selection in Regression Model with AR(p) Error Terms Based on Heavy Tailed Distributions Link: https://arxiv.org/abs/2108.13755

Authors: Yetkin Tuaç, Olcay Arslan Affiliation: Ankara University, Department of Statistics Abstract: Parameter estimation and variable selection are two central issues in regression analysis. While traditional variable selection methods require prior estimation of the model parameters, penalized methods simultaneously carry out parameter estimation and variable selection. Therefore, penalized variable selection methods are of great interest and have been extensively studied in the literature. However, most of the papers in the literature are limited to regression models with uncorrelated error terms and a normality assumption. In this study, we combine parameter estimation and variable selection in regression models with an autoregressive error term by using different penalty functions under a heavy tailed error distribution assumption. We conduct a simulation study and a real data example to show the performance of the estimators.

【11】 Disentanglement Analysis with Partial Information Decomposition Link: https://arxiv.org/abs/2108.13753

Authors: Seiya Tokui, Issei Sato Affiliation: The University of Tokyo Abstract: Given data generated from multiple factors of variation that cooperatively transform their appearance, disentangled representations aim at reversing the process by mapping data to multiple random variables that individually capture distinct generative factors. As the concept is intuitive but abstract, one needs to quantify it with disentanglement metrics to evaluate and compare the quality of disentangled representations between different models. Current disentanglement metrics are designed to measure the concentration, e.g., absolute deviation, variance, or entropy, of each variable conditioned by each generative factor, optionally offset by the concentration of its marginal distribution, and compare it among different variables. When representations consist of more than two variables, such metrics may fail to detect the interplay between them as they only measure pairwise interactions. In this work, we use the Partial Information Decomposition framework to evaluate information sharing between more than two variables, and build a framework, including a new disentanglement metric, for analyzing how the representations encode the generative factors distinctly, redundantly, and cooperatively. We establish an experimental protocol to assess how each metric evaluates increasingly entangled representations and confirm through artificial and realistic settings that the proposed metric correctly responds to entanglement. Our results are expected to promote information theoretic understanding of disentanglement and lead to further development of metrics as well as learning methods.

【12】 Evaluating the Robustness of Off-Policy Evaluation Link: https://arxiv.org/abs/2108.13703

Authors: Yuta Saito, Takuma Udagawa, Haruka Kiyohara, Kazuki Mogi, Yusuke Narita, Kei Tateno Affiliations: Hanjuku-Kaso Co., Ltd., Tokyo, Japan; Sony Group Corporation; Tokyo Institute of Technology; Stanford University, California, United States; Yale University, Connecticut, United States Note: Accepted at RecSys2021 Abstract: Off-policy Evaluation (OPE), or offline evaluation in general, evaluates the performance of hypothetical policies leveraging only offline log data. It is particularly useful in applications where the online interaction involves high-stakes and expensive settings such as precision medicine and recommender systems. Since many OPE estimators have been proposed and some of them have hyperparameters to be tuned, there is an emerging challenge for practitioners to select and tune OPE estimators for their specific application. Unfortunately, identifying a reliable estimator from results reported in research papers is often difficult because the current experimental procedure evaluates and compares the estimators' performance on a narrow set of hyperparameters and evaluation policies. Therefore, it is difficult to know which estimator is safe and reliable to use. In this work, we develop Interpretable Evaluation for Offline Evaluation (IEOE), an experimental procedure to evaluate OPE estimators' robustness to changes in hyperparameters and/or evaluation policies in an interpretable manner. Then, using the IEOE procedure, we perform an extensive evaluation of a wide variety of existing estimators on Open Bandit Dataset, a large-scale public real-world dataset for OPE. We demonstrate that our procedure can evaluate the estimators' robustness to the hyperparameter choice, helping us avoid using unsafe estimators. Finally, we apply IEOE to real-world e-commerce platform data and demonstrate how to use our protocol in practice.
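As background on what an OPE estimator does, the simplest one, inverse propensity scoring (IPS), reweights logged rewards by the ratio of the evaluation policy's action probability to the logging policy's propensity (a minimal sketch; IEOE is a procedure for stress-testing estimators like this one, not this estimator itself):

```python
def ips_estimate(logs, target_policy):
    """Inverse propensity scoring, the simplest OPE estimator.
    logs: iterable of (context, action, reward, logging_propensity).
    target_policy(context, action): the evaluation policy's probability
    of choosing `action` in `context`."""
    total = 0.0
    for context, action, reward, prop in logs:
        total += target_policy(context, action) / prop * reward
    return total / len(logs)


# Toy log: uniform logging policy over two actions (propensity 0.5),
# evaluation policy deterministically picks action 1
logs = [(0, 1, 1.0, 0.5), (0, 0, 0.0, 0.5)]
value = ips_estimate(logs, lambda c, a: 1.0 if a == 1 else 0.0)
```

Variants such as self-normalized IPS or doubly robust estimators trade this unbiasedness for lower variance, which is precisely the kind of sensitivity IEOE is designed to surface.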

【13】 Double Machine Learning for Partially Linear Mixed-Effects Models with Repeated Measurements Link: https://arxiv.org/abs/2108.13657

Authors: Corinne Emmenegger, Peter Bühlmann Affiliation: Seminar for Statistics, ETH Zürich Abstract: Traditionally, spline or kernel approaches in combination with parametric estimation are used to infer the linear coefficient (fixed effects) in a partially linear mixed-effects model (PLMM) for repeated measurements. Using machine learning algorithms allows us to incorporate more complex interaction structures and high-dimensional variables. We employ double machine learning to cope with the nonparametric part of the PLMM: the nonlinear variables are regressed out nonparametrically from both the linear variables and the response. This adjustment can be performed with any machine learning algorithm, for instance random forests. The adjusted variables satisfy a linear mixed-effects model, where the linear coefficient can be estimated with standard linear mixed-effects techniques. We prove that the estimated fixed effects coefficient converges at the parametric rate and is asymptotically Gaussian distributed and semiparametrically efficient. Empirical examples demonstrate our proposed algorithm. We present two simulation studies and analyze a dataset with repeated CD4 cell counts from HIV patients. Software code for our method is available in the R-package dmlalg.
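The partialling-out idea described here, regressing the nonlinear variables out of both the response and the linear variable and then fitting residuals on residuals, can be sketched as follows (the `fit_predict` learner is a placeholder for any ML method such as random forests, and the mixed-effects machinery of the paper is omitted):

```python
def residual_on_residual(y, d, x, fit_predict):
    """Partialling-out step of double machine learning for a single
    linear variable d: regress x out of both y and d with any learner,
    then estimate the linear coefficient from the residuals.
    fit_predict(x, target) returns in-sample predictions of target."""
    y_res = [yi - fi for yi, fi in zip(y, fit_predict(x, y))]
    d_res = [di - fi for di, fi in zip(d, fit_predict(x, d))]
    num = sum(a * b for a, b in zip(d_res, y_res))
    den = sum(a * a for a in d_res)
    return num / den


# Toy learner: predict the mean (a stand-in for a real ML regressor)
mean_learner = lambda x, t: [sum(t) / len(t)] * len(t)
theta = residual_on_residual([0, 2, 4, 6], [0, 1, 2, 3], [0, 0, 0, 0], mean_learner)
```

In the actual method, cross-fitting (estimating the nuisance functions on held-out folds) is used so that overfitting of the learner does not bias the coefficient.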

【14】 Comments on "The clinical meaningfulness of a treatment's effect on a time-to-event variable" Link: https://arxiv.org/abs/2108.13575

Authors: Christos Argyropoulos Affiliation: Department of Internal Medicine, Division of Nephrology, University of New Mexico, Albuquerque, New Mexico Note: 1 figure Abstract: Some years ago, Snapinn and Jiang [1] considered the interpretation and pitfalls of absolute versus relative treatment effect measures in analyses of time-to-event outcomes. Through specific examples and analytical considerations based solely on the exponential and the Weibull distributions they reach two conclusions: 1) that the commonly used criteria for clinical effectiveness, the ARR (Absolute Risk Reduction) and the median (survival time) difference (MD), directly contradict each other, and 2) that cost-effectiveness depends only on the hazard ratio (HR) and the shape parameter (in the Weibull case) but not on the overall baseline risk of the population. Though provocative, the first conclusion does not apply to either of the two special cases considered or even more generally, while the second conclusion is strictly correct only for the exponential case. Therefore, the implication inferred by the authors, i.e. that all measures of absolute treatment effect are of little value compared with the relative measure of the hazard ratio, is not of general validity, and hence both absolute and relative measures should continue to be used when appraising clinical evidence.
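Under the exponential model discussed in the comment, both quantities have closed forms: with survival $S(t) = e^{-\lambda t}$, the median is $\ln 2/\lambda$ and the ARR is a difference of survival probabilities at a chosen horizon. A small sketch for exploring how the two measures move together (parameter values in the usage line are illustrative only):

```python
from math import exp, log

def arr_and_md(lam_control, hazard_ratio, horizon):
    """Compare two effect measures under exponential survival
    S(t) = exp(-lam * t): the absolute risk reduction (ARR) at a
    fixed horizon and the median survival-time difference (MD)."""
    lam_treat = lam_control * hazard_ratio
    # ARR: difference in survival probability at the horizon
    arr = exp(-lam_treat * horizon) - exp(-lam_control * horizon)
    # MD: difference in medians, ln(2)/lambda for each arm
    md = log(2) / lam_treat - log(2) / lam_control
    return arr, md


# Illustrative: control median of 1 time unit, HR = 0.5, 1-unit horizon
arr, md = arr_and_md(log(2), 0.5, 1.0)
```

For a beneficial treatment (HR < 1) both ARR and MD are positive here, which illustrates the comment's point that the two absolute measures need not contradict each other in the exponential case.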

【15】 New Highly Efficient High-Breakdown Estimator of Multivariate Scatter and Location for Elliptical Distributions Link: https://arxiv.org/abs/2108.13567

Authors: Justin A. Fishbone, Lamine Mili Affiliation: Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Falls Church, VA, USA Abstract: High-breakdown-point estimators of multivariate location and shape matrices, such as the MM-estimator with smooth hard rejection and the Rocke S-estimator, are generally designed to have high efficiency at the Gaussian distribution. However, many phenomena are non-Gaussian, and these estimators can therefore have poor efficiency. This paper proposes a new tunable S-estimator, termed the S-q estimator, for the general class of symmetric elliptical distributions, a class containing many common families such as the multivariate Gaussian, t-, Cauchy, Laplace, hyperbolic, and normal inverse Gaussian distributions. Across this class, the S-q estimator is shown to generally provide higher maximum efficiency than other leading high-breakdown estimators while maintaining the maximum breakdown point. Furthermore, its robustness is demonstrated to be on par with these leading estimators while also being more stable with respect to initial conditions. From a practical viewpoint, these properties make the S-q broadly applicable for practitioners. This is demonstrated with an example application -- the minimum-variance optimal allocation of financial portfolio investments.

【16】 A Tutorial on Time-Dependent Cohort State-Transition Models in R using a Cost-Effectiveness Analysis Example Link: https://arxiv.org/abs/2108.13552

Authors: Fernando Alarid-Escudero, Eline M. Krijkamp, Eva A. Enns, Alan Yang, M. G. Myriam Hunink, Petros Pechlivanoglou, Hawre Jalal Note: 41 pages, 12 figures. arXiv admin note: text overlap with arXiv:2001.07824 Abstract: This tutorial shows how to implement time-dependent cohort state-transition models (cSTMs) to conduct cost-effectiveness analyses (CEA) in R, where transition probabilities and rewards vary by time. We account for two types of time dependency: time since the start of the simulation (simulation-time dependency) and time spent in a health state (state residence dependency). We illustrate how to conduct a CEA of multiple strategies based on a time-dependent cSTM using a previously published cSTM, including probabilistic sensitivity analyses. We also demonstrate how to compute various epidemiological outcomes of interest from the outputs generated from the cSTM, such as survival probability and disease prevalence. We present both the mathematical notation and the R code to execute the calculations. This tutorial builds upon an introductory tutorial that introduces time-independent cSTMs using a CEA example in R. We provide an up-to-date public code repository for broader implementation.
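The core computation of a cSTM is the cohort trace: repeatedly multiplying the state-occupancy vector by a (possibly cycle-dependent) transition matrix. A minimal Python sketch of that idea (the tutorial itself uses R; the two-state example and its probabilities are hypothetical):

```python
def cohort_trace(p0, transition, n_cycles):
    """Markov cohort trace: state occupancy over cycles, the core
    computation of a cohort state-transition model (cSTM).
    transition(t) returns the transition matrix for cycle t, which
    allows simulation-time dependency."""
    trace = [p0]
    for t in range(n_cycles):
        P = transition(t)
        prev = trace[-1]
        # next occupancy: row vector times transition matrix
        nxt = [sum(prev[i] * P[i][j] for i in range(len(prev)))
               for j in range(len(prev))]
        trace.append(nxt)
    return trace


# Hypothetical 2-state model: Healthy -> Dead with probability 0.1/cycle
P = [[0.9, 0.1], [0.0, 1.0]]
trace = cohort_trace([1.0, 0.0], lambda t: P, 2)
```

Costs and QALYs are then computed by applying per-state rewards to each row of the trace; state-residence dependency additionally requires expanding states by time spent in them (tunnel states).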

【17】 Optimal Daily Trading of Battery Operations Using Arbitrage Spreads Link: https://arxiv.org/abs/2108.13511

Authors: Ekaterina Abramova, Derek Bunn Note: 23 pages, 6 figures, MDPI Energies Abstract: An important revenue stream for electric battery operators is often arbitraging the hourly price spreads in the day-ahead auction. The optimal approach to this is challenging if risk is a consideration, as this requires the estimation of density functions. Since the hourly prices are not normal and not independent, creating spread densities from the difference of separately estimated price densities is generally intractable. Thus, forecasts of all intraday hourly spreads were directly specified as an upper triangular matrix containing densities. The model was a flexible four-parameter distribution used to produce dynamic parameter estimates conditional upon exogenous factors, most importantly wind, solar and the day-ahead demand forecasts. These forecasts supported the optimal daily scheduling of a storage facility, operating on single and multiple cycles per day. The optimization is innovative in its use of spread trades rather than hourly prices, which, this paper argues, is more attractive in reducing risk. In contrast to the conventional approach of trading the daily peak and trough, multiple trades are found to be profitable and opportunistic depending upon the weather forecasts.
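Ignoring densities and risk, the deterministic skeleton of a single-cycle spread trade is: charge at the cheapest hour and discharge at a later, more expensive one. A toy sketch of that selection (hypothetical prices; the paper's optimization works with full spread densities and multiple cycles instead):

```python
def best_single_cycle(prices):
    """Best single charge/discharge cycle: buy (charge) at hour i and
    sell (discharge) at a later hour j, maximizing prices[j] - prices[i].
    Returns (spread, buy_hour, sell_hour); (0.0, None, None) if no
    profitable spread exists."""
    best = (0.0, None, None)
    min_i, min_price = 0, prices[0]
    for j in range(1, len(prices)):
        spread = prices[j] - min_price
        if spread > best[0]:
            best = (spread, min_i, j)
        if prices[j] < min_price:  # new cheapest hour seen so far
            min_i, min_price = j, prices[j]
    return best


# Hypothetical day-ahead hourly prices (EUR/MWh)
trade = best_single_cycle([30, 20, 45, 40])
```

Extending this to multiple cycles per day, round-trip efficiency, and uncertainty in the spreads is what turns the toy problem into the stochastic scheduling problem the paper addresses.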

【18】 Bubblewrap: Online tiling and real-time flow prediction on neural manifolds Link: https://arxiv.org/abs/2108.13941

Authors: Anne Draelos, Pranjal Gupta, Na Young Jun, Chaichontat Sriworarat, John Pearson Affiliations: Biostatistics & Bioinformatics, Psychology & Neuroscience, Neurobiology, Electrical & Computer Engineering, Duke University Abstract: While most classic studies of function in experimental neuroscience have focused on the coding properties of individual neurons, recent developments in recording technologies have resulted in an increasing emphasis on the dynamics of neural populations. This has given rise to a wide variety of models for analyzing population activity in relation to experimental variables, but direct testing of many neural population hypotheses requires intervening in the system based on current neural state, necessitating models capable of inferring neural state online. Existing approaches, primarily based on dynamical systems, require strong parametric assumptions that are easily violated in the noise-dominated regime and do not scale well to the thousands of data channels in modern experiments. To address this problem, we propose a method that combines fast, stable dimensionality reduction with a soft tiling of the resulting neural manifold, allowing dynamics to be approximated as a probability flow between tiles. This method can be fit efficiently using online expectation maximization, scales to tens of thousands of tiles, and outperforms existing methods when dynamics are noise-dominated or feature multi-modal transition probabilities. The resulting model can be trained at kiloHertz data rates, produces accurate approximations of neural dynamics within minutes, and generates predictions on submillisecond time scales. It retains predictive performance throughout many time steps into the future and is fast enough to serve as a component of closed-loop causal experiments.
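A heavily simplified caricature of the tiling-and-flow idea: soft-assign each observation to a set of tiles, then accumulate soft transition counts between consecutive assignments. Bubblewrap itself adapts tile locations online with expectation maximization after a fast dimensionality reduction; here the tile centers, bandwidth, and drifting 1-D trajectory are all fixed hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed tile centers on a 1-D "manifold" (the real method adapts these online)
centers = np.array([0.0, 1.0, 2.0, 3.0])
K = len(centers)

def soft_assign(x, bandwidth=0.5):
    """Soft responsibility of each tile for observation x."""
    w = np.exp(-0.5 * ((x - centers) / bandwidth) ** 2)
    return w / w.sum()

# Noisy rightward drift along the manifold, clipped to the tiled range
xs = np.clip(np.cumsum(rng.normal(0.3, 0.1, 50)), 0.0, 3.0)

T = np.ones((K, K)) * 1e-3            # transition counts with a small prior
for a, b in zip(xs[:-1], xs[1:]):
    T += np.outer(soft_assign(a), soft_assign(b))   # soft transition update

P = T / T.sum(axis=1, keepdims=True)  # row-normalized flow between tiles
```

Rows of `P` are predictive distributions over the next tile given the current one, which is the object used for online flow prediction.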

【19】 Learning Optimal Prescriptive Trees from Observational Data Link: https://arxiv.org/abs/2108.13628

Authors: Nathanael Jo, Sina Aghaei, Andrés Gómez, Phebe Vayanos Affiliations: University of Southern California Abstract: We consider the problem of learning an optimal prescriptive tree (i.e., a personalized treatment assignment policy in the form of a binary tree) of moderate depth, from observational data. This problem arises in numerous socially important domains such as public health and personalized medicine, where interpretable and data-driven interventions are sought based on data gathered in deployment, through passive collection of data, rather than from randomized trials. We propose a method for learning optimal prescriptive trees using mixed-integer optimization (MIO) technology. We show that under mild conditions our method is asymptotically exact in the sense that it converges to an optimal out-of-sample treatment assignment policy as the number of historical data samples tends to infinity. This sets us apart from existing literature on the topic which either requires data to be randomized or imposes stringent assumptions on the trees. Based on extensive computational experiments on both synthetic and real data, we demonstrate that our asymptotic guarantees translate to significant out-of-sample performance improvements even in finite samples.
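To make the object being learned concrete, here is a naive depth-1 prescriptive tree fit by exhaustive threshold search on synthetic data, assigning each leaf the treatment with the higher observed mean outcome. The paper's actual method solves an MIO formulation and handles observational confounding; this sketch sidesteps both, and its plug-in value estimate is valid only because the simulated treatment below is assigned at random.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: one covariate x, binary treatment t, outcome y.
# Treatment helps when x > 0.5 and hurts otherwise (hypothetical effect).
n = 500
x = rng.uniform(0, 1, n)
t = rng.integers(0, 2, n)
y = np.where(x > 0.5, t * 1.0, -t * 1.0) + rng.normal(0, 0.1, n)

def leaf_policy(mask):
    """Assign the treatment with the higher mean outcome inside a leaf."""
    means = [y[mask & (t == a)].mean() for a in (0, 1)]
    return int(np.argmax(means))

# Depth-1 "prescriptive tree": exhaustive search over split thresholds
best = None
for thr in np.linspace(0.1, 0.9, 17):
    left, right = x <= thr, x > thr
    aL, aR = leaf_policy(left), leaf_policy(right)
    # Naive plug-in estimate of the policy value (valid: no confounding here)
    value = (y[left & (t == aL)].mean() * left.mean()
             + y[right & (t == aR)].mean() * right.mean())
    if best is None or value > best[0]:
        best = (value, thr, aL, aR)

value, threshold, action_left, action_right = best
```

The search recovers the intuitive policy: withhold treatment below the split near 0.5, treat above it.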

【20】 Fast Multi-label Learning Link: https://arxiv.org/abs/2108.13570

Authors: Xiuwen Gong, Dong Yuan, Wei Bao Affiliations: The University of Sydney Abstract: Embedding approaches have become one of the most pervasive techniques for multi-label classification. However, the training process of embedding methods usually involves a complex quadratic or semidefinite programming problem, or the model may even involve an NP-hard problem. Thus, such methods are prohibitive on large-scale applications. More importantly, much of the literature has already shown that the binary relevance (BR) method is usually good enough for some applications. Unfortunately, BR runs slowly due to its linear dependence on the size of the input data. The goal of this paper is to provide a simple method, yet with provable guarantees, which can achieve competitive performance without a complex training process. To achieve our goal, we provide a simple stochastic sketch strategy for multi-label classification and present theoretical results from both algorithmic and statistical learning perspectives. Our comprehensive empirical studies corroborate our theoretical findings and demonstrate the superiority of the proposed methods.
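A generic sketch-and-solve illustration of the underlying idea, trading L per-label (binary relevance) fits for one compressed solve; this is not the paper's algorithm, and all problem sizes below are hypothetical. The label matrix is projected through a random sketch, a single least-squares model is fit to the sketched targets, and predictions are decoded with the sketch's pseudo-inverse.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic multi-label problem: n samples, d features, L labels in {-1, +1}
n, d, L, k = 400, 20, 50, 40
W_true = rng.normal(size=(d, L))
X = rng.normal(size=(n, d))
Y = np.sign(X @ W_true)

# Random sketch: compress L label columns into k < L sketched targets,
# so one regression solve replaces L separate per-label fits.
S = rng.normal(size=(L, k)) / np.sqrt(k)
W, *_ = np.linalg.lstsq(X, Y @ S, rcond=None)

# Decode back to label space with the sketch pseudo-inverse, then threshold
Y_hat = np.sign(X @ W @ np.linalg.pinv(S))
accuracy = (Y_hat == Y).mean()
```

The compression error is governed by how much label structure survives the random projection, which is the kind of trade-off the paper's theoretical results quantify.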

【21】 Bayesian Inference of Globular Cluster Properties Using Distribution Functions Link: https://arxiv.org/abs/2108.13491

Authors: Gwendolyn M. Eadie, Jeremy J. Webb, Jeffrey S. Rosenthal Affiliations: David A. Dunlap Department of Astronomy & Astrophysics, University of Toronto; Department of Statistical Sciences, University of Toronto, Toronto, ON Notes: submitted to ApJ; 21 pages, 11 figures Abstract: We present a Bayesian inference approach to estimating the cumulative mass profile and mean squared velocity profile of a globular cluster given the spatial and kinematic information of its stars. Mock globular clusters with a range of sizes and concentrations are generated from lowered isothermal dynamical models, from which we test the reliability of the Bayesian method to estimate model parameters through repeated statistical simulation. We find that given unbiased star samples, we are able to reconstruct the cluster parameters used to generate the mock cluster and the cluster's cumulative mass and mean velocity squared profiles with good accuracy. We further explore how strongly biased sampling, which could be the result of observing constraints, may affect this approach. Our tests indicate that if we instead have biased samples, then our estimates can be off in certain ways that are dependent on cluster morphology. Overall, our findings motivate obtaining samples of stars that are as unbiased as possible. This may be achieved by combining information from multiple telescopes (e.g., Hubble and Gaia), but will require careful modeling of the measurement uncertainties through a hierarchical model, which we plan to pursue in future work.
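The inferential machinery can be illustrated with a drastically reduced toy: a random-walk Metropolis sampler recovering a single velocity-dispersion parameter from mock line-of-sight velocities under a Gaussian likelihood and a flat prior. The paper's lowered isothermal models and full profile reconstructions are far richer; every number below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Mock kinematic data: line-of-sight stellar velocities drawn with a true
# 1-D dispersion sigma_true (a toy stand-in for a lowered isothermal model)
sigma_true = 7.0                      # km/s, hypothetical
v = rng.normal(0.0, sigma_true, 300)

def log_post(sigma):
    """Gaussian log-likelihood plus a flat prior on sigma > 0."""
    if sigma <= 0:
        return -np.inf
    return -len(v) * np.log(sigma) - 0.5 * np.sum(v**2) / sigma**2

# Random-walk Metropolis over sigma
samples, sigma = [], 5.0
lp = log_post(sigma)
for _ in range(5000):
    prop = sigma + rng.normal(0, 0.3)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject step
        sigma, lp = prop, lp_prop
    samples.append(sigma)

posterior_mean = np.mean(samples[1000:])       # discard burn-in
```

The same accept/reject skeleton carries over to the real problem; what changes is the likelihood, which there comes from a distribution function evaluated at each star's position and velocity.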

Machine translation, for reference only.

This article is part of the Tencent Cloud self-media sharing program, shared from a WeChat official account. Originally published 2021-09-01. For infringement concerns, contact cloudcommunity@tencent.com for removal.

This article is shared from the arXiv每日学术速递 WeChat official account.
