# Data segmentation algorithms: Univariate mean change and beyond

Data segmentation, also known as multiple change point analysis, has received considerable attention due to its importance in time series analysis and signal processing, with applications in a variety of fields including natural and social sciences, medicine, engineering and finance. In the first part of this survey, we review the existing literature on the canonical data segmentation problem, which aims at detecting and localising multiple change points in the mean of univariate time series. We provide an overview of popular methodologies with a focus on their computational complexity and theoretical properties. In particular, our theoretical discussion focuses on the separation rate, which characterises the change points that are detectable by a given procedure, and the localisation rate, which quantifies the precision of the corresponding change point estimators. We also distinguish whether a homogeneous or a multiscale viewpoint has been adopted in deriving these rates, and we highlight that the multiscale viewpoint provides the most general setting for investigating the optimality of data segmentation algorithms. Arguably, the canonical segmentation problem has been the most popular framework for proposing new data segmentation algorithms and studying their efficiency in recent decades. In the second part of this survey, we argue that an in-depth understanding of the strengths and weaknesses of methodologies for the change point problem in the simpler, univariate setting is a stepping stone for the development of methodologies for more complex problems. We illustrate this with a range of examples showcasing the connections between complex distributional changes and changes in the mean. We also discuss extensions towards high-dimensional change point problems, where we demonstrate that the challenges arising from high dimensionality are orthogonal to those arising from multiple change points.
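
To make the canonical problem concrete, the following is a minimal sketch of the classical CUSUM approach for locating a single change in the mean of a univariate sequence. It is an illustration under simplifying assumptions (at most one change, no test of whether the change is significant) and is not taken from the survey; the function names `cusum_statistics` and `estimate_single_change` are hypothetical.

```python
from itertools import accumulate
from math import sqrt


def cusum_statistics(x):
    """Return the CUSUM statistic for every candidate split k = 1, ..., n-1.

    For a split after the k-th observation, the statistic is
    sqrt(k * (n - k) / n) * |mean(x[:k]) - mean(x[k:])|.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    partial = list(accumulate(x))  # partial[k-1] = x[0] + ... + x[k-1]
    total = partial[-1]
    stats = []
    for k in range(1, n):
        left_mean = partial[k - 1] / k
        right_mean = (total - partial[k - 1]) / (n - k)
        stats.append(sqrt(k * (n - k) / n) * abs(left_mean - right_mean))
    return stats


def estimate_single_change(x):
    """Return (k_hat, statistic): the split maximising the CUSUM statistic.

    k_hat is the number of observations before the estimated change.
    Deciding whether the maximum is large enough to declare a change
    requires a threshold, which this sketch deliberately leaves out.
    """
    stats = cusum_statistics(x)
    k_hat = max(range(1, len(x)), key=lambda k: stats[k - 1])
    return k_hat, stats[k_hat - 1]


if __name__ == "__main__":
    # Noiseless toy signal: the mean jumps from 0 to 2 after 50 observations.
    signal = [0.0] * 50 + [2.0] * 50
    print(estimate_single_change(signal))
```

With several change points, a single global maximum of this kind is no longer sufficient. One well-known option is binary segmentation, which applies the same statistic recursively to the segments on either side of each detected change.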
