文章/答案/技术大牛

发布

社区首页 >专栏 >正交匹配追踪

正交匹配追踪

卡尔曼和玻尔兹曼谁曼

发布于 2023-12-01 00:53:00

2570

文章被收录于专栏：给永远比拿愉快给永远比拿愉快

这篇博文是在对Koredianto Usman《Introduction to Orthogonal Matching Pursuit》文章的翻译，后面附带了一些总结.

这篇文章是前面《Matching Pursuit (MP)》文章的继续. （注：原文中还是有一些小细节错误，请大家睁大眼睛阅读）

简介

考虑下面的问题：给定

x = \begin{bmatrix}-1.2 & 1 & 0\end{bmatrix}

和

\mathrm{A} = \begin{bmatrix}-0.707 & 0.8 & 0 \\ 0.707 & 0.6 & -1\end{bmatrix}

，计算

y = \mathrm{A} \cdot x

这相当简单！

y = \mathrm{A} \cdot x = \begin{bmatrix}-0.707 & 0.8 & 0 \\ 0.707 & 0.6 & -1\end{bmatrix} \cdot \begin{bmatrix}-1.2 & 1 & 0\end{bmatrix} = \begin{bmatrix}1.65 \\ -0.25\end{bmatrix}

现在是比较困难的部分！给定

y = \begin{bmatrix}1.65 \\ -0.25\end{bmatrix}

和

\mathrm{A} = \begin{bmatrix}-0.707 & 0.8 & 0 \\ 0.707 & 0.6 & -1\end{bmatrix}

，如何找到原始的

（或者尽可能接近原始的

）？

在压缩感知（Compressive Sensing）术语中，从

和

\mathrm{A}

中得到

叫做压缩，反之，从

和

\mathrm{A}

得到

叫重建（个人感觉原文中好像说反了，还是我理解错了？）. 重建问题不是一个很简单的问题。

一般地，

被称为original signal或者original vector，

\mathrm{A}

被称为compression matrix或者sensing matrix，

称为compressed signal或者compressed vector.

基本概念

和前面在MP算法中讨论过的一样，感知矩阵

\mathrm{A}

可以看做列向量的集合：

\mathrm{A} = \begin{bmatrix}-0.707 & 0.8 & 0 \\ 0.707 & 0.6 & -1\end{bmatrix} = \begin{bmatrix}b_1 & b_2 & b_3\end{bmatrix}

其中，

b_1 = \begin{bmatrix}-0.707 \\ 0.707\end{bmatrix}, \qquad b_2 = \begin{bmatrix}-0.8 \\ 0.6\end{bmatrix}, \qquad b_1 = \begin{bmatrix}0 \\ -1\end{bmatrix}

这些列被称为基（basis）或者原子（atom）.

现在，我们令

x = \begin{bmatrix}x_1 \\ x_2 \\ x_3\end{bmatrix}

，则

\mathrm{A} \cdot x = \begin{bmatrix}-0.707 \\ 0.707\end{bmatrix} \cdot x_1 + \begin{bmatrix}0.8 \\ 0.6\end{bmatrix} \cdot x_2 + \begin{bmatrix}0 -1\end{bmatrix} \cdot x_3 = y

从该方程可以看出

是使用

中的稀疏对原子的线性组合. 实际上，我们知道

x_1 = -1.2

，

x_2 = 1

，

x_3 = 0

. 换言之，对于得到

值，

b_1

的贡献值为-1.2，

b_2

为1，而

b_2

为0.

OMP算法和MP算法类似，都是从字典中找出哪一个原子对

值的贡献最大，接下来是哪个原子的贡献值大，以此类推. 我们现在知道这个过程需要

次迭代，

是字典中原子的个数. 在该例子中，

等于3.

计算原子的贡献值

字典中有三个原子，

b_1 = \begin{bmatrix}-0.707 \\ 0.707\end{bmatrix}, \qquad b_2 = \begin{bmatrix}-0.8 \\ 0.6\end{bmatrix}, \qquad b_1 = \begin{bmatrix}0 \\ -1\end{bmatrix}

且

y = \begin{bmatrix}1.65 \\ -0.25\end{bmatrix}

因此，各原子对

的贡献值为

<b_1, y> = \begin{bmatrix}-0.707 \\ 0.707\end{bmatrix}^{\mathrm{T}} \cdot \begin{bmatrix}1.65 \\ -0.25\end{bmatrix} = -0.707 \cdot -1.65 + 0.707 \cdot -0.25 = -1.34

<b_2, y> = 1.17

显然，原子

b_1

对

值的贡献最大为-1.34（按绝对值计算）.

在第一次迭代过程中，我们选择

b_1

作为基，基的系数为-1.34.

当然，我们也可以使用矩阵的形式计算点积：

w = \mathrm{A}^{\mathrm{T}} \cdot y = \begin{bmatrix}-0.707 & 0.707 \\ 0.8 & 0.6 \\ 0 & -1\end{bmatrix} \cdot \begin{bmatrix}1.65 \\ -0.25\end{bmatrix} = \begin{bmatrix}-1.34 \\ 1.17 \\ 0.25\end{bmatrix}

原子和

的几何意义如图1：

计算残差

现在已经选定了第一个基

b_1 = \begin{bmatrix}-0.707 \\ 0.707\end{bmatrix}

，贡献值系数为

\lambda_1 = -1.34

. 将该部分贡献值从

中减去，残差为

r = y - \lambda_1 \cdot b_1 = \begin{bmatrix}1.65 \\ -0.25\end{bmatrix} - (-1.34) \cdot \begin{bmatrix}-0.707 \\ 0.707\end{bmatrix} = \begin{bmatrix}0.7 \\ 0.7\end{bmatrix}

残差的几何意义如图2：

重复迭代

经过第一次迭代，我们选择出了基

b_1

. 将选择出的基加入新的矩阵

\mathrm{A}_{new}

，即：

\mathrm{A}_{new} = [b_1] = \begin{bmatrix}-0.707 \\ 0.707\end{bmatrix}

贡献系数矩阵可以写成

x_{rec} = \begin{bmatrix}-1.34 \\ 0 \\ 0\end{bmatrix}

第一个元素是第一个贡献值-1.34，对应着第一个第一个基

b_1

残差值为

r = \begin{bmatrix}0.7 \\ 0.7\end{bmatrix}

现在需要从剩下的基

b_2

和

b_3

中选择对残差

贡献最大的.

w = \begin{bmatrix}b_2 \\ b_3\end{bmatrix}^{\mathrm{T}} \cdot r = \begin{bmatrix}0.98 \\ -0.7\end{bmatrix}

所以，

b_2

的贡献值更大一些是0.98.

我们将选择的基

b_2

添加到

\mathrm{A}_{new}

得到

\mathrm{A}_{new} = \begin{bmatrix}b_1 & b_2\end{bmatrix} = \begin{bmatrix}-0.707 & 0.8 \\ 0.707 & 0.6\end{bmatrix}

接下来的步骤和MP算法稍微有些不同. 这里，我们需要计算

\mathrm{A}_{new}

对

的贡献（而不是

b_2

对

的贡献）

我们使用最小二乘法（Least Square Problem）解决该该贡献值问题：

如何求得

\lambda_1

和

\lambda_2

的值使得

\lambda_1 \cdot \begin{bmatrix}-0.707 \\ 0.707\end{bmatrix} + \lambda_2 \cdot \begin{bmatrix}0.8 \\ 0.6\end{bmatrix}

最接近

y = \begin{bmatrix}1.65 \\ -0.25\end{bmatrix}

？

表示成数学语言为：

\min \|\mathrm{A}_{new} \cdot \mathbf{\lambda} - y\|_2

在这个例子中，即

\min \|\begin{bmatrix}b_1 & b_2\end{bmatrix} = \begin{bmatrix}-0.707 & 0.8 \\ 0.707 & 0.6\end{bmatrix} \cdot \begin{bmatrix}\lambda_1 \\ \lambda_2\end{bmatrix} - \begin{bmatrix}1.65 \\ -0.25\end{bmatrix}\|_2

对于

\min \|\mathrm{A}_{new} \cdot \mathbf{\lambda} - y\|_2

的解可以直接使用公式：

\mathbf{\lambda} = \mathrm{A}_{new}^+ \cdot y

其中，

\mathrm{A}_{new}^+

是

\mathrm{A}_{new}

的谓逆，

\mathrm{A}_{new}^+ = (\mathrm{A}_{new}^{\mathrm{T}} \cdot \mathrm{A}_{new})^{-1} \cdot \mathrm{A}_{new}^{\mathrm{T}}

（注：可参看我的博文《Moore-Penrose广义逆矩阵》和《矛盾方程的最小二乘解法》）

在我们的例子中

\mathrm{A_{new}^+} = \begin{bmatrix}-0.707 & 0.8 \\ 0.707 & 0.6\end{bmatrix}^+ = \begin{bmatrix}-0.6062 & 0.8082 \\ 0.7143 & 0.7143\end{bmatrix}

可以使用MATLAB中的内置函数pinv()进行伪逆的计算.

得到

\mathrm{A_{new}^+}

后，可以计算

\mathbf{\lambda}

\mathbf{\lambda} = \mathrm{A}_{new}^+ \cdot y = \begin{bmatrix}-0.6062 & 0.8082 \\ 0.7143 & 0.7143\end{bmatrix} \cdot \begin{bmatrix}1.65 \\ -0.25\end{bmatrix} = \begin{bmatrix}-1.2 \\ 1\end{bmatrix}

得到

\mathbf{\lambda}

以后，更新系数矩阵

x_{rec}

为

x_{rec} = \begin{bmatrix}-1.2 \\ 1 \\ 0\end{bmatrix}

同样的，

\mathbf{\lambda}

在

x_{rec}

第一个元素和第二个元素对应着选择出来的第一个和第二个基.

得到

\mathrm{A}_{new}

和

\mathbf{\lambda}

以后，我们可以计算残差

r = y - \mathrm{A}_{new} \cdot \mathbf{\lambda} = \begin{bmatrix}1.65 \\ -0.25\end{bmatrix} - \begin{bmatrix}-0.707 & 0.8 \\ 0.707 & 0.6\end{bmatrix} \cdot \begin{bmatrix}—1.2 \\ 1\end{bmatrix} = \begin{bmatrix}0 \\ 0\end{bmatrix}

此时，残差值为

\begin{bmatrix}0 \\ 0\end{bmatrix}

，我们可以停止计算或者继续下一步计算（这里停止，可以节省一些计算工作）.

最后一次迭代

这一步不是必须的，因为残差已经完全消除了（很多实现OMP的软件都需要输入稀疏度

参数，这样经过

次迭代以后，无论残差大小都会停止迭代）.

重新组织信号系数

x_{rec} = \begin{bmatrix}-1.2 \\ 1 \\ 0\end{bmatrix}

，这刚好是原始的系数.

其他例子

给定

x = \begin{bmatrix}0 \\ 3 \\ 1 \\ 2\end{bmatrix}

和

\mathrm{A} = \begin{bmatrix}-0.8 & 0.3 & 1 & 0.4 \\ -0.2 & 0.4 & -0.3 & -0.4 \\ 0.2 & 1 & -0.1 & 0.8 \end{bmatrix}

，

求得

y = \mathrm{A} \cdot x = \begin{bmatrix}2.7 \\ 0.1 \\ 4.5\end{bmatrix}

现在给定

\mathrm{A}

和

，使用OMP算法求解

我们有4个基（原子）：

b_1 = \begin{bmatrix}-0.8 \\ -0.2 \\ 0.2\end{bmatrix} \qquad b_2 = \begin{bmatrix}0.3 \\ 0.4 \\ 1\end{bmatrix} \qquad b_3 = \begin{bmatrix}1 \\ -0.3 \\ -0.1\end{bmatrix} \qquad b_4 = \begin{bmatrix}0.4 \\ -0.4 \\ 0.8\end{bmatrix} \qquad

由于基向量的长度不是1，所以我们首先进行标准化.

\hat{b_1} = b_1 / \|b_1\| = \begin{bmatrix}-0.8 \\ -0.2 \\ 0.2\end{bmatrix} / \sqrt{(-0.8)^2 + (-0.4)^2 + (0.2)^2} = \begin{bmatrix}-0.9428 \\ -0.2357 \\ 0.2357\end{bmatrix}

\hat{b_2} = b_2 / \|b_2\| = \begin{bmatrix}0.2680 \\ 0.3578 \\ 0.8940\end{bmatrix}

\hat{b_3} = b_3 / \|b_3\| = \begin{bmatrix}0.9535 \\ -0.2860 \\ 0.0953\end{bmatrix}

\hat{b_2} = b_2 / \|b_2\| = \begin{bmatrix}0.4082 \\ -0.4082 \\ -0.8165\end{bmatrix}

标准化的基向量对

的贡献

w = \hat{\mathrm{A}}^{\mathrm{T}} \cdot y = \begin{bmatrix}-0.9428 & 0.2680 & 0.9535 & 0.4082 \\ -0.2357 & 0.3578 & -0.2860 & -0.4082\\ 0.2357 & 0.9840 & -0.0953 & -0.8165\end{bmatrix}^{\mathrm{T}} \cdot \begin{bmatrix}2.7 \\ 0.1 \\ 4.5\end{bmatrix} = \begin{bmatrix}-1.5085 \\ 4.7852 \\ 2.1167 \\ 4.7357\end{bmatrix}

第二个基向量

b_2

贡献值最大，所以将

b_2

加入到

\mathrm{A}_{new}

\mathrm{A}_{new} = b_2 = \begin{bmatrix}0.3 \\ 0.4 \\ 1\end{bmatrix}

利用最小二乘法计算

x_{rec}

L_p = \mathrm{A}_{new}^+ \cdot y = [4.28]

因为

L_p

对应着第二个基向量，所以

x_{rec} = \begin{bmatrix}0 \\ 4.28 \\ 0 \\ 0\end{bmatrix}

接下来计算残差

r = y - \mathrm{A}_{new} \cdot L_p = \begin{bmatrix}2.7 \\ 0.1 \\ 4.5\end{bmatrix} - \begin{bmatrix}0.3 \\ 0.4 \\ 1\end{bmatrix} \cdot 4.28 = \begin{bmatrix}1.416 \\ -1.612 \\ 0.22\end{bmatrix}

接下来重复第3步，计算

\hat{b_1}

，

\hat{b_1}

和

\hat{b_1}

对

的贡献.

w = \hat{\mathrm{A}}^{\mathrm{T}} \cdot r = \begin{bmatrix}-0.9428 & 0.9535 & 0.4082 \\ -0.2357 & -0.2860 & -0.4082\\ 0.2357 & -0.0953 & -0.8165\end{bmatrix}^{\mathrm{T}} \cdot \begin{bmatrix}1.416 \\ -1.612 \\ 0.22\end{bmatrix} = \begin{bmatrix}-0.9032 \\ 1.7902 \\ 1.4158\end{bmatrix}

选择第二个贡献最大的基

b_3

，其贡献值为1.7902.

将选择的

b_3

加入到

\mathrm{A}_{new}

中

\mathrm{A}_{new} = \begin{bmatrix}b_1 & b_2\end{bmatrix} = \begin{bmatrix}0.3 & 1\\ 0.4 & -0.3\\ 1 & -0.1\end{bmatrix}

用最小二乘法计算

L_p = \mathrm{A}_{new}^+ \cdot y = \begin{bmatrix}4.1702 \\ 1.7149\end{bmatrix}

这个

L_p

对应着

b_2

和

b_3

，因此

x_{rec} = \begin{bmatrix}0 \\ 4.172 \\ 1.7149 \\ 0\end{bmatrix}

接着计算残差

r = y - \mathrm{A}_{new} \cdot L_p = \begin{bmatrix}2.7 \\ 0.1 \\ 4.5\end{bmatrix} - \begin{bmatrix}0.3 & 1\\ 0.4 & -0.3\\ 1 & -0.1\end{bmatrix} \cdot \begin{bmatrix}4.172 \\ 1.7149\end{bmatrix} = \begin{bmatrix}-0.266 \\ -1.0536 \\ 0.5012\end{bmatrix}

重复步骤2，进行最后一次迭代
计算

b_1

和

b_4

的贡献值

w = \begin{bmatrix}\hat{b_1} & \hat{b_2}\end{bmatrix} \cdot r = \begin{bmatrix}-0.9428 & 0.4082 \\ -0.2357 & -0.4082\\ 0.2357 & -0.8165\end{bmatrix}^{\mathrm{T}} \cdot \begin{bmatrix}-0.266 \\ -1.0536 \\ 0.5012\end{bmatrix} = \begin{bmatrix}0.6172 \\ 0.7308\end{bmatrix}

这里

b_4

提供了最大贡献值0.7308.

b_4

加入

\mathrm{A}_{new} = \begin{bmatrix}0.3 & 1 & 0.4 \\ 0.4 & -0.3 & -0.4 \\ 1 & -0.1 & 0.8\end{bmatrix}

利用最小二乘计算

L_p = \mathrm{A}_{new}^+ \cdot y = \begin{bmatrix}3 \\ 2 \\1\end{bmatrix}

，对应的

x_{rec} = \begin{bmatrix}0 \\ 3 \\ 1 \\ 2\end{bmatrix}

接着计算残差

r = y - \mathrm{A}_{new} \cdot L_p = \begin{bmatrix}2.7 \\ 0.1 \\ 4.5\end{bmatrix} - \begin{bmatrix}0.3 & 1 & 0.4\\ 0.4 & -0.3 & -0.4\\ 1 & -0.1 & 0.8\end{bmatrix} \cdot \begin{bmatrix}3 \\ 1 \\ 2\end{bmatrix} = \begin{bmatrix}0 \\ 0 \\ 0\end{bmatrix}