
2.5 Norms

Sometimes we need to measure the size of a vector. In machine learning, we usually measure the size of vectors using a function called a norm. Formally, the $L^p$ norm is given by

$$\|x\|_p = \left( \sum_i |x_i|^p \right)^{1/p}$$


for p ∈ R, p ≥ 1.
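The definition above can be sketched directly in NumPy. This is a minimal illustration, not a reference implementation; the helper name `lp_norm` is hypothetical and is not part of the original text.

```python
import numpy as np

def lp_norm(x, p):
    """Return the L^p norm of a 1-D array x for a real p >= 1 (hypothetical helper)."""
    if p < 1:
        raise ValueError("p must be >= 1 for the L^p norm")
    x = np.asarray(x, dtype=float)
    # Sum of |x_i|^p, then the p-th root.
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

x = np.array([3.0, -4.0])
print(lp_norm(x, 1))  # 7.0
print(lp_norm(x, 2))  # 5.0
```

For everyday use, `np.linalg.norm(x, ord=p)` computes the same quantity for 1-D arrays.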


Norms, including the $L^p$ norm, are functions mapping vectors to non-negative values. On an intuitive level, the norm of a vector x measures the distance from the origin to the point x. More rigorously, a norm is any function f that satisfies the following properties:


f(x) = 0 ⇒ x = 0

f(x + y) ≤ f(x) + f(y) (the triangle inequality)

∀α ∈ R, f(αx) = |α| f(x)
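The three properties can be spot-checked numerically. The sketch below uses the Euclidean norm from `np.linalg.norm` as the candidate f and a few random vectors; the variable names and the tolerance are assumptions made for illustration, and a check on sample vectors is not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.linalg.norm  # Euclidean norm for 1-D arrays by default

x = rng.normal(size=5)
y = rng.normal(size=5)
alpha = -2.5

# Property 1 (spot check): the zero vector has norm 0, a nonzero vector does not.
zero_only_at_origin = bool(f(np.zeros(5)) == 0.0 and f(x) > 0.0)
# Property 2: triangle inequality, with a small tolerance for rounding.
triangle_holds = bool(f(x + y) <= f(x) + f(y) + 1e-12)
# Property 3: scaling by alpha scales the norm by |alpha|.
homogeneous = bool(np.isclose(f(alpha * x), abs(alpha) * f(x)))

print(zero_only_at_origin, triangle_holds, homogeneous)
```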

The $L^2$ norm, with p = 2, is known as the Euclidean norm, which is simply the Euclidean distance from the origin to the point identified by x. The $L^2$ norm is used so frequently in machine learning that it is often denoted simply as ||x||, with the subscript 2 omitted. It is also common to measure the size of a vector using the squared $L^2$ norm, which can be calculated simply as $x^\top x$.
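As a small illustration of the relationship between the Euclidean norm and its square, the sketch below computes both for an example vector chosen here (it does not come from the original text).

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])

l2 = np.linalg.norm(x)   # Euclidean norm: sqrt(1 + 4 + 4) = 3
squared_l2 = x @ x       # x^T x = 9, computed without a square root
```

Computing `x @ x` avoids the square root entirely, which is one reason the squared form is convenient in practice.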


The squared $L^2$ norm is more convenient to work with mathematically and computationally than the $L^2$ norm itself. For example, the derivative of the squared $L^2$ norm with respect to each element of x depends only on the corresponding element of x, while all the derivatives of the $L^2$ norm depend on the entire vector. In many contexts, the squared $L^2$ norm may be undesirable because it increases very slowly near the origin. In several machine learning applications, it is important to discriminate between elements that are exactly zero and elements that are small but nonzero. In these cases, we turn to a function that grows at the same rate in all locations, but that retains mathematical simplicity: the $L^1$ norm. The $L^1$ norm may be simplified to

$$\|x\|_1 = \sum_i |x_i|.$$
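The claim about derivatives can be made concrete. The sketch below writes out the two gradients by hand (2x for the squared Euclidean norm, x / ||x|| for the Euclidean norm itself) and changes only the second element of an example vector; the function names and test vectors are assumptions made for illustration.

```python
import numpy as np

def grad_squared_l2(x):
    # d/dx_i of sum_j x_j^2 is 2 * x_i: entry i depends only on x_i.
    return 2.0 * x

def grad_l2(x):
    # d/dx_i of sqrt(sum_j x_j^2) is x_i / ||x||_2: every entry needs the whole vector.
    # The gradient is undefined at x = 0; this sketch assumes x is nonzero.
    return x / np.linalg.norm(x)

x = np.array([3.0, 4.0])
x_changed = np.array([3.0, 0.0])  # only the second element differs

# First gradient entry of the squared norm is unaffected by the change: 6.0 both times.
g_sq_first = grad_squared_l2(x)[0]
g_sq_first_changed = grad_squared_l2(x_changed)[0]

# First gradient entry of the norm itself does change: 0.6 versus 1.0.
g_l2_first = grad_l2(x)[0]
g_l2_first_changed = grad_l2(x_changed)[0]
```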


The $L^1$ norm is commonly used in machine learning when the difference between zero and nonzero elements is very important. Every time an element of x moves away from 0 by ε, the $L^1$ norm increases by ε.
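To see the difference in growth near the origin, the following sketch moves one element of a zero vector away from 0 by a small ε and compares the increase in the sum of absolute values with the increase in the squared Euclidean norm. The value of `eps` and the vector length are arbitrary choices for illustration.

```python
import numpy as np

eps = 1e-3
x0 = np.zeros(3)
x1 = np.array([eps, 0.0, 0.0])  # one element moved away from 0 by eps

# Sum of absolute values grows by eps.
l1_increase = np.sum(np.abs(x1)) - np.sum(np.abs(x0))
# Squared Euclidean norm grows only by eps**2, which is far smaller for small eps.
sq_l2_increase = x1 @ x1 - x0 @ x0
```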


We sometimes measure the size of the vector by counting its number of nonzero elements. Some authors refer to this function as the "$L^0$ norm," but this is incorrect terminology. The number of nonzero entries in a vector is not a norm, because scaling the vector by a nonzero α does not change the number of nonzero entries, whereas a norm must scale by |α|. The $L^1$ norm is often used as a substitute for the number of nonzero entries.
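A short sketch of why the nonzero count fails the scaling property: multiplying the vector by a nonzero α leaves the count unchanged, while the sum of absolute values scales by |α|. The example vector and α are chosen here for illustration.

```python
import numpy as np

x = np.array([0.0, 2.0, 0.0, -1.0])
alpha = 3.0

nonzeros = np.count_nonzero(x)                 # 2
nonzeros_scaled = np.count_nonzero(alpha * x)  # still 2, not |alpha| * 2

l1 = np.sum(np.abs(x))                         # 3.0
l1_scaled = np.sum(np.abs(alpha * x))          # 9.0 = |alpha| * 3.0
```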


One other norm that commonly arises in machine learning is the $L^\infty$ norm, also known as the max norm. This norm simplifies to the absolute value of the element with the largest magnitude in the vector,

$$\|x\|_\infty = \max_i |x_i|.$$
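The max norm is straightforward to compute, and it can also be viewed as what the p-norm approaches as p grows. The sketch below checks both on an example vector; the choice p = 50 is an arbitrary assumption for illustration.

```python
import numpy as np

x = np.array([1.0, -7.0, 3.0])

# Absolute value of the largest-magnitude element.
max_norm = np.max(np.abs(x))

# NumPy exposes the same quantity via ord=np.inf.
max_norm_np = np.linalg.norm(x, np.inf)

# For a large p, the p-norm is already very close to the max norm.
p = 50
lp_large = np.sum(np.abs(x) ** p) ** (1.0 / p)
```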


Sometimes we may also wish to measure the size of a matrix. In the context of deep learning, the most common way to do this is with the otherwise obscure Frobenius norm:

$$\|A\|_F = \sqrt{\sum_{i,j} A_{i,j}^2},$$


which is analogous to the $L^2$ norm of a vector.
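The analogy can be checked directly: the Frobenius norm of a matrix equals the Euclidean norm of its entries laid out as one long vector. The example matrix below is chosen for illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Square root of the sum of squared entries: sqrt(1 + 4 + 9 + 16) = sqrt(30).
fro = np.sqrt(np.sum(A ** 2))

# Same value via NumPy's built-in Frobenius norm.
fro_np = np.linalg.norm(A, "fro")

# Same value again as the Euclidean norm of the flattened matrix.
fro_as_vector = np.linalg.norm(A.ravel())
```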


The dot product of two vectors can be rewritten in terms of norms. Specifically,


$$x^\top y = \|x\|_2 \|y\|_2 \cos\theta,$$

where θ is the angle between x and y.
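Rearranging this identity gives a way to recover the angle between two nonzero vectors. The sketch below uses two example vectors that are 45 degrees apart; the clipping step is an added safeguard against rounding error and is not part of the original text.

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

# cos(theta) = x^T y / (||x||_2 * ||y||_2); assumes neither vector is zero.
cos_theta = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

# Clip to [-1, 1] so rounding error cannot push arccos out of its domain.
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))  # approximately pi / 4
```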


Original link: https://kuaibao.qq.com/s/20180516G1SSI100?refer=cp_1026