local_response_normalization appears in the paper "ImageNet Classification with Deep Convolutional Neural Networks", which reports that this kind of normalization helps generalization.
$$b_{x,y}^i = \frac{a_{x,y}^i}{\Big(k+\alpha\sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)}(a_{x,y}^j)^2\Big)^\beta}$$

Here $a_{x,y}^i$ is the activation at position $(x,y)$ in channel $i$, $n$ is the size of the normalization window, and $N$ is the total number of channels. After a conv2d or pooling op we obtain a tensor of shape [batch_size, height, width, channels]. Think of the channels dimension as the "layers" being normalized across, and ignore batch_size for now.
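A minimal NumPy sketch of this formula (the helper name `lrn` and its defaults are mine; depth_radius plays the role of the half-width $n/2$):

```python
import numpy as np

def lrn(a, depth_radius=2, bias=1.0, alpha=1.0, beta=0.5):
    """Normalize across channels; a has shape [batch, height, width, channels]."""
    num_channels = a.shape[-1]
    out = np.empty_like(a)
    for i in range(num_channels):
        lo = max(0, i - depth_radius)                 # j = max(0, i - n/2)
        hi = min(num_channels, i + depth_radius + 1)  # up to j = min(N-1, i + n/2), inclusive
        sqr_sum = np.sum(a[..., lo:hi] ** 2, axis=-1)
        out[..., i] = a[..., i] / (bias + alpha * sqr_sum) ** beta
    return out
```

TensorFlow exposes this as tf.nn.local_response_normalization: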
```python
tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)
'''
Local Response Normalization.
The 4-D input tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by the weighted, squared sum of inputs within depth_radius.
'''
"""
input: A Tensor. Must be one of the following types: float32, half. 4-D.
depth_radius: An optional int. Defaults to 5. 0-D. Half-width of the 1-D normalization window.
bias: An optional float. Defaults to 1. An offset (usually positive to avoid dividing by 0).
alpha: An optional float. Defaults to 1. A scale factor, usually positive.
beta: An optional float. Defaults to 0.5. An exponent.
name: A name for the operation (optional).
"""
```
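For example (a sketch assuming TensorFlow 2.x eager execution; depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75 correspond to the constants $n=5$, $k=2$, $\alpha=10^{-4}$, $\beta=0.75$ used in the AlexNet paper):

```python
import tensorflow as tf

x = tf.random.normal([8, 32, 32, 64])  # e.g. the output of a conv2d layer
y = tf.nn.local_response_normalization(x, depth_radius=2, bias=2.0,
                                       alpha=1e-4, beta=0.75)
print(y.shape)  # (8, 32, 32, 64) -- the shape is unchanged
```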
"""论文地址 batch_normalization, 故名思意,就是以batch为单位进行normalization - 输入:mini_batch: In={x1,x2,..,xm}In=\{x^1,x^2,..,x^m\} - γ,β\gamma,\beta,需要学习的参数,都是向量 - ϵ\epsilon: 一个常量 - 输出: Out={y1,y2,...,ym}Out=\{y^1, y^2, ..., y^m\} 算法如下: (1)mini_batch mean:
$$\mu_{In} \leftarrow \frac{1}{m}\sum_{i=1}^m x^i$$
(2) mini-batch variance:
$$\sigma_{In}^2 = \frac{1}{m}\sum_{i=1}^m (x^i-\mu_{In})^2$$
(3) normalize:
$$\hat x^i = \frac{x^i-\mu_{In}}{\sqrt{\sigma_{In}^2 + \epsilon}}$$
(4) scale and shift:
$$y^i = \gamma\hat x^i + \beta$$
As you can see, batch_normalization does not change the dimensionality of the data at all; only the values change. $Out$ then serves as the input to the next layer.
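Here is a minimal NumPy sketch of the four steps above (the names `batch_norm` and `eps` are mine, not from the paper):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """x: mini-batch of shape [m, d]; gamma, beta: learned vectors of shape [d]."""
    mu = x.mean(axis=0)                    # (1) mini-batch mean
    var = x.var(axis=0)                    # (2) mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # (3) normalize
    return gamma * x_hat + beta            # (4) scale and shift

x = np.random.randn(32, 10)
y = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
print(y.shape)  # (32, 10) -- same shape as the input
```

Given precomputed mean and variance, TensorFlow implements steps (3) and (4) as tf.nn.batch_normalization():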
```python
def batch_normalization(x,
                        mean,
                        variance,
                        offset,
                        scale,
                        variance_epsilon,
                        name=None):
```

Args:
- x: Input Tensor of arbitrary dimensionality.
- mean: A mean Tensor.
- variance: A variance Tensor.
- offset: An offset Tensor, often denoted $\beta$ in equations, or None. If present, it will be added to the normalized tensor.
- scale: A scale Tensor, often denoted $\gamma$ in equations, or None. If present, the scale is applied to the normalized tensor.
- variance_epsilon: A small float number to avoid dividing by 0.

Now we need a function that returns the mean and the variance; see below.
```python
def moments(x, axes, shift=None, name=None, keep_dims=False):
    # For simple batch normalization pass `axes=[0]` (batch only).
```

For convolutional batch_normalization, x has shape [batch_size, height, width, depth]; with axes=[0, 1, 2], moments returns (mean, variance), where mean and variance each have shape [depth] (one value per channel).