# Linear Classification (Part 2)

In the loss function, the data (x_i, y_i) are given and fixed, but we have control over the weights W. The goal is to set W so that the predicted class scores are consistent with the ground-truth labels, i.e. so that the loss is as small as possible.

## Multiclass Support Vector Machine Loss

The Multiclass SVM loss wants the score of the correct class to be higher than the scores of the incorrect classes by at least a fixed margin Δ. We can think of the loss function as a person: this Mr. (or Ms.) SVM has a particular taste in outcomes, and if some outcome yields a lower loss value, the SVM likes it more.

Let us be more precise. Recall that the i-th example consists of the image pixels x_i and the label y_i. The score function takes the pixel data as input and computes the vector of class scores via the formula f(x_i, W), which we abbreviate as s. For example, the score for the j-th class is the j-th element: s_j = f(x_i, W)_j. The Multiclass SVM loss for the i-th example is then defined as:

L_i = Σ_{j ≠ y_i} max(0, s_j − s_{y_i} + Δ)

**Example.** Let us walk through how the formula is computed with an example. Suppose there are three classes and we obtain the scores s = [13, −7, 11], where the first class is the correct one, i.e. y_i = 0. Also suppose Δ = 10. The formula sums over all incorrect classes (j ≠ y_i), so we get two terms:

L_i = max(0, −7 − 13 + 10) + max(0, 11 − 13 + 10)

The first term evaluates to max(0, −10) = 0, because the correct class score (13) exceeds the incorrect score (−7) by more than the margin of 10. The second term evaluates to max(0, 8) = 8: the correct class score is higher than 11, but not by at least the margin, so a loss of 8 is accumulated.

In short, the SVM loss wants the score of the correct class y_i to be higher than the scores of all incorrect classes, and higher by at least Δ. Whenever this is not the case, loss accumulates.
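As a concrete check, the two-term SVM loss for the toy scores s = [13, −7, 11] with correct class y_i = 0 and Δ = 10 can be computed directly; the numbers below are illustrative example values:

```python
# Illustrative check of the two-term multiclass SVM loss computation.
scores = [13, -7, 11]
y = 0        # index of the correct class
delta = 10   # margin hyperparameter Delta

# sum hinge terms over all incorrect classes j != y
loss = sum(max(0, scores[j] - scores[y] + delta)
           for j in range(len(scores)) if j != y)
print(loss)  # → 8
```

The first term is clamped to 0 and the second contributes 8, matching the hand calculation.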

**Regularization.** The loss function above has a flaw. Suppose we have a dataset and a set of weights W that correctly classifies every example (i.e. all margins are satisfied and L_i = 0 for all i). The problem is that this W is not unique: there may be many similar W that also classify all the data correctly. A simple example: if W correctly classifies all data, so that the loss on every example is 0, then any multiple λW with λ > 1 also gives zero loss, because the scaling uniformly stretches all score magnitudes, and hence their absolute differences, so every margin remains satisfied.

We want to remove this ambiguity by expressing a preference for a particular set of weights. We do so by extending the loss function with a regularization penalty R(W); a common choice is the L2 penalty, R(W) = Σ_k Σ_l W_{k,l}². The full loss then consists of two parts: the data loss, which is the average of the per-example losses L_i over all examples, and the regularization loss (regularization loss). The full form is:

L = (1/N) Σ_i L_i + λ R(W)

where N is the number of training examples and λ is a hyperparameter weighing the regularization penalty.
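As a small numeric illustration of the two-part loss, consider the following sketch; the per-example losses, weight matrix, and λ are made-up toy values:

```python
import numpy as np

# Full loss = data loss (mean of per-example SVM losses L_i)
#           + regularization loss (lambda * L2 penalty on W).
# All numbers below are illustrative, not from any real model.
L_i = np.array([8.0, 0.0, 2.5])          # toy per-example SVM losses
W = np.array([[0.2, -0.5], [0.1, 2.0]])  # toy weight matrix
lam = 0.1                                # regularization strength lambda

data_loss = np.mean(L_i)        # (1/N) * sum_i L_i
reg_loss = lam * np.sum(W * W)  # lambda * R(W), with the L2 penalty
L = data_loss + reg_loss
```

Here data_loss = 3.5 and reg_loss = 0.1 × 4.3 = 0.43, so L = 3.93.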

```python
import numpy as np

def L_i(x, y, W):
  """
  unvectorized version. Compute the multiclass svm loss for a single example (x, y)
  - x is a column vector representing an image (e.g. 3073 x 1 in CIFAR-10)
    with an appended bias dimension in the 3073-rd position (i.e. bias trick)
  - y is an integer giving index of correct class (e.g. between 0 and 9 in CIFAR-10)
  - W is the weight matrix (e.g. 10 x 3073 in CIFAR-10)
  """
  delta = 1.0 # see notes about delta later in this section
  scores = W.dot(x) # scores becomes of size 10 x 1, the scores for each class
  correct_class_score = scores[y]
  D = W.shape[0] # number of classes, e.g. 10
  loss_i = 0.0
  for j in range(D): # iterate over all wrong classes
    if j == y:
      # skip the true class to only loop over incorrect classes
      continue
    # accumulate loss for the i-th example
    loss_i += max(0, scores[j] - correct_class_score + delta)
  return loss_i

def L_i_vectorized(x, y, W):
  """
  A faster half-vectorized implementation. half-vectorized
  refers to the fact that for a single example the implementation contains
  no for loops, but there is still one loop over the examples (outside this function)
  """
  delta = 1.0
  scores = W.dot(x)
  # compute the margins for all classes in one vector operation
  margins = np.maximum(0, scores - scores[y] + delta)
  # on y-th position scores[y] - scores[y] canceled and gave delta. We want
  # to ignore the y-th position and only consider margin on max wrong class
  margins[y] = 0
  loss_i = np.sum(margins)
  return loss_i

def L(X, y, W):
  """
  fully-vectorized implementation :
  - X holds all the training examples as columns (e.g. 3073 x 50,000 in CIFAR-10)
  - y is array of integers specifying correct class (e.g. 50,000-D array)
  - W are weights (e.g. 10 x 3073)
  """
  # evaluate loss over all examples in X without using any for loops
  # left as exercise to reader in the assignment
```
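One way the fully-vectorized version could look is sketched below. It is one possible implementation under the column-layout convention from the docstring (X of shape 3073 × N, W of shape 10 × 3073), not necessarily the intended assignment solution:

```python
import numpy as np

def svm_loss_vectorized(X, y, W, delta=1.0):
  """One possible fully-vectorized sketch (not the official solution).
  X: (D, N) data as columns; y: (N,) correct class indices; W: (C, D) weights."""
  num_train = X.shape[1]
  scores = W.dot(X)                          # (C, N): scores for all examples at once
  correct = scores[y, np.arange(num_train)]  # (N,): correct-class score per example
  # margins for every (class, example) pair, clamped at zero
  margins = np.maximum(0, scores - correct + delta)
  # zero out the correct-class positions, which contribute delta otherwise
  margins[y, np.arange(num_train)] = 0
  return np.sum(margins) / num_train         # average data loss over examples
```

On the toy example from earlier (scores [13, −7, 11], y = 0, Δ = 10, one example) this returns 8.0, agreeing with the per-example loop.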

## Practical Considerations

**Setting Δ.** What value should the hyperparameter Δ take? Does it need to be tuned by cross-validation? It turns out that setting Δ = 1.0 is safe in virtually all cases. The hyperparameters Δ and λ look like two different hyperparameters, but in fact they control the same tradeoff: the tradeoff between the data loss and the regularization loss. The key to understanding this is that the magnitude of the weights W has a direct effect on the scores (and, of course, on their differences): when we shrink the values in W, the score differences shrink too, and vice versa. Therefore, the exact value of the margin between scores (e.g. Δ = 1 or Δ = 100) is in some sense meaningless, because the weights themselves can stretch or shrink the differences. In other words, the real tradeoff is how large we allow the weights to grow (controlled through the regularization strength λ).
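The scale argument above can be checked numerically: scaling W by a constant c scales every score difference by c. A small sketch with random weights, shapes borrowed from the CIFAR-10 example:

```python
import numpy as np

# Demonstration that scaling W rescales all score differences, which is why
# the absolute value of the margin Delta is not meaningful on its own.
# Shapes follow the CIFAR-10 example (10 classes, 3073-dim inputs); data is random.
np.random.seed(0)
W = np.random.randn(10, 3073)
x = np.random.randn(3073, 1)
c = 2.0

scores = W.dot(x)
scaled_scores = (c * W).dot(x)

diffs = scores - scores[0]                # score differences relative to class 0
scaled_diffs = scaled_scores - scaled_scores[0]
# scaled_diffs equals c * diffs (up to floating-point error)
```

So any margin satisfied with Δ = 1 under W is satisfied with Δ = 100 under 100·W; only the regularization on W pins down the scale.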
