AI-Based NSFW Image Detection with TensorFlow

sparkexpert

Published 2018-01-09 11:53:32

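Yahoo has open-sourced a deep neural network for detecting whether an image contains not-suitable-for-work (NSFW) content: https://github.com/yahoo/open_nsfw. The GitHub repository includes the Caffe model and its code. Detecting offensive or adult images is a hard problem that researchers have worked on for decades. With advances in computer vision and deep learning, the algorithms have matured, and Yahoo's model can identify pornographic images with higher accuracy. What counts as NSFW is quite subjective: something one person finds objectionable may not bother another. Yahoo's network therefore focuses on only one type of NSFW content, namely pornographic images. I came across a TensorFlow implementation online and tested it.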

1. How it works (source: https://yahooeng.tumblr.com/post/151148689421/open-sourcing-a-deep-learning-solution-for)

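The authors' technical write-up shows that the model is essentially a CNN image classifier applied to a binary classification problem (pornographic or not).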

Training a deep neural network for NSFW classification

We train the models using a dataset of positive (i.e. NSFW) images and negative (i.e. SFW – suitable/safe for work) images. We are not releasing the training images or other details due to the nature of the data, but instead we open source the output model which can be used for classification by a developer.
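
As a rough illustration of what a two-class (SFW/NSFW) output looks like, the sketch below turns two raw scores into probabilities, assuming a softmax-style output layer. The function name and the logit values are made up for illustration; the quoted post does not describe the output layer at this level of detail.

```python
import math


def two_class_softmax(logits):
    """Convert two raw scores into probabilities that sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one image, in the order (SFW, NSFW).
sfw_score, nsfw_score = two_class_softmax([1.2, -0.3])
print("SFW: {:.3f}  NSFW: {:.3f}".format(sfw_score, nsfw_score))
```

Because the two probabilities always sum to 1, reporting either score alone carries the same information as reporting both.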

The passage below also shows that residual networks (ResNet) were used.

While training, the images were resized to 256x256 pixels, horizontally flipped for data augmentation, and randomly cropped to 224x224 pixels, and were then fed to the network. For training residual networks, we used scale augmentation as described in the ResNet paper [1], to avoid overfitting. We evaluated various architectures to experiment with tradeoffs of runtime vs accuracy.

MS_CTC [4] – This architecture was proposed in Microsoft’s constrained time cost paper. It improves on top of AlexNet in terms of speed and accuracy maintaining a combination of convolutional and fully-connected layers.
Squeezenet [3] – This architecture introduces the fire module which contain layers to squeeze and then expand the input data blob. This helps to save the number of parameters keeping the Imagenet accuracy as good as AlexNet, while the memory requirement is only 6MB.
VGG [2] – This architecture has 13 conv layers and 3 FC layers.
GoogLeNet [5] – GoogLeNet introduces inception modules and has 20 convolutional layer stages. It also uses hanging loss functions in intermediate layers to tackle the problem of diminishing gradients for deep networks.
ResNet-50 [1] – ResNets use shortcut connections to solve the problem of diminishing gradients. We used the 50-layer residual network released by the authors.
ResNet-50-thin – The model was generated using our pynetbuilder tool and replicates the Residual Network paper’s 50-layer network (with half number of filters in each layer). You can find more details on how the model was generated and trained here.
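
To get a feel for what "half number of filters in each layer" means for model size, the sketch below counts convolution weights in one bottleneck block (1x1, 3x3, 1x1) at full and at half width. The channel sizes follow the first-stage bottleneck in the ResNet paper; the helper names are hypothetical, and biases and batch-norm parameters are ignored.

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def bottleneck_weights(c_io, c_mid):
    """Weights in a 1x1 -> 3x3 -> 1x1 bottleneck with c_io input/output channels."""
    return (conv_weights(1, c_io, c_mid)
            + conv_weights(3, c_mid, c_mid)
            + conv_weights(1, c_mid, c_io))


full = bottleneck_weights(256, 64)  # full-width block
thin = bottleneck_weights(128, 32)  # every layer has half as many filters
print(full, thin, full / thin)
```

Halving the filters halves both the input and the output channels of each convolution, so the weight count of the block drops by roughly a factor of four rather than two.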

2. Experiment:

I put some images found online into a folder and iterated over that folder so that every file in it is classified.

import os

# Excerpt: args, sess, model and fn_load_image are defined elsewhere in the script.
print('Loading test images...')
for name in os.listdir(args.input_file):
    path = os.path.join(args.input_file, name)
    if os.path.splitext(path)[1] == '.jpg':
        print(path)
        # Load the image
        image = fn_load_image(path)
        # Run the classifier
        predictions = sess.run(model.predictions,
                               feed_dict={model.input: image})

        print("Result for image '{}':".format(path))
        print("\tSFW score:\t{}\n\tNSFW score:\t{}".format(*predictions[0]))
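
The loop above only picks up files whose extension is exactly `.jpg`, so names such as `photo.JPG` or `photo.jpeg` are silently skipped. A minimal sketch of a more forgiving file filter follows; `list_images` and `IMAGE_EXTENSIONS` are hypothetical names, and the extension set is an assumption that should match whatever the image loader actually supports.

```python
import os

# Assumed set of extensions; adjust to whatever the image loader supports.
IMAGE_EXTENSIONS = {'.jpg', '.jpeg'}


def list_images(folder, extensions=IMAGE_EXTENSIONS):
    """Return sorted paths of regular files in `folder` with an image extension."""
    paths = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        # Compare extensions case-insensitively and skip sub-directories.
        ext = os.path.splitext(name)[1].lower()
        if os.path.isfile(path) and ext in extensions:
            paths.append(path)
    return sorted(paths)
```

The resulting list could then replace the `os.listdir` call and the extension check in the loop, for example `for path in list_images(args.input_file): ...`.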

The results are as follows:

../data/3.jpg: Oddly, the two scores for this image are roughly equal, which suggests that the training dataset consists mainly of images of women.
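
When the two scores are close, as in this case, a hard yes/no answer hides the ambiguity. One possible way to surface it is to bin the NSFW score into three bands. The cut-offs 0.2 and 0.8 below are illustrative assumptions, not values from this post, and would need tuning on your own data; `bucket_nsfw_score` is a hypothetical helper.

```python
def bucket_nsfw_score(nsfw_score, low=0.2, high=0.8):
    """Map an NSFW probability to a coarse label (thresholds are assumptions)."""
    if not 0.0 <= nsfw_score <= 1.0:
        raise ValueError("nsfw_score must be a probability in [0, 1]")
    if nsfw_score < low:
        return 'likely SFW'
    if nsfw_score > high:
        return 'likely NSFW'
    # Scores in the middle band are ambiguous and may need manual review.
    return 'uncertain'


# A near 50/50 result falls into the middle band.
print(bucket_nsfw_score(0.5))
```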