I'm currently learning how to apply deep learning to NLP, mainly by reading papers and blog posts and taking notes along the way to help my understanding. I'm not past the beginner stage yet and only half-understand many things. I'm posting these notes partly so I can look them up myself, and partly in the hope that others will point out my mistakes and help me get up to speed.
Original paper: Convolutional Neural Networks for Sentence Classification
Source code: https://github.com/dennybritz/cnn-text-classification-tf
Original blog post: http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
sequence_length: the sentence length; every sentence is padded to 59 words
num_classes: the number of output classes, here two (positive and negative)
vocab_size: the vocabulary size, needed to define the embedding layer
embedding_size: the dimensionality of the embeddings
filter_sizes: the heights of the convolution filters
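As a quick sketch of what padding to sequence_length means, here is a minimal pure-Python example; the sentences and the "<PAD>" token are made up for illustration (the repo's own preprocessing helpers handle this differently):

```python
# Pad every tokenized sentence to a fixed sequence_length (59 in this post)
# so all inputs have the same shape.
sequence_length = 59
sentences = [["the", "movie", "was", "great"],
             ["terrible", "acting"]]

# Append "<PAD>" tokens until each sentence reaches sequence_length.
padded = [s + ["<PAD>"] * (sequence_length - len(s)) for s in sentences]

assert all(len(s) == sequence_length for s in padded)
```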
name_scope: puts all the operations under a top-level node named "embedding", useful for visualizing the network graph (e.g. in TensorBoard)
W is the embedding matrix that we learn during training, initialized from a random uniform distribution
tf.nn.embedding_lookup is the actual embedding operation; its result is a 3-D tensor of shape [None, sequence_length, embedding_size]
Because the conv2d convolution op expects a 4-D tensor, we add a channel dimension of size 1 to the embedding result, giving [None, sequence_length, embedding_size, 1]
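To see the shape flow through this step, here is a NumPy sketch (the sizes are made up for illustration; the real code uses tf.random_uniform, tf.nn.embedding_lookup, and tf.expand_dims):

```python
import numpy as np

# Hypothetical sizes, chosen only to illustrate the shapes.
vocab_size, embedding_size, sequence_length, batch = 1000, 128, 59, 4

# W: the embedding matrix, initialized from a random uniform distribution,
# mirroring tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0).
W = np.random.uniform(-1.0, 1.0, (vocab_size, embedding_size))

# input_x: padded word-id sequences, shape [batch, sequence_length].
input_x = np.random.randint(0, vocab_size, (batch, sequence_length))

# tf.nn.embedding_lookup(W, input_x) is essentially row indexing:
embedded_chars = W[input_x]            # [batch, sequence_length, embedding_size]

# conv2d needs a 4-D tensor, so add a channel dimension of size 1,
# mirroring tf.expand_dims(embedded_chars, -1):
embedded_chars_expanded = embedded_chars[..., np.newaxis]
print(embedded_chars_expanded.shape)   # (4, 59, 128, 1)
```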
Convolution and max-pooling
We build a separate convolution layer for each filter size. W is the filter (weight) matrix of the convolution, and h is the result of applying ReLU to the convolution output.
"VALID" padding means that we slide the filter over the sentence without padding the edges, performing a narrow convolution that gives us an output of shape [1, sequence_length - filter_size + 1, 1, 1]. Performing max-pooling over the output of a specific filter size leaves us with a tensor of shape [batch_size, 1, 1, num_filters]. This is essentially a feature vector, where the last dimension corresponds to our features. Once we have all the pooled output tensors from each filter size, we combine them into one long feature vector of shape [batch_size, num_filters_total]. Using -1 in tf.reshape tells TensorFlow to flatten that dimension when possible.
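The shape flow above can be sketched in NumPy without TensorFlow; this is only an illustration of narrow convolution, ReLU, max-pooling, and the final concatenation/reshape, with made-up sizes (the real code uses tf.nn.conv2d and tf.nn.max_pool):

```python
import numpy as np

# Hypothetical sizes for illustration only.
batch, sequence_length, embedding_size = 2, 59, 128
filter_sizes, num_filters = [3, 4, 5], 100

# One batch of embedded sentences.
x = np.random.randn(batch, sequence_length, embedding_size)

pooled_outputs = []
for filter_size in filter_sizes:
    # Each filter spans filter_size words across the full embedding width.
    W = np.random.randn(filter_size, embedding_size, num_filters)
    out_len = sequence_length - filter_size + 1  # narrow ("VALID") convolution
    conv = np.empty((batch, out_len, num_filters))
    for i in range(out_len):
        window = x[:, i:i + filter_size, :]      # [batch, filter_size, embedding_size]
        conv[:, i, :] = np.tensordot(window, W, axes=([1, 2], [0, 1]))
    h = np.maximum(conv, 0.0)                    # ReLU
    # Max-pool over all positions: one value per filter.
    pooled = h.max(axis=1)                       # [batch, num_filters]
    pooled_outputs.append(pooled)

# Concatenate the pooled outputs of every filter size and flatten,
# mirroring tf.reshape(h_pool, [-1, num_filters_total]).
num_filters_total = num_filters * len(filter_sizes)
h_pool_flat = np.concatenate(pooled_outputs, axis=1).reshape(batch, -1)
print(h_pool_flat.shape)   # (2, 300)
```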
session.as_default(): makes this session the default session, so that operations such as Tensor.eval() and Operation.run() use it without the session being passed explicitly.