I am trying to understand the NCE loss function in TensorFlow. NCE loss is used for word2vec tasks, for example:
# Look up embeddings for inputs.
embeddings = tf.Variable(
    tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
embed = tf.nn.embedding_lookup(embeddings, train_inputs)

# Construct the variables for the NCE loss.
nce_weights = tf.Variable(
    tf.truncated_normal([vocabulary_size, embedding_size],
                        stddev=1.0 / math.sqrt(embedding_size)))
nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

# Compute the average NCE loss for the batch.
# tf.nn.nce_loss automatically draws a new sample of the negative labels
# each time we evaluate the loss.
loss = tf.reduce_mean(
    tf.nn.nce_loss(weights=nce_weights,
                   biases=nce_biases,
                   labels=train_labels,
                   inputs=embed,
                   num_sampled=num_sampled,
                   num_classes=vocabulary_size))
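To make the roles of the two matrices concrete, here is a minimal NumPy sketch of what `tf.nn.nce_loss` computes for a single (center word, context word) pair. All names and sizes are toy assumptions; the sketch uses plain sigmoid cross-entropy on one true pair and `num_sampled` noise pairs, and omits the log-uniform sampler's probability correction that the real implementation applies to the logits:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen only for illustration.
vocabulary_size, embedding_size = 10, 4
num_sampled = 3

# Parameters mirroring the graph above.
embeddings = rng.uniform(-1.0, 1.0, (vocabulary_size, embedding_size))  # input matrix
nce_weights = rng.normal(0.0, 1.0 / np.sqrt(embedding_size),
                         (vocabulary_size, embedding_size))             # output matrix
nce_biases = np.zeros(vocabulary_size)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss_one_example(center_id, true_context_id):
    """Binary logistic loss on the true pair plus num_sampled noise pairs."""
    embed = embeddings[center_id]  # one row of the input matrix

    # Draw noise (negative) labels; tf.nn.nce_loss resamples these each step.
    noise_ids = rng.choice(vocabulary_size, size=num_sampled, replace=False)

    # Each logit is a dot product of an input row with an output row, plus a bias.
    true_logit = embed @ nce_weights[true_context_id] + nce_biases[true_context_id]
    noise_logits = nce_weights[noise_ids] @ embed + nce_biases[noise_ids]

    # The true pair should be classified as 1, the noise pairs as 0.
    return (-np.log(sigmoid(true_logit))
            - np.sum(np.log(1.0 - sigmoid(noise_logits))))

loss = nce_loss_one_example(center_id=2, true_context_id=5)
print(loss)
```

Note how `embeddings` is only ever indexed by the center word and `nce_weights` only by the context/noise words, which is what makes them the input and output matrices respectively.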
For more details, please refer to TensorFlow's word2vec_basic.py.
In the word2vec model, we are interested in building representations for words. During training, given a sliding window, every word has two embeddings: 1) when the word is the center word; 2) when the word is a context word. These two embeddings are called the input vector and the output vector, respectively. (more explanations of input and output matrices)
In my opinion, the input matrix is `embeddings` and the output matrix is `nce_weights`. Is that correct?
According to s0urcer's post, which is also about `nce_loss`, the final embedding matrix is just the input matrix. However, some others say `final_embedding = input_matrix + output_matrix`. Which one is correct, or more common?
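The two conventions in question can be written out in a few lines. This is only a sketch of the two options, not a claim about which is better; the matrix names are the toy ones from above. Keeping only the input matrix is what word2vec_basic.py does (it normalizes and returns `embeddings`); summing or averaging the two matrices is the convention popularized by GloVe:

```python
import numpy as np

rng = np.random.default_rng(0)
vocabulary_size, embedding_size = 10, 4

input_matrix = rng.uniform(-1.0, 1.0, (vocabulary_size, embedding_size))   # embeddings
output_matrix = rng.normal(size=(vocabulary_size, embedding_size))         # nce_weights

# Option 1 (word2vec_basic.py, s0urcer's answer): discard the output matrix.
final_embeddings_a = input_matrix

# Option 2 (GloVe-style): combine the two matrices, e.g. by summing.
final_embeddings_b = input_matrix + output_matrix

# Either way the result has one row per vocabulary word.
print(final_embeddings_a.shape, final_embeddings_b.shape)
```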
https://stackoverflow.com/questions/41475180