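This is actually a follow-up to an earlier post.
I am training a Word2Vec model with gensim, using hs=1, sg=0, and negative=0. After modifying the code, training takes less time, but the loss looks wrong: it increases at first and then decreases, and I don't know what is happening.
The code is as follows: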
from gensim.models.keyedvectors import KeyedVectors
from gensim.models import word2vec
import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',
                    level=logging.INFO)
sentences = word2vec.Text8Corpus("text8") # loading the corpus
from gensim.models.callbacks import CallbackAny2Vec
loss_list = []
class Callback(CallbackAny2Vec):
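    def __init__(self):
        self.epoch = 0

    def on_epoch_end(self, model):
        # Loss accumulated since the last reset (here: during this epoch).
        loss = model.get_latest_training_loss()
        loss_list.append(loss)
        print('Loss after epoch {}:{}'.format(self.epoch, loss))
        # Reset the running total so the next epoch is reported on its own.
        model.running_training_loss = 0.0
        self.epoch = self.epoch + 1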
from gensim.models import KeyedVectors,word2vec,Word2Vec
import time
start_time = time.time()
model = word2vec.Word2Vec(sentences, hs=1, sg=0, negative=0, compute_loss=True, epochs=30, callbacks=[Callback()])
end_time = time.time()
print('Running time: %s seconds' % (end_time - start_time))
The code was actually written in Jupyter, as shown in the screenshot:
The output is as follows:
More details about the output:
Loss after epoch 0:39370848.0
Loss after epoch 1:43579636.0
Loss after epoch 2:45213772.0
Loss after epoch 3:46132356.0
Loss after epoch 4:46788412.0
Loss after epoch 5:47218508.0
Loss after epoch 6:47553520.0
Loss after epoch 7:47793332.0
Loss after epoch 8:47995616.0
Loss after epoch 9:48134664.0
Loss after epoch 10:48224960.0
Loss after epoch 11:48326640.0
Loss after epoch 12:48371072.0
Loss after epoch 13:48405980.0
Loss after epoch 14:48437804.0
Loss after epoch 15:48417612.0
Loss after epoch 16:48415112.0
Loss after epoch 17:48396260.0
Loss after epoch 18:48349064.0
Loss after epoch 19:48301088.0
Loss after epoch 20:48247328.0
Loss after epoch 21:48167340.0
Loss after epoch 22:48053500.0
Loss after epoch 23:47937300.0
Loss after epoch 24:47810964.0
Loss after epoch 25:47669088.0
Loss after epoch 26:47500524.0
Loss after epoch 27:47300488.0
Loss after epoch 28:47044920.0
Loss after epoch 29:46747080.0
Running time: 259.9046218395233 seconds
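To make the shape of the log above easier to see, the reported values can be differenced epoch by epoch. This is only a small sketch: the numbers are copied from the output above, and the helper name `summarize_losses` is made up for illustration. It assumes each printed value is the loss for a single epoch, which is what the reset in the callback is meant to achieve.

```python
# Per-epoch losses copied from the log above (epochs 0 to 29).
losses = [
    39370848.0, 43579636.0, 45213772.0, 46132356.0, 46788412.0,
    47218508.0, 47553520.0, 47793332.0, 47995616.0, 48134664.0,
    48224960.0, 48326640.0, 48371072.0, 48405980.0, 48437804.0,
    48417612.0, 48415112.0, 48396260.0, 48349064.0, 48301088.0,
    48247328.0, 48167340.0, 48053500.0, 47937300.0, 47810964.0,
    47669088.0, 47500524.0, 47300488.0, 47044920.0, 46747080.0,
]


def summarize_losses(values):
    """Return (peak_epoch, deltas), where deltas[i] = values[i + 1] - values[i]."""
    deltas = [later - earlier for earlier, later in zip(values, values[1:])]
    peak_epoch = max(range(len(values)), key=values.__getitem__)
    return peak_epoch, deltas


peak_epoch, deltas = summarize_losses(losses)
print("peak epoch:", peak_epoch)
print("transitions where loss rose:", sum(d > 0 for d in deltas))
print("transitions where loss fell:", sum(d < 0 for d in deltas))
```

For this log the reported loss peaks at epoch 14 and falls in every epoch after that.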
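Posted on 2022-10-01 23:28:01
I would not expect this up-then-down pattern; I think the usual SGD optimization typically achieves ever-lower loss from the very beginning.
However, if the final vectors still perform well, I would not worry too much about surprises in secondary progress indicators such as the loss figures, for the following reasons: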
https://stackoverflow.com/questions/73891182
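One way to act on the advice above is to look at the trained vectors directly rather than at the loss. The sketch below is not from the original answer: `spot_check` is a hypothetical helper, the probe words are arbitrary, and it assumes gensim 4.x, where a `KeyedVectors` object exposes `key_to_index` and `most_similar`.

```python
def spot_check(wv, probe_words, topn=5):
    """Return {word: [(neighbor, similarity), ...]} for probes found in the vocabulary.

    `wv` is expected to behave like gensim's KeyedVectors (e.g. `model.wv`).
    Probe words missing from the vocabulary are skipped.
    """
    report = {}
    for word in probe_words:
        if word in wv.key_to_index:
            report[word] = wv.most_similar(word, topn=topn)
    return report


# Usage with the model trained in the question (probe words are arbitrary):
#   for word, neighbors in spot_check(model.wv, ["king", "paris", "computer"]).items():
#       print(word, neighbors)
```

If the nearest neighbors look sensible for a handful of probe words, that is some evidence the vectors are usable even though the loss curve looks odd. It is a quick sanity check, not a substitute for evaluating on the downstream task.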