
UltraGCN Code Walkthrough

秋枫学习笔记
Published 2022-09-19 11:24:05
From the column: 秋枫学习笔记

Code: https://github.com/xue-pai/UltraGCN

If you are not familiar with the paper, you can read the earlier write-up first:

Paper walkthrough: CIKM'21 (Huawei) Graphs + Recommender Systems: UltraGCN, more efficient and more effective than LightGCN

  • The .ini files are the configuration files for the different datasets
  • main.py contains the main code: reading the configuration file, loading the data, the model, and the training loop
  • the data folder contains the different datasets

Computing coefficients from node degrees

import numpy as np
import torch

def get_ii_constraint_mat(train_mat, num_neighbors, ii_diagonal_zero = False):
    # train_mat is the sparse user-item interaction matrix R
    print('Computing \\Omega for the item-item graph... ')
    A = train_mat.T.dot(train_mat) # item-item co-occurrence matrix, I x I
    n_items = A.shape[0]
    res_mat = torch.zeros((n_items, num_neighbors))
    res_sim_mat = torch.zeros((n_items, num_neighbors))
    if ii_diagonal_zero:
        A[range(n_items), range(n_items)] = 0
    # degrees on the item-item graph (row/column sums of A;
    # named users_D/items_D to mirror the user-item case)
    items_D = np.sum(A, axis = 0).reshape(-1)
    users_D = np.sum(A, axis = 1).reshape(-1)

    # per Eq. 9: beta_uD = sqrt(d_u + 1) / d_u, beta_iD = 1 / sqrt(d_i + 1)
    beta_uD = (np.sqrt(users_D + 1) / users_D).reshape(-1, 1)
    beta_iD = (1 / np.sqrt(items_D + 1)).reshape(1, -1)
    # their outer product gives the beta_ui coefficients of Eq. 9
    all_ii_constraint_mat = torch.from_numpy(beta_uD.dot(beta_iD))
    # keep only the top-k weighted neighbors per item, stored densely
    for i in range(n_items):
        row = all_ii_constraint_mat[i] * torch.from_numpy(A.getrow(i).toarray()[0])
        row_sims, row_idxs = torch.topk(row, num_neighbors)
        res_mat[i] = row_idxs
        res_sim_mat[i] = row_sims
        if i % 15000 == 0:
            print('i-i constraint matrix {} ok'.format(i))

    print('Computation \\Omega OK!')
    return res_mat.long(), res_sim_mat.float()
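To make the weighting concrete, here is a minimal numpy sketch of the same computation on a made-up 3-item co-occurrence matrix (the matrix values and the choice of 2 neighbors are purely illustrative, not from the repo):

```python
import numpy as np

# Hypothetical item-item co-occurrence matrix A = R^T R (values made up)
A = np.array([[2., 1., 0.],
              [1., 3., 2.],
              [0., 2., 2.]])
deg_col = A.sum(axis=0)          # column degrees d_i
deg_row = A.sum(axis=1)          # row degrees d_u (identical here: A is symmetric)

# Eq. 9 factors: sqrt(d_u + 1) / d_u  and  1 / sqrt(d_i + 1)
beta_row = (np.sqrt(deg_row + 1) / deg_row).reshape(-1, 1)
beta_col = (1.0 / np.sqrt(deg_col + 1)).reshape(1, -1)
omega = beta_row.dot(beta_col)   # full beta_ui coefficient matrix

# keep the 2 largest weighted neighbors per item, as torch.topk does above
weighted = omega * A
idxs = np.argsort(-weighted, axis=1)[:, :2]
sims = np.take_along_axis(weighted, idxs, axis=1)
```

With degrees [3, 6, 4], the first row's weights come out as roughly [0.67, 0.25, 0], so item 0's top-2 neighbors are items 0 and 1.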

The model

import torch
import torch.nn as nn
import torch.nn.functional as F

class UltraGCN(nn.Module):
    def __init__(self, params, constraint_mat, ii_constraint_mat, ii_neighbor_mat):
        super(UltraGCN, self).__init__()
        self.user_num = params['user_num']
        self.item_num = params['item_num']
        self.embedding_dim = params['embedding_dim']
        # the four w's are fixed hyperparameters, not learnable;
        # they set the loss coefficients, and the authors keep them
        # in the config file to make tuning easier
        self.w1 = params['w1']
        self.w2 = params['w2']
        self.w3 = params['w3']
        self.w4 = params['w4']

        self.negative_weight = params['negative_weight']
        self.gamma = params['gamma']
        self.lambda_ = params['lambda']

        self.user_embeds = nn.Embedding(self.user_num, self.embedding_dim)
        self.item_embeds = nn.Embedding(self.item_num, self.embedding_dim)

        self.constraint_mat = constraint_mat
        self.ii_constraint_mat = ii_constraint_mat
        self.ii_neighbor_mat = ii_neighbor_mat

        self.initial_weight = params['initial_weight']

        self.initial_weights()

    def initial_weights(self):
        nn.init.normal_(self.user_embeds.weight, std=self.initial_weight)
        nn.init.normal_(self.item_embeds.weight, std=self.initial_weight)

    # combine the precomputed beta_ui coefficients with the preset w's
    # to get the overall per-sample loss weights; see cal_loss_L for usage
    def get_omegas(self, users, pos_items, neg_items):
        device = self.get_device()
        if self.w2 > 0:
            pos_weight = torch.mul(self.constraint_mat['beta_uD'][users], self.constraint_mat['beta_iD'][pos_items]).to(device)
            pos_weight = self.w1 + self.w2 * pos_weight
        else:
            pos_weight = self.w1 * torch.ones(len(pos_items)).to(device)

        if self.w4 > 0:
            neg_weight = torch.mul(torch.repeat_interleave(self.constraint_mat['beta_uD'][users], neg_items.size(1)), self.constraint_mat['beta_iD'][neg_items.flatten()]).to(device)
            neg_weight = self.w3 + self.w4 * neg_weight
        else:
            neg_weight = self.w3 * torch.ones(neg_items.size(0) * neg_items.size(1)).to(device)

        weight = torch.cat((pos_weight, neg_weight))
        return weight

    # loss L, computed on the user-item graph
    def cal_loss_L(self, users, pos_items, neg_items, omega_weight):
        device = self.get_device()
        # look up user and item embeddings via nn.Embedding
        user_embeds = self.user_embeds(users)
        pos_embeds = self.item_embeds(pos_items)
        neg_embeds = self.item_embeds(neg_items)
        # dot product between each user and its positive neighbors
        pos_scores = (user_embeds * pos_embeds).sum(dim=-1) # batch_size
        user_embeds = user_embeds.unsqueeze(1)
        # dot product between each user and its sampled negatives
        neg_scores = (user_embeds * neg_embeds).sum(dim=-1) # batch_size * negative_num
        # a single weighted BCE covers both L_C and L_O, since the two losses
        # use the same data: by Eqs. 12-14, L = L_O + lambda * L_C folds into
        # one weighted BCE, with lambda setting the relative importance of the
        # two terms; omega_weight holds the per-sample coefficients (see get_omegas):
        # L = -(w1 + w2*beta) * log(sigmoid(e_u e_i)) - sum_{N-} (w3 + w4*beta) * log(sigmoid(-e_u e_i'))
        neg_labels = torch.zeros(neg_scores.size()).to(device)
        neg_loss = F.binary_cross_entropy_with_logits(neg_scores, neg_labels, weight = omega_weight[len(pos_scores):].view(neg_scores.size()), reduction='none').mean(dim = -1)

        pos_labels = torch.ones(pos_scores.size()).to(device)
        pos_loss = F.binary_cross_entropy_with_logits(pos_scores, pos_labels, weight = omega_weight[:len(pos_scores)], reduction='none')
        # total user-item loss
        loss = pos_loss + neg_loss * self.negative_weight

        return loss.sum()

    # loss L_I, computed on the item-item graph
    def cal_loss_I(self, users, pos_items):
        device = self.get_device()
        # embeddings of each positive item's neighbor items, plus the user embeddings
        neighbor_embeds = self.item_embeds(self.ii_neighbor_mat[pos_items].to(device)) # len(pos_items) * num_neighbors * dim
        sim_scores = self.ii_constraint_mat[pos_items].to(device) # len(pos_items) * num_neighbors
        user_embeds = self.user_embeds(users).unsqueeze(1)
        # compute L_I
        loss = -sim_scores * (user_embeds * neighbor_embeds).sum(dim=-1).sigmoid().log()

        return loss.sum()

    # L2 regularization term
    def norm_loss(self):
        loss = 0.0
        for parameter in self.parameters():
            loss += torch.sum(parameter ** 2)
        return loss / 2

    # training: compute all losses and sum them
    def forward(self, users, pos_items, neg_items):
        omega_weight = self.get_omegas(users, pos_items, neg_items)

        loss = self.cal_loss_L(users, pos_items, neg_items, omega_weight)
        loss += self.gamma * self.norm_loss()
        loss += self.lambda_ * self.cal_loss_I(users, pos_items)
        return loss

    # testing: score every item for the given users by dot product
    # (the name "test_foward" is the repo's original spelling)
    def test_foward(self, users):
        items = torch.arange(self.item_num).to(users.device)
        user_embeds = self.user_embeds(users)
        item_embeds = self.item_embeds(items)

        return user_embeds.mm(item_embeds.t())

    def get_device(self):
        return self.user_embeds.weight.device
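The folded-together weighted BCE in cal_loss_L can be checked by hand. Below is a small numpy sketch of the same arithmetic on made-up scores (all numbers, including the pos_w/neg_w weights standing in for w1 + w2*beta and w3 + w4*beta, are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# toy batch of 2 users, each with 1 positive and 3 negatives (made-up scores)
pos_scores = np.array([2.0, 0.5])
neg_scores = np.array([[0.1, -1.0, 0.3],
                       [-0.2, 0.4, 0.0]])
pos_w = np.array([1.2, 0.9])       # stands in for w1 + w2*beta_ui
neg_w = np.ones_like(neg_scores)   # stands in for w3 + w4*beta_ui'
negative_weight = 0.5              # same role as self.negative_weight

# weighted BCE with label 1 for positives, label 0 for negatives,
# negatives averaged per user, then summed over the batch
pos_loss = -pos_w * np.log(sigmoid(pos_scores))
neg_loss = (-neg_w * np.log(1 - sigmoid(neg_scores))).mean(axis=1)
loss = (pos_loss + negative_weight * neg_loss).sum()
```

Note that a higher positive score means a smaller positive loss, and the per-user mean over negatives matches the `mean(dim=-1)` in cal_loss_L before the `negative_weight` scaling.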
Originally published 2021-12-24 on the 秋枫学习笔记 WeChat official account; shared via the Tencent Cloud self-media sync program.
