
Hands-On Implementation of DeepFM

秋枫学习笔记
Published 2022-09-19 10:10:01

Follow us and let's keep learning together~

Basic Principles

DeepFM was proposed by Huawei in 2017 and is a classic model in recommender systems. Like Wide & Deep, it models both memorization and generalization: the generalization part is a DNN with dropout and batch normalization to improve generalization, while the memorization part is an FM that models pairwise (second-order) interactions between features. The outputs of the DNN part and the FM part are then concatenated to predict the preference score. The idea is relatively simple, so it is not covered in much detail here.

Readers who want a more detailed walkthrough can refer to https://www.jianshu.com/p/4ec8d8dcb660
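
Before the full model, here is a minimal, self-contained sketch of the sum-of-squares identity behind the FM second-order term; the shapes and random tensors are made up for illustration and are independent of the implementation below. The sum of all pairwise element-wise products equals half of (the square of the sum of the embeddings) minus (the sum of the squared embeddings), which avoids enumerating every feature pair.

import torch

# Toy setup (assumed shapes): m fields, each with an embedding of dimension k.
batch, m, k = 4, 3, 8
field_embs = [torch.randn(batch, k) for _ in range(m)]

# 0.5 * ((sum_i v_i)^2 - sum_i v_i^2) equals sum_{i<j} v_i * v_j (element-wise).
sum_square = sum(field_embs) ** 2
square_sum = sum(v * v for v in field_embs)
second_order = 0.5 * (sum_square - square_sum)   # shape (batch, k)

# Brute-force check against the explicit pairwise sum.
brute = sum(field_embs[i] * field_embs[j]
            for i in range(m) for j in range(i + 1, m))
assert torch.allclose(second_order, brute, atol=1e-5)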

Code Implementation

The code below is implemented in PyTorch, with accompanying comments; I hope it is of some help.

import torch
import torch.nn as nn


class DeepFM(nn.Module):
    def __init__(self, hidden_dims, emb_dim, dropouts, fea_nums_dict, numb_dim):
        """deepfm模型

        Args:
            hidden_dims (list): the hidden dims of dnn
            emb_dim (int): the dim of embedding
            dropouts (list): the rates of dropouts in dnn
            fea_nums_dict (dict): each category's feature nums
            numb_dim (int): the dim of numerical features
        """
        super(DeepFM, self).__init__()
        self.hidden_dims = hidden_dims
        self.emb_dim = emb_dim
        self.dropouts = dropouts + [0]
        self.fea_nums_dict = fea_nums_dict
        self.emb_layer = nn.ModuleDict()
        for fea_name in self.fea_nums_dict.keys():
            self.emb_layer[fea_name] = nn.Embedding(
                self.fea_nums_dict[fea_name], emb_dim
            )

        self.fm_first = nn.ModuleDict()
        for fea_name in self.fea_nums_dict.keys():
            self.fm_first[fea_name] = nn.Embedding(
                self.fea_nums_dict[fea_name], 1)

        input_dim = len(self.fea_nums_dict.values()) * emb_dim + numb_dim
        hidden_dims = [input_dim] + hidden_dims
        self.dnns = nn.ModuleList([nn.BatchNorm1d(input_dim)])
        for i in range(len(hidden_dims) - 1):
            self.dnns.append(nn.Linear(hidden_dims[i], hidden_dims[i + 1]))
            self.dnns.append(nn.BatchNorm1d(hidden_dims[i + 1]))
            self.dnns.append(nn.Dropout(self.dropouts[i]))
        self.predict = nn.Linear(
            hidden_dims[-1] + emb_dim + len(self.fea_nums_dict.values()), 1
        )
        self.sigmoid = nn.Sigmoid()

    def fm_layer(self, features):
        # first-order (linear) term
        emb_arr = []
        for fea_name in features:
            emb_tmp = self.fm_first[fea_name](features[fea_name])
            emb_arr.append(emb_tmp)
        first_order = torch.cat(emb_arr, 1)

        emb_arr = []
        for fea_name in features:
            emb_tmp = self.emb_layer[fea_name](features[fea_name])
            emb_arr.append(emb_tmp)
        self.embedding = torch.cat(emb_arr, 1)
        # second-order interactions via the sum-of-squares trick
        sum_square = sum(emb_arr) ** 2
        square_sum = sum([emb * emb for emb in emb_arr])
        second_order = 0.5 * (sum_square - square_sum)
        # concatenate the first-order and second-order outputs
        fm_result = torch.cat([first_order, second_order], 1)
        return fm_result

    def deep_layer(self, numb_features):
        """dnns层

        Args:
            numb_features (tensor): numerical features

        Returns:
            tensor: dnn_result
        """
        # concatenate the categorical-feature embeddings with the numerical features
        deep_emb = torch.cat([self.embedding, numb_features], 1)
        for layer in self.dnns:
            deep_emb = layer(deep_emb)
        return deep_emb

    def forward(self, numb_features, features):
        fm_result = self.fm_layer(features)
        dnn_result = self.deep_layer(numb_features)
        output = torch.cat([fm_result, dnn_result], 1)
        output = self.predict(output)
        output = self.sigmoid(output)
        return output
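
As a quick sanity check, here is a usage sketch that continues from the class above; the feature names, vocabulary sizes, and tensor shapes are made up purely for illustration. Each categorical feature is passed as a LongTensor of ids keyed by its name, and the numerical features are passed as a single float tensor.

# Hypothetical toy configuration: two categorical fields, five numerical features.
fea_nums_dict = {"user_id": 1000, "item_id": 500}   # assumed vocabulary sizes
model = DeepFM(
    hidden_dims=[64, 32],
    emb_dim=8,
    dropouts=[0.5, 0.5],
    fea_nums_dict=fea_nums_dict,
    numb_dim=5,
)
model.eval()  # deterministic forward pass (disables dropout, uses BN running stats)

batch_size = 16
features = {
    "user_id": torch.randint(0, 1000, (batch_size,)),
    "item_id": torch.randint(0, 500, (batch_size,)),
}
numb_features = torch.randn(batch_size, 5)

scores = model(numb_features, features)   # shape (batch_size, 1), values in (0, 1)
print(scores.shape)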