
Implementations of the Logistic Regression Model and Its Variants

Logistic regression, also known as logistic regression analysis, is a generalized linear model commonly used in data mining, automatic disease diagnosis, economic forecasting, and similar fields. What the logistic-family models covered here have in common is that each of them contains a variant of the logistic loss, as shown in Equation 1.
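
Equation 1 is reconstructed here from the cal_loss implementations that follow, assuming labels $y_i \in \{0, 1\}$ and a weight vector $w$ with no bias term:

$$\mathcal{L}(w) = \frac{1}{n}\sum_{i=1}^{n}\Big[\log\big(1 + e^{x_i^{\top} w}\big) - y_i\, x_i^{\top} w\Big] \tag{1}$$

Its gradient, used by every cal_grad method below, is $\nabla\mathcal{L}(w) = \frac{1}{n}X^{\top}\big(\sigma(Xw) - y\big)$, where $\sigma$ denotes the sigmoid function.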

These models differ mainly in their loss functions.

For each model, the loss and the corresponding Python code are given below.

Logistic Regression

Python

# Name: logistic_regression
# Author: Reacubeth
# Time: 2021/5/9 14:36
# Mail: noverfitting@gmail.com
# Site: www.omegaxyz.com
# *_*coding:utf-8 *_*
 
import numpy as np
 
 
class LogisticRegression:
    def __init__(self, name, batch_size, learning_rate, max_iter, optimizer):
        super().__init__()
        self.W = None
        self.name = name
        self.batch_size = batch_size
        self.learning_rate = learning_rate
        self.max_iter = max_iter
        self.optimizer = optimizer.lower()
 
    def fit(self, feature, label, verbose=False):
        num, dim = feature.shape
        self.W = np.random.normal(0, 1, (dim, ))
 
        for t in range(self.max_iter):
            if self.optimizer == 'sgd':
                rand_pos = np.random.choice(num, self.batch_size)
                loss = self.cal_loss(feature[rand_pos, :], label[rand_pos])
                grad = self.cal_grad(feature[rand_pos, :], label[rand_pos])
            else:
                loss = self.cal_loss(feature, label)
                grad = self.cal_grad(feature, label)
            if verbose:
                print(self.name, '@epoch', t, ' loss: ', loss)
            self.W = self.W - self.learning_rate * grad
 
    def predict(self, feature, probability=False):
        if probability:
            return 1 / (1 + np.exp(-np.dot(feature, self.W)))
        else:
            return (1 / (1 + np.exp(-np.dot(feature, self.W))) > 0.5).astype(int)
 
    def score(self, feature, label):
        pred_label = (1 / (1 + np.exp(-np.dot(feature, self.W))) > 0.5).astype(int)
        return np.where(pred_label == label)[0].shape[0] / feature.shape[0]
 
    def cal_loss(self, feature, label):
        # Negative log-likelihood for labels in {0, 1}, averaged over the batch:
        # (1/n) * sum_i [log(1 + exp(x_i^T w)) - y_i * x_i^T w]
        num, dim = feature.shape
        return (np.sum(np.log(1 + np.exp(np.dot(feature, self.W)))) - np.dot(label, np.dot(feature, self.W))) / num

    def cal_grad(self, feature, label):
        # Gradient of the loss above: (1/n) * X^T (sigmoid(Xw) - y)
        num, dim = feature.shape
        return np.dot(feature.T, 1.0 / (1.0 + np.exp(-np.dot(feature, self.W))) - label) / num

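The classes in this post share the same interface (fit, predict, score). As a quick sanity check, the sketch below trains LogisticRegression on a small synthetic dataset; the data, hyper-parameters, and variable names are illustrative assumptions, not part of the original code.

Python

# A minimal usage sketch. The synthetic data and hyper-parameters are assumptions
# chosen only for illustration.
import numpy as np

np.random.seed(0)
num, dim = 200, 5
X = np.random.normal(0, 1, (num, dim))
true_w = np.random.normal(0, 1, (dim,))
y = (1 / (1 + np.exp(-np.dot(X, true_w))) > 0.5).astype(int)

clf = LogisticRegression(name='lr', batch_size=32, learning_rate=0.1,
                         max_iter=500, optimizer='sgd')
clf.fit(X, y, verbose=False)
print('train accuracy:', clf.score(X, y))
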
Ridge Logistic Regression

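The ridge variant adds an L2 penalty to the logistic loss. As implemented in cal_loss below, the penalty is scaled by $\lambda/2$ and, like the data term, divided by the batch size $n$:

$$\mathcal{L}_{\text{ridge}}(w) = \frac{1}{n}\left(\sum_{i=1}^{n}\Big[\log\big(1 + e^{x_i^{\top} w}\big) - y_i\, x_i^{\top} w\Big] + \frac{\lambda}{2}\lVert w\rVert_2^2\right)$$

Accordingly, cal_grad adds $\lambda w$ to the logistic gradient before dividing by $n$.
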
Python

# Name: ridge_logistic_regression
# Author: Reacubeth
# Time: 2021/5/9 19:36
# Mail: noverfitting@gmail.com
# Site: www.omegaxyz.com
# *_*coding:utf-8 *_*
 
import numpy as np
from numpy import linalg
 
 
class RidgeLogisticRegression:
    def __init__(self, name, batch_size, learning_rate, max_iter, optimizer, lambda_):
        super().__init__()
        self.W = None
        self.name = name
        self.batch_size = batch_size
        self.learning_rate = learning_rate
        self.max_iter = max_iter
        self.optimizer = optimizer.lower()
        self.lambda_ = lambda_
 
    def fit(self, feature, label, verbose=False):
        num, dim = feature.shape
        self.W = np.random.normal(0, 1, (dim, ))
 
        for t in range(self.max_iter):
            if self.optimizer == 'sgd':
                rand_pos = np.random.choice(num, self.batch_size)
                loss = self.cal_loss(feature[rand_pos, :], label[rand_pos])
                grad = self.cal_grad(feature[rand_pos, :], label[rand_pos])
            else:
                loss = self.cal_loss(feature, label)
                grad = self.cal_grad(feature, label)
            if verbose:
                print(self.name, '@epoch', t, ' loss: ', loss)
            self.W = self.W - self.learning_rate * grad
 
    def predict(self, feature, probability=False):
        if probability:
            return 1 / (1 + np.exp(-np.dot(feature, self.W)))
        else:
            return (1 / (1 + np.exp(-np.dot(feature, self.W))) > 0.5).astype(int)
 
    def score(self, feature, label):
        pred_label = (1 / (1 + np.exp(-np.dot(feature, self.W))) > 0.5).astype(int)
        return np.where(pred_label == label)[0].shape[0] / feature.shape[0]
 
    def cal_loss(self, feature, label):
        num, dim = feature.shape
        logistic_loss = np.sum(np.log(1 + np.exp(np.dot(feature, self.W)))) - np.dot(label, np.dot(feature, self.W))
        # yxw = np.dot(label, np.dot(feature, self.W))
        # return (np.sum(np.log(1 + np.exp(-yxw))) + self.lambda_ / 2 * linalg.norm(self.W) ** 2) / num
        return (logistic_loss + self.lambda_ / 2 * linalg.norm(self.W) ** 2) / num
 
    def cal_grad(self, feature, label):
        num, dim = feature.shape
        return (np.dot(feature.T, 1.0 / (1.0 + np.exp(-np.dot(feature, self.W))) - label) + self.lambda_ * self.W) / num

Lasso Logistic Regression

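The lasso variant replaces the L2 penalty with an L1 penalty, which drives some weights to zero. As implemented in cal_loss below, the objective is

$$\mathcal{L}_{\text{lasso}}(w) = \frac{1}{n}\left(\sum_{i=1}^{n}\Big[\log\big(1 + e^{x_i^{\top} w}\big) - y_i\, x_i^{\top} w\Big] + \lambda\lVert w\rVert_1\right)$$

Because $\lVert w\rVert_1$ is not differentiable at zero, cal_grad uses the subgradient $\lambda\,\mathrm{sign}(w)$ in place of an exact gradient.
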
Python

# Name: lasso_logistic_regression
# Author: Reacubeth
# Time: 2021/5/9 20:17
# Mail: noverfitting@gmail.com
# Site: www.omegaxyz.com
# *_*coding:utf-8 *_*
 
import numpy as np
from numpy import linalg
 
 
class LassoLogisticRegression:
    def __init__(self, name, batch_size, learning_rate, max_iter, optimizer, lambda_):
        super().__init__()
        self.W = None
        self.name = name
        self.batch_size = batch_size
        self.learning_rate = learning_rate
        self.max_iter = max_iter
        self.optimizer = optimizer.lower()
        self.lambda_ = lambda_
 
    def fit(self, feature, label, verbose=False):
        num, dim = feature.shape
        self.W = np.random.normal(0, 1, (dim,))
 
        for t in range(self.max_iter):
            if self.optimizer == 'sgd':
                rand_pos = np.random.choice(num, self.batch_size)
                loss = self.cal_loss(feature[rand_pos, :], label[rand_pos])
                grad = self.cal_grad(feature[rand_pos, :], label[rand_pos])
            else:
                loss = self.cal_loss(feature, label)
                grad = self.cal_grad(feature, label)
            if verbose:
                print(self.name, '@epoch', t, ' loss: ', loss)
            self.W = self.W - self.learning_rate * grad
 
    def predict(self, feature, probability=False):
        if probability:
            return 1 / (1 + np.exp(-np.dot(feature, self.W)))
        else:
            return (1 / (1 + np.exp(-np.dot(feature, self.W))) > 0.5).astype(int)
 
    def score(self, feature, label):
        pred_label = (1 / (1 + np.exp(-np.dot(feature, self.W))) > 0.5).astype(int)
        return np.where(pred_label == label)[0].shape[0] / feature.shape[0]
 
    def cal_loss(self, feature, label):
        num, dim = feature.shape
        return ((np.sum(np.log(1 + np.exp(np.dot(feature, self.W)))) - np.dot(label, np.dot(feature, self.W))) +
                self.lambda_ * linalg.norm(self.W, ord=1)) / num
 
    def cal_grad(self, feature, label):
        num, dim = feature.shape
        return (np.dot(feature.T, 1.0 / (1.0 + np.exp(-np.dot(feature, self.W))) - label) +
                self.lambda_ * np.sign(self.W)) / num

Kernel Logistic Regression

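The kernel variant replaces the linear score $x_i^{\top} w$ with a kernel expansion over the training samples, $\sum_{j=1}^{n} \alpha_j\, k(x_i, x_j)$, so the learned vector (self.W below) holds one dual weight $\alpha_j$ per training sample rather than one weight per feature. Three kernels are implemented: RBF, $k(x, y) = \exp\!\big(-\lVert x - y\rVert^2 / (2\sigma^2)\big)$; polynomial, $k(x, y) = (x^{\top} y)^d$; and cosine similarity. As in the lasso model, an L1 penalty $\lambda\lVert \alpha\rVert_1$ is added, although here it is not divided by $n$, matching cal_loss below.
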
Python

# Name: kernel_logistic_regression_lasso
# Author: Reacubeth
# Time: 2021/5/9 23:43
# Mail: noverfitting@gmail.com
# Site: www.omegaxyz.com
# *_*coding:utf-8 *_*
 
import numpy as np
from numpy import linalg
 
 
class KernelLogisticRegression:
    def __init__(self, name, kernel, batch_size, learning_rate, max_iter, optimizer, lambda_, kernel_para):
        super().__init__()
        self.W = None
        self.train_feature = None  # training samples, stored by fit() for kernel evaluation at prediction time
        self.name = name
        self.kernel = kernel.lower()
        self.batch_size = batch_size
        self.learning_rate = learning_rate
        self.max_iter = max_iter
        self.optimizer = optimizer.lower()
        self.lambda_ = lambda_
        self.kernel_para = kernel_para
 
    def fit(self, feature, label, verbose=False):
        num, dim = feature.shape
        # In the dual (kernel) formulation there is one weight per training sample,
        # so W has length num rather than dim.
        self.W = np.random.normal(0, 1, (num,))
        # Keep the training samples so predict/score can evaluate kernels against them.
        self.train_feature = feature
 
        for t in range(self.max_iter):
            if self.optimizer == 'sgd':
                rand_pos = np.random.choice(num, self.batch_size)
                if self.kernel == 'rbf':
                    K_feature_rand = self.kernel_rbf(feature[rand_pos, :], feature, self.kernel_para)
                elif self.kernel == 'poly':
                    K_feature_rand = self.kernel_poly(feature[rand_pos, :], feature, self.kernel_para)
                elif self.kernel == 'cosine':
                    K_feature_rand = self.kernel_cosine(feature[rand_pos, :], feature)
                else:
                    raise NotImplementedError
                loss = self.cal_loss(K_feature_rand, label[rand_pos])
                grad = self.cal_grad(K_feature_rand, label[rand_pos])
            else:
                if self.kernel == 'rbf':
                    K_feature = self.kernel_rbf(feature, feature, self.kernel_para)
                elif self.kernel == 'poly':
                    K_feature = self.kernel_poly(feature, feature, self.kernel_para)
                elif self.kernel == 'cosine':
                    K_feature = self.kernel_cosine(feature, feature)
                else:
                    raise NotImplementedError
                loss = self.cal_loss(K_feature, label)
                grad = self.cal_grad(K_feature, label)
            if verbose:
                print(self.name, '@epoch', t, ' loss: ', loss)
            self.W = self.W - self.learning_rate * grad
 
    def predict(self, feature, probability=False):
        # Kernel between the query points and the stored training samples
        # (shape: n_query x n_train), so the product with W is well defined.
        if self.kernel == 'rbf':
            raw_prob = self.kernel_rbf(feature, self.train_feature, self.kernel_para)
        elif self.kernel == 'poly':
            raw_prob = self.kernel_poly(feature, self.train_feature, self.kernel_para)
        elif self.kernel == 'cosine':
            raw_prob = self.kernel_cosine(feature, self.train_feature)
        else:
            raise NotImplementedError
 
        if probability:
            return 1 / (1 + np.exp(-np.dot(raw_prob, self.W)))
        else:
            return (1 / (1 + np.exp(-np.dot(raw_prob, self.W))) > 0.5).astype(int)
 
    def score(self, feature, label):
        # Same kernel evaluation against the training samples as in predict.
        if self.kernel == 'rbf':
            raw_prob = self.kernel_rbf(feature, self.train_feature, self.kernel_para)
        elif self.kernel == 'poly':
            raw_prob = self.kernel_poly(feature, self.train_feature, self.kernel_para)
        elif self.kernel == 'cosine':
            raw_prob = self.kernel_cosine(feature, self.train_feature)
        else:
            raise NotImplementedError
 
        pred_label = (1 / (1 + np.exp(-np.dot(raw_prob, self.W))) > 0.5).astype(int)
        return np.where(pred_label == label)[0].shape[0] / feature.shape[0]
 
    def cal_loss(self, feature, label):
        num, dim = feature.shape
        logistic_loss = np.sum(np.log(1 + np.exp(np.dot(feature, self.W)))) - np.dot(label, np.dot(feature, self.W))
        return logistic_loss / num + self.lambda_ * linalg.norm(self.W, ord=1)
 
    def cal_grad(self, feature, label):
        num, dim = feature.shape
        logistic_grad = np.dot(feature.T, 1.0 / (1.0 + np.exp(-np.dot(feature, self.W))) - label)
        return logistic_grad / num + self.lambda_ * np.sign(self.W)
 
    @staticmethod
    def kernel_rbf(X, Y, sigma):
        norm_X = (linalg.norm(X, axis=1) ** 2)
        norm_Y = (linalg.norm(Y, axis=1) ** 2)
        return np.exp(- (norm_X[:, None] + norm_Y[None, :] - 2 * np.dot(X, Y.T)) / (2 * sigma ** 2))
 
    @staticmethod
    def kernel_poly(X, Y, d):
        return np.dot(X, Y.T) ** d
 
    @staticmethod
    def kernel_cosine(X, Y):
        # Cosine similarity: divide by the row norms (not the squared norms).
        norm_X = linalg.norm(X, axis=1)
        norm_Y = linalg.norm(Y, axis=1)
        return np.dot(X, Y.T) / (norm_X[:, None] * norm_Y[None, :])
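
A minimal usage sketch for the kernel model, assuming synthetic data, an RBF kernel with sigma = 1.0, and hyper-parameters chosen only for illustration:

Python

# A minimal usage sketch. The data, kernel choice, and hyper-parameters are
# assumptions, not recommendations.
import numpy as np

np.random.seed(0)
num, dim = 200, 5
X = np.random.normal(0, 1, (num, dim))
y = (np.linalg.norm(X, axis=1) > np.sqrt(dim)).astype(int)  # a non-linear decision rule

kclf = KernelLogisticRegression(name='klr', kernel='rbf', batch_size=32,
                                learning_rate=0.1, max_iter=500, optimizer='sgd',
                                lambda_=0.01, kernel_para=1.0)
kclf.fit(X, y, verbose=False)
print('train accuracy:', kclf.score(X, y))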
