前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >论文周报 | 推荐系统领域最新研究进展,含SIGIR、AAAI、CIKM等顶会论文

论文周报 | 推荐系统领域最新研究进展,含SIGIR、AAAI、CIKM等顶会论文

作者头像
张小磊
发布2023-08-22 18:29:41
3120
发布2023-08-22 18:29:41
举报

本文精选了上周(0403-0409)最新发布的15篇推荐系统相关论文,所利用的技术包括大型预训练语言模型、图学习、对比学习、扩散模型、联邦学习等。

以下整理了论文标题以及摘要,如感兴趣可移步原文精读。

1. Zero-Shot Next-Item Recommendation using Large Pretrained Language Models

2. Simplifying Content-Based Neural News Recommendation: On User Modeling and Training Objectives, SIGIR2023

3. HGCC: Enhancing Hyperbolic Graph Convolution Networks on Heterogeneous Collaborative Graph for Recommendation

4. DLRover: An Elastic Deep Training Extension with Auto Job Resource Recommendation

5. Enhancing Clinical Evidence Recommendation with Multi-Channel Heterogeneous Learning on Evidence Graphs

6. DiffuRec: A Diffusion Model for Sequential Recommendation

7. Sequence-aware item recommendations for multiply repeated user-item interactions

8. Code Reviewer Recommendation for Architecture Violations: An Exploratory Study

9. Potential degradation of transportation network efficiency due to route recommendations

10. Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures, SIGIR2023

11. Optimism Based Exploration in Large-Scale Recommender Systems

12. A dynamic Bayesian optimized active recommender system for curiosity-driven Human-in-the-loop automated experiments

13. Is More Always Better? The Effects of Personal Characteristics and Level of Detail on the Perception of Explanations in a Recommender System, UMAP2022

14. Hierarchically Fusing Long and Short-Term User Interests for Click-Through Rate Prediction in Product Search, CIKM2022

15. FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction, AAAI2023

1. Zero-Shot Next-Item Recommendation using Large Pretrained Language Models

Lei Wang, Ee-Peng Lim

https://arxiv.org/abs/2304.03153

Large language models (LLMs) have achieved impressive zero-shot performance in various natural language processing (NLP) tasks, demonstrating their capabilities for inference without training examples. Despite their success, no research has yet explored the potential of LLMs to perform next-item recommendations in the zero-shot setting. We have identified two major challenges that must be addressed to enable LLMs to act effectively as recommenders. First, the recommendation space can be extremely large for LLMs, and LLMs do not know about the target user's past interacted items and preferences. To address this gap, we propose a prompting strategy called Zero-Shot Next-Item Recommendation (NIR) prompting that directs LLMs to make next-item recommendations. Specifically, the NIR-based strategy involves using an external module to generate candidate items based on user-filtering or item-filtering. Our strategy incorporates a 3-step prompting that guides GPT-3 to carry subtasks that capture the user's preferences, select representative previously watched movies, and recommend a ranked list of 10 movies. We evaluate the proposed approach using GPT-3 on MovieLens 100K dataset and show that it achieves strong zero-shot performance, even outperforming some strong sequential recommendation models trained on the entire training dataset. These promising results highlight the ample research opportunities to use LLMs as recommenders. The code can be found at: https://github.com/AGI-Edgerunners/LLM-Next-Item-Rec

2. Simplifying Content-Based Neural News Recommendation: On User Modeling and Training Objectives, SIGIR2023

Andreea Iana, Goran Glavaš, Heiko Paulheim

https://arxiv.org/abs/2304.03112

The advent of personalized news recommendation has given rise to increasingly complex recommender architectures. Most neural news recommenders rely on user click behavior and typically introduce dedicated user encoders that aggregate the content of clicked news into user embeddings (early fusion). These models are predominantly trained with standard point-wise classification objectives. The existing body of work exhibits two main shortcomings: (1) despite general design homogeneity, direct comparisons between models are hindered by varying evaluation datasets and protocols; (2) it leaves alternative model designs and training objectives vastly unexplored. In this work, we present a unified framework for news recommendation, allowing for a systematic and fair comparison of news recommenders across several crucial design dimensions: (i) candidate-awareness in user modeling, (ii) click behavior fusion, and (iii) training objectives. Our findings challenge the status quo in neural news recommendation. We show that replacing sizable user encoders with parameter-efficient dot products between candidate and clicked news embeddings (late fusion) often yields substantial performance gains. Moreover, our results render contrastive training a viable alternative to point-wise classification objectives.

3. HGCC: Enhancing Hyperbolic Graph Convolution Networks on Heterogeneous Collaborative Graph for Recommendation

Lu Zhang, Ning Wu

https://arxiv.org/abs/2304.02961

Due to the naturally power-law distributed nature of user-item interaction data in recommendation tasks, hyperbolic space modeling has recently been introduced into collaborative filtering methods. Among them, hyperbolic GCN combines the advantages of GCN and hyperbolic space and achieves a surprising performance. However, these methods only partially exploit the nature of hyperbolic space in their designs due to completely random embedding initialization and an inaccurate tangent space aggregation. In addition, the data used in these works mainly focus on user-item interaction data only, which further limits the performance of the models. In this paper, we propose a hyperbolic GCN collaborative filtering model, HGCC, which improves the existing hyperbolic GCN structure for collaborative filtering and incorporates side information. It keeps the long-tailed nature of the collaborative graph by adding power law prior to node embedding initialization; then, it aggregates neighbors directly in multiple hyperbolic spaces through the gyromidpoint method to obtain more accurate computation results; finally, the gate fusion with prior is used to fuse multiple embeddings of one node from different hyperbolic space automatically. Experimental results on four real datasets show that our model is highly competitive and outperforms leading baselines, including hyperbolic GCNs. Further experiments validate the efficacy of our proposed approach and give a further explanation by the learned embedding.

4. DLRover: An Elastic Deep Training Extension with Auto Job Resource Recommendation

Qinlong Wang, Bo Sang, Haitao Zhang, Mingjie Tang, Ke Zhang

https://arxiv.org/abs/2304.01468

The cloud is still a popular platform for distributed deep learning (DL) training jobs since resource sharing in the cloud can improve resource utilization and reduce overall costs. However, such sharing also brings multiple challenges for DL training jobs, e.g., high-priority jobs could impact, even interrupt, low-priority jobs. Meanwhile, most existing distributed DL training systems require users to configure the resources (i.e., the number of nodes and resources like CPU and memory allocated to each node) of jobs manually before job submission and can not adjust the job's resources during the runtime. The resource configuration of a job deeply affect this job's performance (e.g., training throughput, resource utilization, and completion rate). However, this usually leads to poor performance of jobs since users fail to provide optimal resource configuration in most cases. DLROVER is a distributed DL framework can auto-configure a DL job's initial resources and dynamically tune the job's resources to win the better performance. With elastic capability, DLROVER can effectively adjusts the resources of a job when there are performance issues detected or a job fails because of faults or eviction. Evaluations results show DLROVER can outperform manual well-tuned resource configurations. Furthermore, in the production Kubernetes cluster of ANT GROUP, DLROVER reduces the medium of job completion time by 31%, and improves the job completion rate by 6%, CPU utilization by 15%, and memory utilization by 20% compared with manual configuration.

5. Enhancing Clinical Evidence Recommendation with Multi-Channel Heterogeneous Learning on Evidence Graphs

Maolin Luo, Xiang Zhang

https://arxiv.org/abs/2304.01242

Clinical evidence encompasses the associations and impacts between patients, interventions (such as drugs or physiotherapy), problems, and outcomes. The goal of recommending clinical evidence is to provide medical practitioners with relevant information to support their decision-making processes and to generate new evidence. Our specific task focuses on recommending evidence based on clinical problems. However, the direct connections between certain clinical problems and related evidence are often sparse, creating a challenge of link sparsity. Additionally, to recommend appropriate evidence, it is essential to jointly exploit both topological relationships among evidence and textual information describing them. To address these challenges, we define two knowledge graphs: an Evidence Co-reference Graph and an Evidence Text Graph, to represent the topological and linguistic relations among evidential elements, respectively. We also introduce a multi-channel heterogeneous learning model and a fusional attention mechanism to handle the co-reference-text heterogeneity in evidence recommendation. Our experiments demonstrate that our model outperforms state-of-the-art methods on open data.

6. DiffuRec: A Diffusion Model for Sequential Recommendation

Zihao Li, Aixin Sun, Chenliang Li

https://arxiv.org/abs/2304.00686

Mainstream solutions to Sequential Recommendation (SR) represent items with fixed vectors. These vectors have limited capability in capturing items' latent aspects and users' diverse preferences. As a new generative paradigm, Diffusion models have achieved excellent performance in areas like computer vision and natural language processing. To our understanding, its unique merit in representation generation well fits the problem setting of sequential recommendation. In this paper, we make the very first attempt to adapt Diffusion model to SR and propose DiffuRec, for item representation construction and uncertainty injection. Rather than modeling item representations as fixed vectors, we represent them as distributions in DiffuRec, which reflect user's multiple interests and item's various aspects adaptively. In diffusion phase, DiffuRec corrupts the target item embedding into a Gaussian distribution via noise adding, which is further applied for sequential item distribution representation generation and uncertainty injection. Afterwards, the item representation is fed into an Approximator for target item representation reconstruction. In reversion phase, based on user's historical interaction behaviors, we reverse a Gaussian noise into the target item representation, then apply rounding operation for target item prediction. Experiments over four datasets show that DiffuRec outperforms strong baselines by a large margin.

7. Sequence-aware item recommendations for multiply repeated user-item interactions

Juan Pablo Equihua, Maged Ali, Henrik Nordmark, Berthold Lausen

https://arxiv.org/abs/2304.00578

Recommender systems are one of the most successful applications of machine learning and data science. They are successful in a wide variety of application domains, including e-commerce, media streaming content, email marketing, and virtually every industry where personalisation facilitates better user experience or boosts sales and customer engagement. The main goal of these systems is to analyse past user behaviour to predict which items are of most interest to users. They are typically built with the use of matrix-completion techniques such as collaborative filtering or matrix factorisation. However, although these approaches have achieved tremendous success in numerous real-world applications, their effectiveness is still limited when users might interact multiple times with the same items, or when user preferences change over time.

8. Code Reviewer Recommendation for Architecture Violations: An Exploratory Study

Ruiyin Li, Peng Liang, Paris Avgeriou

https://arxiv.org/abs/2303.18058

Code review is a common practice in software development and often conducted before code changes are merged into the code repository. A number of approaches for automatically recommending appropriate reviewers have been proposed to match such code changes to pertinent reviewers. However, such approaches are generic, i.e., they do not focus on specific types of issues during code reviews. In this paper, we propose an approach that focuses on architecture violations, one of the most critical type of issues identified during code review. Specifically, we aim at automating the recommendation of code reviewers, who are potentially qualified to review architecture violations, based on reviews of code changes. To this end, we selected three common similarity detection methods to measure the file path similarity of code commits and the semantic similarity of review comments. We conducted a series of experiments on finding the appropriate reviewers through evaluating and comparing these similarity detection methods in separate and combined ways with the baseline reviewer recommendation approach, RevFinder. The results show that the common similarity detection methods can produce acceptable performance scores and achieve a better performance than RevFinder. The sampling techniques used in recommending code reviewers can impact the performance of reviewer recommendation approaches. We also discuss the potential implications of our findings for both researchers and practitioners.

9. Potential degradation of transportation network efficiency due to route recommendations

Tommaso Toso, Alain Y. Kibangou, Paolo Frasca

https://arxiv.org/abs/2303.17923

In this work, we propose a road traffic model to assess the effects of real-time route recommendations on the traffic flow between an origin and a destination connected by two possible routes. We suppose that this origin-destination pair is subject to a constant demand flow and that a certain fraction of the users constituting this demand has access to a navigation application (app). The model aims to investigate potential drawbacks, at the global level, of the usage of navigation apps. After a comprehensive stability analysis, we show that an excessive use of navigation apps can degrade of the network efficiency, which may take the form of a decrease of performance measures or of the failure to satisfy the user demand.

10. Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures, SIGIR2023

Wei Yuan, Quoc Viet Hung Nguyen, Tieke He, Liang Chen, Hongzhi Yin

https://arxiv.org/abs/2304.03054

Federated Recommender Systems (FedRecs) are considered privacy-preserving techniques to collaboratively learn a recommendation model without sharing user data. Since all participants can directly influence the systems by uploading gradients, FedRecs are vulnerable to poisoning attacks of malicious clients. However, most existing poisoning attacks on FedRecs are either based on some prior knowledge or with less effectiveness. To reveal the real vulnerability of FedRecs, in this paper, we present a new poisoning attack method to manipulate target items' ranks and exposure rates effectively in the top- recommendation without relying on any prior knowledge. Specifically, our attack manipulates target items' exposure rate by a group of synthetic malicious users who upload poisoned gradients considering target items' alternative products. We conduct extensive experiments with two widely used FedRecs (Fed-NCF and Fed-LightGCN) on two real-world recommendation datasets. The experimental results show that our attack can significantly improve the exposure rate of unpopular target items with extremely fewer malicious users and fewer global epochs than state-of-the-art attacks. In addition to disclosing the security hole, we design a novel countermeasure for poisoning attacks on FedRecs. Specifically, we propose a hierarchical gradient clipping with sparsified updating to defend against existing poisoning attacks. The empirical results demonstrate that the proposed defending mechanism improves the robustness of FedRecs.

11. Optimism Based Exploration in Large-Scale Recommender Systems

Hongbo Guo, Ruben Naeff, Alex Nikulkov, Zheqing Zhu

https://arxiv.org/abs/2304.02572

Bandit learning algorithms have been an increasingly popular design choice for recommender systems. Despite the strong interest in bandit learning from the community, there remains multiple bottlenecks that prevent many bandit learning approaches from productionalization. Two of the most important bottlenecks are scaling to multi-task and A/B testing. Classic bandit algorithms, especially those leveraging contextual information, often requires reward for uncertainty estimation, which hinders their adoptions in multi-task recommender systems. Moreover, different from supervised learning algorithms, bandit learning algorithms emphasize greatly on the data collection process through their explorative nature. Such explorative behavior induces unfair evaluation for bandit learning agents in a classic A/B test setting. In this work, we present a novel design of production bandit learning life-cycle for recommender systems, along with a novel set of metrics to measure their efficiency in user exploration. We show through large-scale production recommender system experiments and in-depth analysis that our bandit agent design improves personalization for the production recommender system and our experiment design fairly evaluates the performance of bandit learning algorithms.

12. A dynamic Bayesian optimized active recommender system for curiosity-driven Human-in-the-loop automated experiments

Arpan Biswas, Yongtao Liu, Nicole Creange, Yu-Chen Liu, Stephen Jesse, Jan-Chi Yang, Sergei V. Kalinin, Maxim A. Ziatdinov, Rama K. Vasudevan

https://arxiv.org/abs/2304.02484

Optimization of experimental materials synthesis and characterization through active learning methods has been growing over the last decade, with examples ranging from measurements of diffraction on combinatorial alloys at synchrotrons, to searches through chemical space with automated synthesis robots for perovskites. In virtually all cases, the target property of interest for optimization is defined apriori with limited human feedback during operation. In contrast, here we present the development of a new type of human in the loop experimental workflow, via a Bayesian optimized active recommender system (BOARS), to shape targets on the fly, employing human feedback. We showcase examples of this framework applied to pre-acquired piezoresponse force spectroscopy of a ferroelectric thin film, and then implement this in real time on an atomic force microscope, where the optimization proceeds to find symmetric piezoresponse amplitude hysteresis loops. It is found that such features appear more affected by subsurface defects than the local domain structure. This work shows the utility of human-augmented machine learning approaches for curiosity-driven exploration of systems across experimental domains. The analysis reported here is summarized in Colab Notebook for the purpose of tutorial and application to other data: https://github.com/arpanbiswas52/varTBO

13. Is More Always Better? The Effects of Personal Characteristics and Level of Detail on the Perception of Explanations in a Recommender System, UMAP2022

Mohamed Amine Chatti, Mouadh Guesmi, Laura Vorgerd, Thao Ngo, Shoeb Joarder, Qurat Ul Ain, Arham Muslim

https://arxiv.org/abs/2304.00969

Despite the acknowledgment that the perception of explanations may vary considerably between end-users, explainable recommender systems (RS) have traditionally followed a one-size-fits-all model, whereby the same explanation level of detail is provided to each user, without taking into consideration individual user's context, i.e., goals and personal characteristics. To fill this research gap, we aim in this paper at a shift from a one-size-fits-all to a personalized approach to explainable recommendation by giving users agency in deciding which explanation they would like to see. We developed a transparent Recommendation and Interest Modeling Application (RIMA) that provides on-demand personalized explanations of the recommendations, with three levels of detail (basic, intermediate, advanced) to meet the demands of different types of end-users. We conducted a within-subject study (N=31) to investigate the relationship between user's personal characteristics and the explanation level of detail, and the effects of these two variables on the perception of the explainable RS with regard to different explanation goals. Our results show that the perception of explainable RS with different levels of detail is affected to different degrees by the explanation goal and user type. Consequently, we suggested some theoretical and design guidelines to support the systematic design of explanatory interfaces in RS tailored to the user's context.

14. Hierarchically Fusing Long and Short-Term User Interests for Click-Through Rate Prediction in Product Search, CIKM2022

Qijie Shen, Hong Wen, Jing Zhang, Qi Rao

https://arxiv.org/abs/2304.02089

Estimating Click-Through Rate (CTR) is a vital yet challenging task in personalized product search. However, existing CTR methods still struggle in the product search settings due to the following three challenges including how to more effectively extract users' short-term interests with respect to multiple aspects, how to extract and fuse users' long-term interest with short-term interests, how to address the entangling characteristic of long and short-term interests. To resolve these challenges, in this paper, we propose a new approach named Hierarchical Interests Fusing Network (HIFN), which consists of four basic modules namely Short-term Interests Extractor (SIE), Long-term Interests Extractor (LIE), Interests Fusion Module (IFM) and Interests Disentanglement Module (IDM). Specifically, SIE is proposed to extract user's short-term interests by integrating three fundamental interests encoders within it namely query-dependent, target-dependent and causal-dependent interest encoder, respectively, followed by delivering the resultant representation to the module LIE, where it can effectively capture user long-term interests by devising an attention mechanism with respect to the short-term interests from SIE module. In IFM, the achieved long and short-term interests are further fused in an adaptive manner, followed by concatenating it with original raw context features for the final prediction result. Last but not least, considering the entangling characteristic of long and short-term interests, IDM further devises a self-supervised framework to disentangle long and short-term interests. Extensive offline and online evaluations on a real-world e-commerce platform demonstrate the superiority of HIFN over state-of-the-art methods.

15. FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction, AAAI2023

Kelong Mao, Jieming Zhu, Liangcai Su, Guohao Cai, Yuru Li, Zhenhua Dong

https://arxiv.org/abs/2304.00902

Click-through rate (CTR) prediction is one of the fundamental tasks for online advertising and recommendation. While multi-layer perceptron (MLP) serves as a core component in many deep CTR prediction models, it has been widely recognized that applying a vanilla MLP network alone is inefficient in learning multiplicative feature interactions. As such, many two-stream interaction models (e.g., DeepFM and DCN) have been proposed by integrating an MLP network with another dedicated network for enhanced CTR prediction. As the MLP stream learns feature interactions implicitly, existing research focuses mainly on enhancing explicit feature interactions in the complementary stream. In contrast, our empirical study shows that a well-tuned two-stream MLP model that simply combines two MLPs can even achieve surprisingly good performance, which has never been reported before by existing work. Based on this observation, we further propose feature selection and interaction aggregation layers that can be easily plugged to make an enhanced two-stream MLP model, FinalMLP. In this way, it not only enables differentiated feature inputs but also effectively fuses stream-level interactions across two streams. Our evaluation results on four open benchmark datasets as well as an online A/B test in our industrial system show that FinalMLP achieves better performance than many sophisticated two-stream CTR models. Our source code will be available at MindSpore/models and FuxiCTR/model_zoo.

本文参与 腾讯云自媒体分享计划,分享自微信公众号。
原始发表:2023-04-10,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 机器学习与推荐算法 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 1. Zero-Shot Next-Item Recommendation using Large Pretrained Language Models
  • 2. Simplifying Content-Based Neural News Recommendation: On User Modeling and Training Objectives, SIGIR2023
  • 3. HGCC: Enhancing Hyperbolic Graph Convolution Networks on Heterogeneous Collaborative Graph for Recommendation
  • 4. DLRover: An Elastic Deep Training Extension with Auto Job Resource Recommendation
  • 5. Enhancing Clinical Evidence Recommendation with Multi-Channel Heterogeneous Learning on Evidence Graphs
  • 6. DiffuRec: A Diffusion Model for Sequential Recommendation
  • 7. Sequence-aware item recommendations for multiply repeated user-item interactions
  • 8. Code Reviewer Recommendation for Architecture Violations: An Exploratory Study
  • 9. Potential degradation of transportation network efficiency due to route recommendations
  • 10. Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures, SIGIR2023
  • 11. Optimism Based Exploration in Large-Scale Recommender Systems
  • 12. A dynamic Bayesian optimized active recommender system for curiosity-driven Human-in-the-loop automated experiments
  • 13. Is More Always Better? The Effects of Personal Characteristics and Level of Detail on the Perception of Explanations in a Recommender System, UMAP2022
  • 14. Hierarchically Fusing Long and Short-Term User Interests for Click-Through Rate Prediction in Product Search, CIKM2022
  • 15. FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction, AAAI2023
相关产品与服务
联邦学习
联邦学习(Federated Learning,FELE)是一种打破数据孤岛、释放 AI 应用潜能的分布式机器学习技术,能够让联邦学习各参与方在不披露底层数据和底层数据加密(混淆)形态的前提下,通过交换加密的机器学习中间结果实现联合建模。该产品兼顾AI应用与隐私保护,开放合作,协同性高,充分释放大数据生产力,广泛适用于金融、消费互联网等行业的业务创新场景。
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档