Reinforcement Learning Recommendation with Attention-Guided State Modeling


Isolde Merren

Abstract

This study proposes a novel reinforcement learning-based recommendation algorithm that captures dynamic user preferences and optimizes long-term engagement. Unlike traditional recommendation models that focus solely on short-term accuracy, the proposed framework formulates recommendation as a sequential decision-making problem within a Markov Decision Process (MDP). It employs an attention-enhanced gated recurrent unit (GRU) network to model temporal dependencies in user-item interactions and introduces a hybrid reward-shaping strategy that integrates explicit feedback (ratings) with implicit engagement signals (clicks, dwell time). A deep Q-learning architecture with dual online and target networks ensures stable convergence under sparse and delayed feedback. Experiments on the Yelp dataset show that the proposed RL-Rec algorithm outperforms baselines such as MF, NeuMF, GRU4Rec, and DQNRec, achieving improvements of 13.6% in Precision@10, 15.4% in NDCG@10, and 13.0% in cumulative reward. RL-Rec also exhibits smoother reward convergence and higher recommendation diversity, indicating a better exploration-exploitation balance. Ablation studies confirm that both the attention mechanism and recurrent state modeling contribute substantially to accuracy and policy stability. Overall, this research highlights the potential of reinforcement learning to drive next-generation recommendation algorithms that are adaptive, interpretable, and robust in dynamic environments.
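The abstract describes an attention-enhanced GRU for state modeling but gives no implementation details. The following is a minimal PyTorch sketch of one plausible realization, in which additive attention weights the GRU outputs over the interaction history; the class name, dimensions, and masking convention (item ID 0 as padding) are illustrative assumptions, not specifics from the paper.

```python
import torch
import torch.nn as nn

class AttnGRUStateEncoder(nn.Module):
    """Encode a user's interaction history into an MDP state vector.

    A GRU models temporal dependencies across the item sequence; an
    additive attention layer then weights the GRU outputs so the state
    emphasizes the most preference-relevant interactions. Illustrative
    sketch only; names and dimensions are assumptions.
    """

    def __init__(self, num_items: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.item_embed = nn.Embedding(num_items, embed_dim, padding_idx=0)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Additive (Bahdanau-style) attention scorer over GRU outputs.
        self.attn_score = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.Tanh(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, item_seq: torch.Tensor) -> torch.Tensor:
        # item_seq: (batch, seq_len) of item IDs, where 0 marks padding.
        emb = self.item_embed(item_seq)                 # (B, T, E)
        outputs, _ = self.gru(emb)                      # (B, T, H)
        scores = self.attn_score(outputs).squeeze(-1)   # (B, T)
        scores = scores.masked_fill(item_seq == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)         # attention over time steps
        state = (weights.unsqueeze(-1) * outputs).sum(dim=1)  # (B, H)
        return state
```

The attention-weighted sum, rather than the final GRU hidden state alone, lets the state representation recover signal from early interactions that plain recurrence tends to wash out.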
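The abstract states that the reward shaping integrates ratings, clicks, and dwell time, but not the functional form. A simple weighted-sum sketch is shown below; the weights, the rating normalization, and the dwell-time cap are all hypothetical choices, not values reported in the paper.

```python
from typing import Optional

def shaped_reward(rating: Optional[float], clicked: bool, dwell_seconds: float,
                  alpha: float = 1.0, beta: float = 0.5, gamma: float = 0.2,
                  dwell_cap: float = 300.0) -> float:
    """Combine explicit and implicit feedback into one scalar reward.

    All weights and normalizations below are illustrative assumptions.
    """
    # Explicit signal: center a 1-5 star rating around 0, into [-1, 1].
    r_explicit = 0.0 if rating is None else (rating - 3.0) / 2.0
    # Implicit signals: binary click plus capped, normalized dwell time.
    r_click = 1.0 if clicked else 0.0
    r_dwell = min(dwell_seconds, dwell_cap) / dwell_cap
    return alpha * r_explicit + beta * r_click + gamma * r_dwell
```

Blending dense implicit signals with sparse ratings in this way gives the agent a non-zero learning signal on the many interactions where no explicit rating is available.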
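Likewise, the dual online-target architecture is summarized but not specified. The sketch below shows the standard temporal-difference update that such a design implies, with a frozen target network supplying the bootstrap value; the function names and batch layout are assumptions for illustration.

```python
import copy
import torch
import torch.nn.functional as F

def build_target_network(q_net: torch.nn.Module) -> torch.nn.Module:
    """Clone the online Q-network to create a frozen target network."""
    target_net = copy.deepcopy(q_net)
    for p in target_net.parameters():
        p.requires_grad_(False)
    return target_net

def dqn_update(q_net, target_net, optimizer, batch, gamma: float = 0.99) -> float:
    """One TD step; batch = (states, actions, rewards, next_states, dones).

    Computing the bootstrap target from a separate, periodically synced
    network decouples it from the parameters being updated, which is what
    stabilizes convergence under sparse, delayed rewards.
    """
    states, actions, rewards, next_states, dones = batch
    # Q-values of the actions actually taken (actions: long tensor).
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1).values
        q_target = rewards + gamma * (1.0 - dones) * q_next
    loss = F.smooth_l1_loss(q_pred, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this setup the target network is refreshed every few hundred steps with `target_net.load_state_dict(q_net.state_dict())`; the sync interval is a tunable hyperparameter, not one disclosed in the abstract.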

Article Details

How to Cite
Merren, I. (2024). Reinforcement Learning Recommendation with Attention-Guided State Modeling. Journal of Computer Science and Software Applications, 4(5), 40–50. Retrieved from https://mfacademia.org/index.php/jcssa/article/view/248