Reinforcement Learning Journal, Volume 1, 2024
- Woojin Jeong, Seungki Min: Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits. RLJ 1: 16-28 (2024)
- Shuang Wu, Arash A. Amini: Graph Neural Thompson Sampling. RLJ 1: 29-63 (2024)
- Junxiong Wang, Kaiwen Wang, Yueying Li, Nathan Kallus, Immanuel Trummer, Wen Sun: JoinGym: An Efficient Join Order Selection Environment. RLJ 1: 64-91 (2024)
- Antonin Raffin, Olivier Sigaud, Jens Kober, Alin Albu-Schäffer, João Silvério, Freek Stulp: An Open-Loop Baseline for Reinforcement Learning Locomotion Tasks. RLJ 1: 92-107 (2024)
- Raphaël Avalos, Eugenio Bargiacchi, Ann Nowé, Diederik M. Roijers, Frans A. Oliehoek: Online Planning in POMDPs with State-Requests. RLJ 1: 108-129 (2024)
- Abdulaziz Almuzairee, Nicklas Hansen, Henrik I. Christensen: A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning. RLJ 1: 130-157 (2024)
- Robert J. Moss, Anthony Corso, Jef Caers, Mykel J. Kochenderfer: BetaZero: Belief-State Planning for Long-Horizon POMDPs using Learned Approximations. RLJ 1: 158-181 (2024)
- Audrey Huang, Mohammad Ghavamzadeh, Nan Jiang, Marek Petrik: Non-adaptive Online Finetuning for Offline Reinforcement Learning. RLJ 1: 182-197 (2024)
- Nicholas E. Corrado, Yuxiao Qu, John U. Balis, Adam Labiosa, Josiah P. Hanna: Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning. RLJ 1: 198-215 (2024)
- Michael Lu, Matin Aghaei, Anant Raj, Sharan Vaswani: Towards Principled, Practical Policy Gradient for Bandits and Tabular MDPs. RLJ 1: 216-282 (2024)
- Benjamin Freed, Thomas Wei, Roberto Calandra, Jeff Schneider, Howie Choset: Unifying Model-Based and Model-Free Reinforcement Learning with Equivalent Policy Sets. RLJ 1: 283-301 (2024)
- Noah Golowich, Ankur Moitra: The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation. RLJ 1: 302-341 (2024)
- Max Rudolph, Caleb Chuck, Kevin Black, Misha Lvovsky, Scott Niekum, Amy Zhang: Learning Action-based Representations Using Invariance. RLJ 1: 342-365 (2024)
- Oliver Järnefelt, Mahdi Kallel, Carlo D'Eramo: Cyclicity-Regularized Coordination Graphs. RLJ 1: 366-379 (2024)
- Aditya Kapoor, Benjamin Freed, Jeff Schneider, Howie Choset: Assigning Credit with Partial Reward Decoupling in Multi-Agent Proximal Policy Optimization. RLJ 1: 380-399 (2024)
- Quentin Delfosse, Jannis Blüml, Bjarne Gregori, Sebastian Sztwiertnia, Kristian Kersting: OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments. RLJ 1: 400-449 (2024)
- Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Zheng Xiong, Shimon Whiteson: SplAgger: Split Aggregation for Meta-Reinforcement Learning. RLJ 1: 450-469 (2024)
- Nan Jiang, Jinzhao Li, Yexiang Xue: A Tighter Convergence Proof of Reverse Experience Replay. RLJ 1: 470-480 (2024)