Reinforcement Learning Journal, Volume 3 (2024)
- Brett Daley, Marlos C. Machado, Martha White: Demystifying the Recency Heuristic in Temporal-Difference Learning. RLJ 3: 1019-1036 (2024)
- Johan Samir Obando-Ceron, João Guilherme Madeira Araújo, Aaron C. Courville, Pablo Samuel Castro: On the consistency of hyper-parameter selection in value-based deep reinforcement learning. RLJ 3: 1037-1059 (2024)
- Frieda Rong, Max Kleiman-Weiner: Value Internalization: Learning and Generalizing from Social Reward. RLJ 3: 1060-1071 (2024)
- Timon Willi, Johan Samir Obando-Ceron, Jakob Nicolaus Foerster, Gintare Karolina Dziugaite, Pablo Samuel Castro: Mixture of Experts in a Mixture of RL settings. RLJ 3: 1072-1105 (2024)
- Davide Corsi, Davide Camponogara, Alessandro Farinelli: Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning. RLJ 3: 1106-1123 (2024)
- Cyrus Cousins, Kavosh Asadi, Elita A. Lobo, Michael Littman: On Welfare-Centric Fair Reinforcement Learning. RLJ 3: 1124-1137 (2024)
- Jiayu Yao, Weiwei Pan, Finale Doshi-Velez, Barbara E. Engelhardt: Inverse Reinforcement Learning with Multiple Planning Horizons. RLJ 3: 1138-1167 (2024)
- Yixuan Zhang, Qiaomin Xie: Constant Stepsize Q-learning: Distributional Convergence, Bias and Extrapolation. RLJ 3: 1168-1210 (2024)
- Haque Ishfaq, Yixin Tan, Yu Yang, Qingfeng Lan, Jianfeng Lu, A. Rupam Mahmood, Doina Precup, Pan Xu: More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling. RLJ 3: 1211-1235 (2024)
- Qining Zhang, Honghao Wei, Lei Ying: Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis. RLJ 3: 1236-1251 (2024)
- Kevin Tan, Ziping Xu: A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage. RLJ 3: 1252-1264 (2024)
- Zhiyuan Zhou, Shreyas Sundara Raman, Henry Sowerby, Michael Littman: Tiered Reward: Designing Rewards for Specification and Fast Learning of Desired Behavior. RLJ 3: 1265-1288 (2024)
- Bin Hu, Chenyang Zhao, Pu Zhang, Zihao Zhou, Yuanhang Yang, Zenglin Xu, Bin Liu: Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach. RLJ 3: 1289-1305 (2024)
- Kris De Asis, Richard S. Sutton: An Idiosyncrasy of Time-discretization in Reinforcement Learning. RLJ 3: 1306-1316 (2024)
- Sai Prasanna, Karim Farid, Raghu Rajan, André Biedenkapp: Dreaming of Many Worlds: Learning Contextual World Models aids Zero-Shot Generalization. RLJ 3: 1317-1350 (2024)
- Tetsuro Morimura, Kazuhiro Ota, Kenshi Abe, Peinan Zhang: Policy Gradient Algorithms with Monte Carlo Tree Learning for Non-Markov Decision Processes. RLJ 3: 1351-1376 (2024)
- Marin Vlastelica, Jin Cheng, Georg Martius, Pavel Kolev: Offline Diversity Maximization under Imitation Constraints. RLJ 3: 1377-1409 (2024)
- Léopold Maytié, Benjamin Devillers, Alexandre Arnold, Rufin VanRullen: Zero-shot cross-modal transfer of Reinforcement Learning policies through a Global Workspace. RLJ 3: 1410-1426 (2024)
- Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada: Stabilizing Extreme Q-learning by Maclaurin Expansion. RLJ 3: 1427-1440 (2024)
- Julian Dierkes, Emma Cramer, Holger H. Hoos, Sebastian Trimpe: Combining Automated Optimisation of Hyperparameters and Reward Shape. RLJ 3: 1441-1466 (2024)
- He Wang, Laixi Shi, Yuejie Chi: Sample Complexity of Offline Distributionally Robust Linear Markov Decision Processes. RLJ 3: 1467-1510 (2024)
- Raphaël Boige, Yannis Flet-Berliac, Lars C. P. M. Quaedvlieg, Arthur Flajolet, Guillaume Richard, Thomas Pierrot: PASTA: Pretrained Action-State Transformer Agents. RLJ 3: 1511-1532 (2024)