Shimon Whiteson
2020 – today
- 2024
- [j31]Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Zheng Xiong, Shimon Whiteson:
SplAgger: Split Aggregation for Meta-Reinforcement Learning. RLJ 1: 450-469 (2024) - [j30]Matthew Thomas Jackson, Michael T. Matthews, Cong Lu, Benjamin Ellis, Shimon Whiteson, Jakob Nicolaus Foerster:
Policy-Guided Diffusion. RLJ 4: 1855-1872 (2024) - [c156]Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Garðar Ingvarsson, Timon Willi, Akbir Khan, Christian Schröder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert T. Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktäschel, Chris Lu, Jakob N. Foerster:
JaxMARL: Multi-Agent RL Environments and Algorithms in JAX. AAMAS 2024: 2444-2446 - [c155]Matthew Thomas Jackson, Chris Lu, Louis Kirsch, Robert Tjarko Lange, Shimon Whiteson, Jakob Nicolaus Foerster:
Discovering Temporally-Aware Reinforcement Learning Algorithms. ICLR 2024 - [c154]Mattie Fellows, Brandon Kaplowitz, Christian Schröder de Witt, Shimon Whiteson:
Bayesian Exploration Networks. ICML 2024 - [c153]Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson:
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control. ICML 2024 - [c152]Reza Mahjourian, Rongbing Mu, Valerii Likhosherstov, Paul Mougin, Xiukun Huang, João V. Messias, Shimon Whiteson:
UniGen: Unified Modeling of Initial Agent States and Trajectories for Generating Autonomous Driving Scenarios. ICRA 2024: 16367-16373 - [i115]Matthew Thomas Jackson, Chris Lu, Louis Kirsch, Robert T. Lange, Shimon Whiteson, Jakob Nicolaus Foerster:
Discovering Temporally-Aware Reinforcement Learning Algorithms. CoRR abs/2402.05828 (2024) - [i114]Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson:
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control. CoRR abs/2402.06570 (2024) - [i113]Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Zheng Xiong, Shimon Whiteson:
SplAgger: Split Aggregation for Meta-Reinforcement Learning. CoRR abs/2403.03020 (2024) - [i112]Matthew Thomas Jackson, Michael T. Matthews, Cong Lu, Benjamin Ellis, Shimon Whiteson, Jakob N. Foerster:
Policy-Guided Diffusion. CoRR abs/2404.06356 (2024) - [i111]Reza Mahjourian, Rongbing Mu, Valerii Likhosherstov, Paul Mougin, Xiukun Huang, João V. Messias, Shimon Whiteson:
UniGen: Unified Modeling of Initial Agent States and Trajectories for Generating Autonomous Driving Scenarios. CoRR abs/2405.03807 (2024) - [i110]Risto Vuorio, Mattie Fellows, Cong Lu, Clémence Grislain, Shimon Whiteson:
A Bayesian Solution To The Imitation Gap. CoRR abs/2407.00495 (2024) - [i109]Alexander David Goldie, Chris Lu, Matthew Thomas Jackson, Shimon Whiteson, Jakob Nicolaus Foerster:
Can Learned Optimization Make Reinforcement Learning Less Difficult? CoRR abs/2407.07082 (2024)
- 2023
- [c151]Mingfei Sun, Sam Devlin, Jacob Beck, Katja Hofmann, Shimon Whiteson:
Trust Region Bounds for Decentralized PPO Under Non-stationarity. AAMAS 2023: 5-13 - [c150]Yat Long Lo, Christian Schröder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson:
Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning. ICLR 2023 - [c149]Mattie Fellows, Matthew J. A. Smith, Shimon Whiteson:
Why Target Networks Stabilise Temporal Difference Methods. ICML 2023: 9886-9909 - [c148]Zheng Xiong, Jacob Beck, Shimon Whiteson:
Universal Morphology Control via Contextual Modulation. ICML 2023: 38286-38300 - [c147]Maximilian Igl, Punit Shah, Paul Mougin, Sirish Srinivasan, Tarun Gupta, Brandyn White, Kyriacos Shiarlis, Shimon Whiteson:
Hierarchical Imitation Learning for Stochastic Environments. IROS 2023: 1697-1704 - [c146]Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Rebecca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine:
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios. IROS 2023: 7553-7560 - [c145]Jacob Beck, Risto Vuorio, Zheng Xiong, Shimon Whiteson:
Recurrent Hypernetworks are Surprisingly Strong in Meta-RL. NeurIPS 2023 - [c144]Benjamin Ellis, Jonathan Cook, Skander Moalla, Mikayel Samvelyan, Mingfei Sun, Anuj Mahajan, Jakob N. Foerster, Shimon Whiteson:
SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning. NeurIPS 2023 - [c143]Matthew Thomas Jackson, Minqi Jiang, Jack Parker-Holder, Risto Vuorio, Chris Lu, Gregory Farquhar, Shimon Whiteson, Jakob N. Foerster:
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design. NeurIPS 2023 - [c142]Nico Montali, John Lambert, Paul Mougin, Alex Kuefler, Nicholas Rhinehart, Michelle Li, Cole Gulino, Tristan Emrich, Zoey Yang, Shimon Whiteson, Brandyn White, Dragomir Anguelov:
The Waymo Open Sim Agents Challenge. NeurIPS 2023 - [i108]Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa M. Zintgraf, Chelsea Finn, Shimon Whiteson:
A Survey of Meta-Reinforcement Learning. CoRR abs/2301.08028 (2023) - [i107]Mingfei Sun, Benjamin Ellis, Anuj Mahajan, Sam Devlin, Katja Hofmann, Shimon Whiteson:
Trust-Region-Free Policy Optimization for Stochastic Policies. CoRR abs/2302.07985 (2023) - [i106]Zheng Xiong, Jacob Beck, Shimon Whiteson:
Universal Morphology Control via Contextual Modulation. CoRR abs/2302.11070 (2023) - [i105]Yat Long Lo, Christian Schröder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson:
Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning. CoRR abs/2303.10733 (2023) - [i104]Nico Montali, John Lambert, Paul Mougin, Alex Kuefler, Nick Rhinehart, Michelle Li, Cole Gulino, Tristan Emrich, Zoey Yang, Shimon Whiteson, Brandyn White, Dragomir Anguelov:
The Waymo Open Sim Agents Challenge. CoRR abs/2305.12032 (2023) - [i103]Mattie Fellows, Brandon Kaplowitz, Christian Schröder de Witt, Shimon Whiteson:
Bayesian Exploration Networks. CoRR abs/2308.13049 (2023) - [i102]Maximilian Igl, Punit Shah, Paul Mougin, Sirish Srinivasan, Tarun Gupta, Brandyn White, Kyriacos Shiarlis, Shimon Whiteson:
Hierarchical Imitation Learning for Stochastic Environments. CoRR abs/2309.14003 (2023) - [i101]Jacob Beck, Risto Vuorio, Zheng Xiong, Shimon Whiteson:
Recurrent Hypernetworks are Surprisingly Strong in Meta-RL. CoRR abs/2309.14970 (2023) - [i100]Matthew Thomas Jackson, Minqi Jiang, Jack Parker-Holder, Risto Vuorio, Chris Lu, Gregory Farquhar, Shimon Whiteson, Jakob Nicolaus Foerster:
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design. CoRR abs/2310.02782 (2023) - [i99]Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Garðar Ingvarsson, Timon Willi, Akbir Khan, Christian Schröder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert Tjarko Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktäschel, Chris Lu, Jakob Nicolaus Foerster:
JaxMARL: Multi-Agent RL Environments in JAX. CoRR abs/2311.10090 (2023)
- 2022
- [j29]Shangtong Zhang, Shimon Whiteson:
Truncated Emphatic Temporal Difference Methods for Prediction and Control. J. Mach. Learn. Res. 23: 153:1-153:59 (2022) - [c141]Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson:
Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency. AAAI 2022: 8378-8385 - [c140]Shangtong Zhang, Romain Laroche, Harm van Seijen, Shimon Whiteson, Remi Tachet des Combes:
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms. AAMAS 2022: 1491-1499 - [c139]Matthew J. A. Smith, Jelena Luketina, Kristian Hartikainen, Maximilian Igl, Shimon Whiteson:
Learning Skills Diverse in Value-Relevant Features. CoLLAs 2022: 1174-1194 - [c138]Eli Bronstein, Sirish Srinivasan, Supratik Paul, Aman Sinha, Matthew O'Kelly, Payam Nikdel, Shimon Whiteson:
Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula. CoRL 2022: 188-198 - [c137]Angad Singh, Omar Makhlouf, Maximilian Igl, João V. Messias, Arnaud Doucet, Shimon Whiteson:
Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving. CoRL 2022: 1168-1177 - [c136]Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Shimon Whiteson:
Hypernetworks in Meta-Reinforcement Learning. CoRL 2022: 1478-1487 - [c135]Darius Muglich, Luisa M. Zintgraf, Christian A. Schröder de Witt, Shimon Whiteson, Jakob N. Foerster:
Generalized Beliefs for Cooperative AI. ICML 2022: 16062-16082 - [c134]Samuel Sokota, Christian A. Schröder de Witt, Maximilian Igl, Luisa M. Zintgraf, Philip H. S. Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob N. Foerster:
Communicating via Markov Decision Processes. ICML 2022: 20314-20328 - [c133]Maximilian Igl, Daewoo Kim, Alex Kuefler, Paul Mougin, Punit Shah, Kyriacos Shiarlis, Dragomir Anguelov, Mark Palatucci, Brandyn White, Shimon Whiteson:
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. ICRA 2022: 2445-2451 - [c132]Eli Bronstein, Mark Palatucci, Dominik Notz, Brandyn White, Alex Kuefler, Yiren Lu, Supratik Paul, Payam Nikdel, Paul Mougin, Hongge Chen, Justin Fu, Austin Abrams, Punit Shah, Evan Racah, Benjamin Frenkel, Shimon Whiteson, Dragomir Anguelov:
Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving. IROS 2022: 8652-8659 - [c131]Vitaly Kurin, Alessandro De Palma, Ilya Kostrikov, Shimon Whiteson, Pawan Kumar Mudigonda:
In Defense of the Unitary Scalarization for Deep Multi-Task Learning. NeurIPS 2022 - [c130]Darius Muglich, Christian Schröder de Witt, Elise van der Pol, Shimon Whiteson, Jakob N. Foerster:
Equivariant Networks for Zero-Shot Coordination. NeurIPS 2022 - [i98]Vitaly Kurin, Alessandro De Palma, Ilya Kostrikov, Shimon Whiteson, M. Pawan Kumar:
In Defense of the Unitary Scalarization for Deep Multi-Task Learning. CoRR abs/2201.04122 (2022) - [i97]Mingfei Sun, Vitaly Kurin, Guoqing Liu, Sam Devlin, Tao Qin, Katja Hofmann, Shimon Whiteson:
You May Not Need Ratio Clipping in PPO. CoRR abs/2202.00079 (2022) - [i96]Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson:
Monotonic Improvement Guarantees under Non-stationarity for Decentralized PPO. CoRR abs/2202.00082 (2022) - [i95]Anuj Mahajan, Mikayel Samvelyan, Tarun Gupta, Benjamin Ellis, Mingfei Sun, Tim Rocktäschel, Shimon Whiteson:
Generalization in Cooperative Multi-Agent Systems. CoRR abs/2202.00104 (2022) - [i94]Maximilian Igl, Daewoo Kim, Alex Kuefler, Paul Mougin, Punit Shah, Kyriacos Shiarlis, Dragomir Anguelov, Mark Palatucci, Brandyn White, Shimon Whiteson:
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. CoRR abs/2205.03195 (2022) - [i93]Darius Muglich, Luisa M. Zintgraf, Christian Schröder de Witt, Shimon Whiteson, Jakob N. Foerster:
Generalized Beliefs for Cooperative AI. CoRR abs/2206.12765 (2022) - [i92]Risto Vuorio, Jacob Beck, Shimon Whiteson, Jakob N. Foerster, Gregory Farquhar:
An Investigation of the Bias-Variance Tradeoff in Meta-Gradients. CoRR abs/2209.11303 (2022) - [i91]Eli Bronstein, Mark Palatucci, Dominik Notz, Brandyn White, Alex Kuefler, Yiren Lu, Supratik Paul, Payam Nikdel, Paul Mougin, Hongge Chen, Justin Fu, Austin Abrams, Punit Shah, Evan Racah, Benjamin Frenkel, Shimon Whiteson, Dragomir Anguelov:
Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving. CoRR abs/2210.09539 (2022) - [i90]Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Shimon Whiteson:
Hypernetworks in Meta-Reinforcement Learning. CoRR abs/2210.11348 (2022) - [i89]Darius Muglich, Christian Schröder de Witt, Elise van der Pol, Shimon Whiteson, Jakob N. Foerster:
Equivariant Networks for Zero-Shot Coordination. CoRR abs/2210.12124 (2022) - [i88]Eli Bronstein, Sirish Srinivasan, Supratik Paul, Aman Sinha, Matthew O'Kelly, Payam Nikdel, Shimon Whiteson:
Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula. CoRR abs/2212.01375 (2022) - [i87]Angad Singh, Omar Makhlouf, Maximilian Igl, João V. Messias, Arnaud Doucet, Shimon Whiteson:
Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving. CoRR abs/2212.06968 (2022) - [i86]Benjamin Ellis, Skander Moalla, Mikayel Samvelyan, Mingfei Sun, Anuj Mahajan, Jakob N. Foerster, Shimon Whiteson:
SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning. CoRR abs/2212.07489 (2022) - [i85]Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Becca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine:
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios. CoRR abs/2212.11419 (2022)
- 2021
- [j28]Jacopo Castellini, Frans A. Oliehoek, Rahul Savani, Shimon Whiteson:
Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning. Auton. Agents Multi Agent Syst. 35(2): 25 (2021) - [j27]Luisa M. Zintgraf, Sebastian Schulze, Cong Lu, Leo Feng, Maximilian Igl, Kyriacos Shiarlis, Yarin Gal, Katja Hofmann, Shimon Whiteson:
VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning. J. Mach. Learn. Res. 22: 289:1-289:39 (2021) - [j26]Dmitrii Beloborodov, Alexander E. Ulanov, Jakob N. Foerster, Shimon Whiteson, A. I. Lvovsky:
Reinforcement learning enhanced quantum-inspired algorithm for combinatorial optimization. Mach. Learn. Sci. Technol. 2(2): 25009 (2021) - [c129]Shangtong Zhang, Bo Liu, Shimon Whiteson:
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning. AAAI 2021: 10905-10913 - [c128]Luisa M. Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann:
Deep Interactive Bayesian Reinforcement Learning via Meta-Learning. AAMAS 2021: 1712-1714 - [c127]Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung:
Facial Feedback for Reinforcement Learning: A Case Study and Offline Analysis Using the TAMER Framework. AAMAS 2021: 1735-1737 - [c126]Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang:
RODE: Learning Roles to Decompose Multi-Agent Tasks. ICLR 2021 - [c125]Maximilian Igl, Gregory Farquhar, Jelena Luketina, Wendelin Boehmer, Shimon Whiteson:
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning. ICLR 2021 - [c124]Vitaly Kurin, Maximilian Igl, Tim Rocktäschel, Wendelin Boehmer, Shimon Whiteson:
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control. ICLR 2021 - [c123]Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Boehmer, Shimon Whiteson:
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning. ICML 2021: 3930-3941 - [c122]Shariq Iqbal, Christian A. Schröder de Witt, Bei Peng, Wendelin Boehmer, Shimon Whiteson, Fei Sha:
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning. ICML 2021: 4596-4606 - [c121]Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar:
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning. ICML 2021: 7301-7312 - [c120]Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson:
Average-Reward Off-Policy Policy Evaluation with Function Approximation. ICML 2021: 12578-12588 - [c119]Shangtong Zhang, Hengshuai Yao, Shimon Whiteson:
Breaking the Deadly Triad with a Target Network. ICML 2021: 12621-12631 - [c118]Luisa M. Zintgraf, Leo Feng, Cong Lu, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson:
Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning. ICML 2021: 12991-13001 - [c117]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Deep Residual Reinforcement Learning (Extended Abstract). IJCAI 2021: 4869-4873 - [c116]Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson:
Regularized Softmax Deep Multi-Agent Q-Learning. NeurIPS 2021: 1365-1377 - [c115]Bei Peng, Tabish Rashid, Christian Schröder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Boehmer, Shimon Whiteson:
FACMAC: Factored Multi-Agent Centralised Policy Gradients. NeurIPS 2021: 12208-12221 - [c114]Mattie Fellows, Kristian Hartikainen, Shimon Whiteson:
Bayesian Bellman Operators. NeurIPS 2021: 13641-13656 - [c113]Charlie Blake, Vitaly Kurin, Maximilian Igl, Shimon Whiteson:
Snowflake: Scaling GNNs to high-dimensional continuous control via parameter freezing. NeurIPS 2021: 23983-23992 - [i84]Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson:
Average-Reward Off-Policy Policy Evaluation with Function Approximation. CoRR abs/2101.02808 (2021) - [i83]Luisa M. Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann:
Deep Interactive Bayesian Reinforcement Learning via Meta-Learning. CoRR abs/2101.03864 (2021) - [i82]Shangtong Zhang, Hengshuai Yao, Shimon Whiteson:
Breaking the Deadly Triad with a Target Network. CoRR abs/2101.08862 (2021) - [i81]Charlie Blake, Vitaly Kurin, Maximilian Igl, Shimon Whiteson:
Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing. CoRR abs/2103.01009 (2021) - [i80]Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson:
Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning. CoRR abs/2103.11883 (2021) - [i79]Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson:
Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients. CoRR abs/2104.13446 (2021) - [i78]Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar:
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning. CoRR abs/2106.00136 (2021) - [i77]Mingfei Sun, Anuj Mahajan, Katja Hofmann, Shimon Whiteson:
SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching. CoRR abs/2106.03155 (2021) - [i76]Matthew Fellows, Kristian Hartikainen, Shimon Whiteson:
Bayesian Bellman Operators. CoRR abs/2106.05012 (2021) - [i75]Samuel Sokota, Christian Schröder de Witt, Maximilian Igl, Luisa M. Zintgraf, Philip H. S. Torr, Shimon Whiteson, Jakob N. Foerster:
Implicit Communication as Minimum Entropy Coupling. CoRR abs/2107.08295 (2021) - [i74]Shangtong Zhang, Shimon Whiteson:
Truncated Emphatic Temporal Difference Methods for Prediction and Control. CoRR abs/2108.05338 (2021) - [i73]Pascal Van Der Vaart, Anuj Mahajan, Shimon Whiteson:
Model based Multi-agent Reinforcement Learning with Tensor Decompositions. CoRR abs/2110.14524 (2021) - [i72]Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar:
Reinforcement Learning in Factored Action Spaces using Tensor Decompositions. CoRR abs/2110.14538 (2021) - [i71]Zheng Xiong, Luisa M. Zintgraf, Jacob Beck, Risto Vuorio, Shimon Whiteson:
On the Practical Consistency of Meta-Reinforcement Learning Algorithms. CoRR abs/2112.00478 (2021) - [i70]Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson:
Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency. CoRR abs/2112.06054 (2021)
- 2020
- [j25]Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung:
Facial feedback for reinforcement learning: a case study and offline analysis using the TAMER framework. Auton. Agents Multi Agent Syst. 34(1): 22 (2020) - [j24]Kamil Ciosek, Shimon Whiteson:
Expected Policy Gradients for Reinforcement Learning. J. Mach. Learn. Res. 21: 52:1-52:51 (2020) - [j23]Supratik Paul, Konstantinos I. Chatzilygeroudis, Kamil Ciosek, Jean-Baptiste Mouret, Michael A. Osborne, Shimon Whiteson:
Robust Reinforcement Learning with Bayesian Optimisation and Quadrature. J. Mach. Learn. Res. 21: 151:1-151:31 (2020) - [j22]Tabish Rashid, Mikayel Samvelyan, Christian Schröder de Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson:
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. J. Mach. Learn. Res. 21: 178:1-178:51 (2020) - [c112]Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White:
Maximizing Information Gain in Partially Observable Environments via Prediction Rewards. AAMAS 2020: 1215-1223 - [c111]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Deep Residual Reinforcement Learning. AAMAS 2020: 1611-1619 - [c110]Tabish Rashid, Bei Peng, Wendelin Boehmer, Shimon Whiteson:
Optimistic Exploration even with a Pessimistic Initialisation. ICLR 2020 - [c109]Luisa M. Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson:
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning. ICLR 2020 - [c108]Wendelin Boehmer, Vitaly Kurin, Shimon Whiteson:
Deep Coordination Graphs. ICML 2020: 980-991 - [c107]Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve:
Growing Action Spaces. ICML 2020: 3040-3051 - [c106]Shangtong Zhang, Bo Liu, Shimon Whiteson:
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values. ICML 2020: 11194-11203 - [c105]Shangtong Zhang, Bo Liu, Hengshuai Yao, Shimon Whiteson:
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation. ICML 2020: 11204-11213 - [c104]Vitaly Kurin, Saad Godil, Shimon Whiteson, Bryan Catanzaro:
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? NeurIPS 2020 - [c103]Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson:
Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. NeurIPS 2020 - [c102]Shangtong Zhang, Vivek Veeriah, Shimon Whiteson:
Learning Retrospective Knowledge with Reverse Reinforcement Learning. NeurIPS 2020 - [c101]Maximilian Igl, Andrew Gambardella, Jinke He, Nantas Nardelli, N. Siddharth, Wendelin Boehmer, Shimon Whiteson:
Multitask Soft Option Learning. UAI 2020: 969-978 - [i69]Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung:
Facial Feedback for Reinforcement Learning: A Case Study and Offline Analysis Using the TAMER Framework. CoRR abs/2001.08703 (2020) - [i68]Shangtong Zhang, Bo Liu, Shimon Whiteson:
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values. CoRR abs/2001.11113 (2020) - [i67]Dmitrii Beloborodov, Alexander E. Ulanov, Jakob N. Foerster, Shimon Whiteson, A. I. Lvovsky:
Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization. CoRR abs/2002.04676 (2020) - [i66]Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson:
Optimistic Exploration even with a Pessimistic Initialisation. CoRR abs/2002.12174 (2020) - [i65]Christian Schröder de Witt, Bei Peng, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson:
Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control. CoRR abs/2003.06709 (2020) - [i64]Tabish Rashid, Mikayel Samvelyan, Christian Schröder de Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson:
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. CoRR abs/2003.08839 (2020) - [i63]Shangtong Zhang, Bo Liu, Shimon Whiteson:
Per-Step Reward: A New Perspective for Risk-Averse Reinforcement Learning. CoRR abs/2004.10888 (2020) - [i62]Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White:
Maximizing Information Gain in Partially Observable Environments via Prediction Reward. CoRR abs/2005.04912 (2020) - [i61]Pierre-Alexandre Kamienny, Kai Arulkumaran, Feryal M. P. Behbahani, Wendelin Boehmer, Shimon Whiteson:
Privileged Information Dropout in Reinforcement Learning. CoRR abs/2005.09220 (2020) - [i60]Shariq Iqbal, Christian A. Schröder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha:
AI-QMIX: Attention and Imagination for Dynamic Multi-Agent Reinforcement Learning. CoRR abs/2006.04222 (2020) - [i59]Maximilian Igl, Gregory Farquhar, Jelena Luketina, Wendelin Boehmer, Shimon Whiteson:
The Impact of Non-stationarity on Generalisation in Deep Reinforcement Learning. CoRR abs/2006.05826 (2020) - [i58]Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson:
Weighted QMIX: Expanding Monotonic Value Function Factorisation. CoRR abs/2006.10800 (2020) - [i57]Shangtong Zhang, Vivek Veeriah, Shimon Whiteson:
Learning Retrospective Knowledge with Reverse Reinforcement Learning. CoRR abs/2007.06703 (2020) - [i56]Minqi Jiang, Jelena Luketina, Nantas Nardelli, Pasquale Minervini, Philip H. S. Torr, Shimon Whiteson, Tim Rocktäschel:
WordCraft: An Environment for Benchmarking Commonsense Agents. CoRR abs/2007.09185 (2020) - [i55]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Matthijs T. J. Spaan:
Exploiting Submodular Value Functions For Scaling Up Active Perception. CoRR abs/2009.09696 (2020) - [i54]Luisa M. Zintgraf, Leo Feng, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson:
Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning. CoRR abs/2010.01062 (2020) - [i53]Shangtong Zhang, Romain Laroche, Harm van Seijen, Shimon Whiteson, Remi Tachet des Combes:
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms. CoRR abs/2010.01069 (2020) - [i52]Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang:
RODE: Learning Roles to Decompose Multi-Agent Tasks. CoRR abs/2010.01523 (2020) - [i51]Vitaly Kurin, Maximilian Igl, Tim Rocktäschel, Wendelin Boehmer, Shimon Whiteson:
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control. CoRR abs/2010.01856 (2020) - [i50]Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Böhmer, Shimon Whiteson:
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning. CoRR abs/2010.02974 (2020) - [i49]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Henri Bouma:
Real-Time Resource Allocation for Tracking Systems. CoRR abs/2010.03024 (2020) - [i48]Christian Schröder de Witt, Tarun Gupta, Denys Makoviichuk, Viktor Makoviychuk, Philip H. S. Torr, Mingfei Sun, Shimon Whiteson:
Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? CoRR abs/2011.09533 (2020)
2010 – 2019
- 2019
- [c100]Jacopo Castellini, Frans A. Oliehoek, Rahul Savani, Shimon Whiteson:
The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning. AAMAS 2019: 1862-1864 - [c99]Mikayel Samvelyan, Tabish Rashid, Christian Schröder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob N. Foerster, Shimon Whiteson:
The StarCraft Multi-Agent Challenge. AAMAS 2019: 2186-2188 - [c98]Alistair Letcher, Jakob N. Foerster, David Balduzzi, Tim Rocktäschel, Shimon Whiteson:
Stable Opponent Shaping in Differentiable Games. ICLR (Poster) 2019 - [c97]Jakob N. Foerster, H. Francis Song, Edward Hughes, Neil Burch, Iain Dunning, Shimon Whiteson, Matthew M. Botvinick, Michael Bowling:
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning. ICML 2019: 1942-1951 - [c96]Jingkai Mao, Jakob N. Foerster, Tim Rocktäschel, Maruan Al-Shedivat, Gregory Farquhar, Shimon Whiteson:
A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs. ICML 2019: 4343-4351 - [c95]Supratik Paul, Michael A. Osborne, Shimon Whiteson:
Fingerprint Policy Optimisation for Robust Reinforcement Learning. ICML 2019: 5082-5091 - [c94]Luisa M. Zintgraf, Kyriacos Shiarlis, Vitaly Kurin, Katja Hofmann, Shimon Whiteson:
Fast Context Adaptation via Meta-Learning. ICML 2019: 7693-7702 - [c93]Feryal M. P. Behbahani, Kyriacos Shiarlis, Xi Chen, Vitaly Kurin, Sudhanshu Kasewa, Ciprian Stirbu, João Gomes, Supratik Paul, Frans A. Oliehoek, João V. Messias, Shimon Whiteson:
Learning From Demonstration in the Wild. ICRA 2019: 775-781 - [c92]Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob N. Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel:
A Survey of Reinforcement Learning Informed by Natural Language. IJCAI 2019: 6309-6317 - [c91]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Generalized Off-Policy Actor-Critic. NeurIPS 2019: 1999-2009 - [c90]Shangtong Zhang, Shimon Whiteson:
DAC: The Double Actor-Critic Architecture for Learning Options. NeurIPS 2019: 2010-2020 - [c89]Supratik Paul, Vitaly Kurin, Shimon Whiteson:
Fast Efficient Hyperparameter Tuning for Policy Gradient Methods. NeurIPS 2019: 4618-4628 - [c88]Matthew Fellows, Anuj Mahajan, Tim G. J. Rudner, Shimon Whiteson:
VIREL: A Variational Inference Framework for Reinforcement Learning. NeurIPS 2019: 7120-7134 - [c87]Anuj Mahajan, Tabish Rashid, Mikayel Samvelyan, Shimon Whiteson:
MAVEN: Multi-Agent Variational Exploration. NeurIPS 2019: 7611-7622 - [c86]Gregory Farquhar, Shimon Whiteson, Jakob N. Foerster:
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning. NeurIPS 2019: 8149-8160 - [c85]Christian Schröder de Witt, Jakob N. Foerster, Gregory Farquhar, Philip H. S. Torr, Wendelin Boehmer, Shimon Whiteson:
Multi-Agent Common Knowledge Reinforcement Learning. NeurIPS 2019: 9924-9935 - [i47]Mikayel Samvelyan, Tabish Rashid, Christian Schröder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob N. Foerster, Shimon Whiteson:
The StarCraft Multi-Agent Challenge. CoRR abs/1902.04043 (2019) - [i46]Supratik Paul, Vitaly Kurin, Shimon Whiteson:
Fast Efficient Hyperparameter Tuning for Policy Gradients. CoRR abs/1902.06583 (2019) - [i45]Jacopo Castellini, Frans A. Oliehoek, Rahul Savani, Shimon Whiteson:
The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning. CoRR abs/1902.07497 (2019) - [i44]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Generalized Off-Policy Actor-Critic. CoRR abs/1903.11329 (2019) - [i43]Maximilian Igl, Andrew Gambardella, Nantas Nardelli, N. Siddharth, Wendelin Böhmer, Shimon Whiteson:
Multitask Soft Option Learning. CoRR abs/1904.01033 (2019) - [i42]Shangtong Zhang, Shimon Whiteson:
DAC: The Double Actor-Critic Architecture for Learning Options. CoRR abs/1904.12691 (2019) - [i41]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Deep Residual Reinforcement Learning. CoRR abs/1905.01072 (2019) - [i40]Wendelin Böhmer, Tabish Rashid, Shimon Whiteson:
Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning. CoRR abs/1906.02138 (2019) - [i39]Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob N. Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel:
A Survey of Reinforcement Learning Informed by Natural Language. CoRR abs/1906.03926 (2019) - [i38]Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve:
Growing Action Spaces. CoRR abs/1906.12266 (2019) - [i37]Gregory Farquhar, Shimon Whiteson, Jakob N. Foerster:
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning. CoRR abs/1909.10549 (2019) - [i36]Vitaly Kurin, Saad Godil, Shimon Whiteson, Bryan Catanzaro:
Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning. CoRR abs/1909.11830 (2019) - [i35]Wendelin Böhmer, Vitaly Kurin, Shimon Whiteson:
Deep Coordination Graphs. CoRR abs/1910.00091 (2019) - [i34]Anuj Mahajan, Tabish Rashid, Mikayel Samvelyan, Shimon Whiteson:
MAVEN: Multi-Agent Variational Exploration. CoRR abs/1910.07483 (2019) - [i33]Luisa M. Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson:
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning. CoRR abs/1910.08348 (2019) - [i32]Shangtong Zhang, Bo Liu, Hengshuai Yao, Shimon Whiteson:
Provably Convergent Off-Policy Actor-Critic with Function Approximation. CoRR abs/1911.04384 (2019) - [i31]Leo Feng, Luisa M. Zintgraf, Bei Peng, Shimon Whiteson:
VIABLE: Fast Adaptation via Backpropagating Learned Loss. CoRR abs/1911.13159 (2019) - 2018
- [j21]Guangliang Li, Shimon Whiteson, W. Bradley Knox, Hayley Hung:
Social interaction for efficient agent learning from human reward. Auton. Agents Multi Agent Syst. 32(1): 1-25 (2018) - [j20]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Matthijs T. J. Spaan:
Exploiting submodular value functions for scaling up active perception. Auton. Robots 42(2): 209-233 (2018) - [c84]Kamil Ciosek, Shimon Whiteson:
Expected Policy Gradients. AAAI 2018: 2868-2875 - [c83]Jakob N. Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli, Shimon Whiteson:
Counterfactual Multi-Agent Policy Gradients. AAAI 2018: 2974-2982 - [c82]Supratik Paul, Konstantinos I. Chatzilygeroudis, Kamil Ciosek, Jean-Baptiste Mouret, Michael A. Osborne, Shimon Whiteson:
Alternating Optimisation and Quadrature for Robust Control. AAAI 2018: 3925-3933 - [c81]Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch:
Learning with Opponent-Learning Awareness. AAMAS 2018: 122-130 - [c80]Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson:
TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning. ICLR (Poster) 2018 - [c79]Jakob N. Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson:
DiCE: The Infinitely Differentiable Monte-Carlo Estimator. ICLR (Workshop) 2018 - [c78]Matthew Fellows, Kamil Ciosek, Shimon Whiteson:
Fourier Policy Gradients. ICML 2018: 1485-1494 - [c77]Jakob N. Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson:
DiCE: The Infinitely Differentiable Monte Carlo Estimator. ICML 2018: 1524-1533 - [c76]Maximilian Igl, Luisa M. Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson:
Deep Variational Reinforcement Learning for POMDPs. ICML 2018: 2122-2131 - [c75]Tabish Rashid, Mikayel Samvelyan, Christian Schröder de Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson:
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. ICML 2018: 4292-4301 - [c74]Kyriacos Shiarlis, Markus Wulfmeier, Sasha Salter, Shimon Whiteson, Ingmar Posner:
TACO: Learning Task Decomposition via Temporal Alignment for Control. ICML 2018: 4661-4670 - [i30]Kamil Ciosek, Shimon Whiteson:
Expected Policy Gradients for Reinforcement Learning. CoRR abs/1801.03326 (2018) - [i29]Jakob N. Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson:
DiCE: The Infinitely Differentiable Monte-Carlo Estimator. CoRR abs/1802.05098 (2018) - [i28]Matthew Fellows, Kamil Ciosek, Shimon Whiteson:
Fourier Policy Gradients. CoRR abs/1802.06891 (2018) - [i27]Kyriacos Shiarlis, Markus Wulfmeier, Sasha Salter, Shimon Whiteson, Ingmar Posner:
TACO: Learning Task Decomposition via Temporal Alignment for Control. CoRR abs/1803.01840 (2018) - [i26]Tabish Rashid, Mikayel Samvelyan, Christian Schröder de Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson:
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. CoRR abs/1803.11485 (2018) - [i25]Supratik Paul, Michael A. Osborne, Shimon Whiteson:
Contextual Policy Optimisation. CoRR abs/1805.10662 (2018) - [i24]Maximilian Igl, Luisa M. Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson:
Deep Variational Reinforcement Learning for POMDPs. CoRR abs/1806.02426 (2018) - [i23]Luisa M. Zintgraf, Kyriacos Shiarlis, Vitaly Kurin, Katja Hofmann, Shimon Whiteson:
CAML: Fast Context Adaptation via Meta-Learning. CoRR abs/1810.03642 (2018) - [i22]Jakob N. Foerster, Christian A. Schröder de Witt, Gregory Farquhar, Philip H. S. Torr, Wendelin Boehmer, Shimon Whiteson:
Multi-Agent Common Knowledge Reinforcement Learning. CoRR abs/1810.11702 (2018) - [i21]Matthew Fellows, Anuj Mahajan, Tim G. J. Rudner, Shimon Whiteson:
VIREL: A Variational Inference Framework for Reinforcement Learning. CoRR abs/1811.01132 (2018) - [i20]Jakob N. Foerster, H. Francis Song, Edward Hughes, Neil Burch, Iain Dunning, Shimon Whiteson, Matthew M. Botvinick, Michael Bowling:
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning. CoRR abs/1811.01458 (2018) - [i19]Feryal M. P. Behbahani, Kyriacos Shiarlis, Xi Chen, Vitaly Kurin, Sudhanshu Kasewa, Ciprian Stirbu, João Gomes, Supratik Paul, Frans A. Oliehoek, João V. Messias, Shimon Whiteson:
Learning from Demonstration in the Wild. CoRR abs/1811.03516 (2018) - [i18]Alistair Letcher, Jakob N. Foerster, David Balduzzi, Tim Rocktäschel, Shimon Whiteson:
Stable Opponent Shaping in Differentiable Games. CoRR abs/1811.08469 (2018) - 2017
- [b2]Diederik M. Roijers, Shimon Whiteson:
Multi-Objective Decision Making. Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan & Claypool Publishers 2017, ISBN 978-3-031-00448-3 - [c73]Kamil Andrzej Ciosek, Shimon Whiteson:
OFFER: Off-Environment Reinforcement Learning. AAAI 2017: 1819-1825 - [c72]Shimon Whiteson:
Intro to Reinforcement Learning. BMVC 2017 - [c71]Jakob N. Foerster, Nantas Nardelli, Gregory Farquhar, Triantafyllos Afouras, Philip H. S. Torr, Pushmeet Kohli, Shimon Whiteson:
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning. ICML 2017: 1146-1155 - [c70]Kyriacos Shiarlis, João V. Messias, Shimon Whiteson:
Rapidly exploring learning trees. ICRA 2017: 1541-1548 - [c69]Kyriacos Shiarlis, João V. Messias, Shimon Whiteson:
Acquiring social interaction behaviours for telepresence robots via deep learning from demonstration. IROS 2017: 37-42 - [c68]João V. Messias, Shimon Whiteson:
Dynamic-Depth Context Tree Weighting. NIPS 2017: 3328-3337 - [c67]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Henri Bouma:
Real-Time Resource Allocation for Tracking Systems. UAI 2017 - [i17]Jakob N. Foerster, Nantas Nardelli, Gregory Farquhar, Philip H. S. Torr, Pushmeet Kohli, Shimon Whiteson:
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning. CoRR abs/1702.08887 (2017) - [i16]Jakob N. Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli, Shimon Whiteson:
Counterfactual Multi-Agent Policy Gradients. CoRR abs/1705.08926 (2017) - [i15]Kamil Ciosek, Shimon Whiteson:
Expected Policy Gradients. CoRR abs/1706.05374 (2017) - [i14]Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch:
Learning with Opponent-Learning Awareness. CoRR abs/1709.04326 (2017) - [i13]Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson:
TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning. CoRR abs/1710.11417 (2017) - 2016
- [j19]Guangliang Li, Shimon Whiteson, W. Bradley Knox, Hayley Hung:
Using informative behavior to increase engagement while learning from human reward. Auton. Agents Multi Agent Syst. 30(5): 826-848 (2016) - [c66]Kyriacos Shiarlis, João V. Messias, Shimon Whiteson:
Inverse Reinforcement Learning from Failure. AAMAS 2016: 1060-1068 - [c65]Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung:
Towards Learning from Implicit Human Reward: (Extended Abstract). AAMAS 2016: 1353-1354 - [c64]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek:
Probably Approximately Correct Greedy Maximization: (Extended Abstract). AAMAS 2016: 1387-1388 - [c63]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek:
PAC Greedy Maximization with Efficient Bounds on Information Gain for Sensor Selection. IJCAI 2016: 3220-3227 - [c62]Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson:
Learning to Communicate with Deep Multi-Agent Reinforcement Learning. NIPS 2016: 2137-2145 - [c61]Anne Schuth, Harrie Oosterhuis, Shimon Whiteson, Maarten de Rijke:
Multileave Gradient Descent for Fast Online Learning to Rank. WSDM 2016: 457-466 - [i12]Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson:
Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks. CoRR abs/1602.02672 (2016) - [i11]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek:
Probably Approximately Correct Greedy Maximization. CoRR abs/1602.07860 (2016) - [i10]Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson:
Learning to Communicate with Deep Multi-Agent Reinforcement Learning. CoRR abs/1605.06676 (2016) - [i9]Supratik Paul, Kamil Ciosek, Michael A. Osborne, Shimon Whiteson:
Alternating Optimisation and Quadrature for Robust Reinforcement Learning. CoRR abs/1605.07496 (2016) - [i8]Hossam Mossalam, Yannis M. Assael, Diederik M. Roijers, Shimon Whiteson:
Multi-Objective Deep Reinforcement Learning. CoRR abs/1610.02707 (2016) - [i7]Yannis M. Assael, Brendan Shillingford, Shimon Whiteson, Nando de Freitas:
LipNet: Sentence-level Lipreading. CoRR abs/1611.01599 (2016) - 2015
- [j18]Diederik Marijn Roijers, Shimon Whiteson, Frans A. Oliehoek:
Computing Convex Coverage Sets for Faster Multi-objective Coordination. J. Artif. Intell. Res. 52: 399-443 (2015) - [c60]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek:
Exploiting Submodular Value Functions for Faster Dynamic Sensor Selection. AAAI 2015: 3356-3363 - [c59]Guangliang Li, Hayley Hung, Shimon Whiteson:
A Large-Scale Study of Agents Learning from Human Reward. AAMAS 2015: 1771-1772 - [c58]Chiel Kooijman, Maarten de Waard, Maarten Inja, Diederik M. Roijers, Shimon Whiteson:
Pareto Local Search for MOMDP Planning. ESANN 2015 - [c57]Diederik Marijn Roijers, Shimon Whiteson, Frans A. Oliehoek:
Point-Based Planning for Multi-Objective POMDPs. IJCAI 2015: 1666-1672 - [c56]Masrour Zoghi, Zohar S. Karnin, Shimon Whiteson, Maarten de Rijke:
Copeland Dueling Bandits. NIPS 2015: 307-315 - [c55]Artem Grotov, Shimon Whiteson, Maarten de Rijke:
Bayesian Ranker Comparison Based on Historical User Interactions. SIGIR 2015: 273-282 - [c54]Masrour Zoghi, Shimon Whiteson, Maarten de Rijke:
MergeRUCB: A Method for Large-Scale Online Ranker Evaluation. WSDM 2015: 17-26 - [i6]Masrour Zoghi, Zohar Shay Karnin, Shimon Whiteson, Maarten de Rijke:
Copeland Dueling Bandits. CoRR abs/1506.00312 (2015) - 2014
- [j17]Matthijs Snel, Shimon Whiteson:
Learning potential functions and their representations for multi-task reinforcement learning. Auton. Agents Multi Agent Syst. 28(4): 637-681 (2014) - [j16]Harm van Seijen, Shimon Whiteson, Leon J. H. M. Kester:
Efficient Abstraction Selection in Reinforcement Learning. Comput. Intell. 30(4): 657-699 (2014) - [j15]Katja Hofmann, Shimon Whiteson, Anne Schuth, Maarten de Rijke:
"Learning to rank for information retrieval from user interactions" by K. Hofmann, S. Whiteson, A. Schuth, and M. de Rijke with Martin Vesely as coordinator. SIGWEB Newsl. 2014(Spring): 5:1-5:7 (2014) - [c53]Paris Mavromoustakos Blom, Sander Bakkes, Chek Tien Tan, Shimon Whiteson, Diederik M. Roijers, Roberto Valenti, Theo Gevers:
Towards Personalised Gaming via Facial Expression Recognition. AIIDE 2014 - [c52]Diederik Marijn Roijers, Joris Scharpff, Matthijs T. J. Spaan, Frans A. Oliehoek, Mathijs de Weerdt, Shimon Whiteson:
Bounded Approximations for Linear Multi-Objective Planning Under Uncertainty. ICAPS 2014 - [c51]Diederik M. Roijers, Shimon Whiteson, Frans A. Oliehoek:
Linear support for multi-objective coordination graphs. AAMAS 2014: 1297-1304 - [c50]Guangliang Li, Hayley Hung, Shimon Whiteson, W. Bradley Knox:
Leveraging social networks to motivate humans to train agents. AAMAS 2014: 1571-1572 - [c49]Anne Schuth, Floor Sietsma, Shimon Whiteson, Damien Lefortier, Maarten de Rijke:
Multileaved Comparisons for Fast Online Evaluation. CIKM 2014: 71-80 - [c48]Anne Schuth, Floor Sietsma, Shimon Whiteson, Maarten de Rijke:
Optimizing Base Rankers Using Clicks - A Case Study Using BM25. ECIR 2014: 75-87 - [c47]Sander Bakkes, Shimon Whiteson:
Design criteria for challenge balancing of personalised game spaces. FDG 2014 - [c46]Sander Bakkes, Shimon Whiteson, Guangliang Li, George Viorel Visniuc, Efstathios Charitos, Norbert Heijne, Arjen Swellengrebel:
Challenge balancing for personalised game spaces. GEM 2014: 1-8 - [c45]Guangliang Li, Hayley Hung, Shimon Whiteson, W. Bradley Knox:
Learning from human reward benefits from socio-competitive feedback. ICDL-EPIROB 2014: 93-100 - [c44]Masrour Zoghi, Shimon Whiteson, Rémi Munos, Maarten de Rijke:
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem. ICML 2014: 10-18 - [c43]Maarten Inja, Chiel Kooijman, Maarten de Waard, Diederik M. Roijers, Shimon Whiteson:
Queued Pareto Local Search for Multi-Objective Optimization. PPSN 2014: 589-599 - [c42]Masrour Zoghi, Shimon Whiteson, Maarten de Rijke, Rémi Munos:
Relative confidence sampling for efficient on-line ranker evaluation. WSDM 2014: 73-82 - [i5]Frans Adriaan Oliehoek, Matthijs T. J. Spaan, Christopher Amato, Shimon Whiteson:
Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs. CoRR abs/1402.0566 (2014) - [i4]Diederik Marijn Roijers, Peter Vamplew, Shimon Whiteson, Richard Dazeley:
A Survey of Multi-Objective Sequential Decision-Making. CoRR abs/1402.0590 (2014) - 2013
- [j14]Katja Hofmann, Shimon Whiteson, Maarten de Rijke:
Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval. Inf. Retr. 16(1): 63-90 (2013) - [j13]Frans A. Oliehoek, Matthijs T. J. Spaan, Christopher Amato, Shimon Whiteson:
Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs. J. Artif. Intell. Res. 46: 449-509 (2013) - [j12]Diederik M. Roijers, Peter Vamplew, Shimon Whiteson, Richard Dazeley:
A Survey of Multi-Objective Sequential Decision-Making. J. Artif. Intell. Res. 48: 67-113 (2013) - [j11]Katja Hofmann, Shimon Whiteson, Maarten de Rijke:
Fidelity, Soundness, and Efficiency of Interleaved Comparison Methods. ACM Trans. Inf. Syst. 31(4): 17:1-17:43 (2013) - [c41]Diederik M. Roijers, Shimon Whiteson, Frans A. Oliehoek:
Computing Convex Coverage Sets for Multi-objective Coordination Graphs. ADT 2013: 309-323 - [c40]Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan:
Approximate solutions for factored Dec-POMDPs with many agents. AAMAS 2013: 563-570 - [c39]Guangliang Li, Hayley Hung, Shimon Whiteson, W. Bradley Knox:
Using informative behavior to increase engagement in the tamer framework. AAMAS 2013: 909-916 - [c38]Diederik M. Roijers, Shimon Whiteson, Frans A. Oliehoek:
Multi-objective variable elimination for collaborative graphical games. AAMAS 2013: 1209-1210 - [c37]Anne Schuth, Katja Hofmann, Shimon Whiteson, Maarten de Rijke:
Lerot: an online learning to rank framework. LivingLab@CIKM 2013: 23-26 - [c36]Katja Hofmann, Anne Schuth, Shimon Whiteson, Maarten de Rijke:
Reusing Historical Interaction Data for Faster Online Learning to Rank for IR. DIR 2013: 30-31 - [c35]Thomas G. van den Berg, Shimon Whiteson:
Critical factors in the performance of hyperNEAT. GECCO 2013: 759-766 - [c34]Harm van Seijen, Shimon Whiteson, Leon J. H. M. Kester:
Efficient Abstraction Selection in Reinforcement Learning (Extended Abstract). SARA 2013 - [c33]Katja Hofmann, Anne Schuth, Shimon Whiteson, Maarten de Rijke:
Reusing historical interaction data for faster online learning to rank for IR. WSDM 2013: 183-192 - [i3]Masrour Zoghi, Shimon Whiteson, Rémi Munos, Maarten de Rijke:
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem. CoRR abs/1312.3393 (2013) - 2012
- [c32]Karun Rao, Shimon Whiteson:
V-MAX: tempered optimism for better PAC reinforcement learning. AAMAS 2012: 375-382 - [c31]Katja Hofmann, Shimon Whiteson, Maarten de Rijke:
Estimating interleaved comparison outcomes from historical click data. CIKM 2012: 1779-1783 - [c30]Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan:
Exploiting Structure in Cooperative Bayesian Games. UAI 2012: 654-665 - [p3]Shimon Whiteson:
Evolutionary Computation for Reinforcement Learning. Reinforcement Learning 2012: 325-355 - [i2]Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan:
Exploiting Structure in Cooperative Bayesian Games. CoRR abs/1210.4886 (2012) - 2011
- [j10]Rogier Koppejan, Shimon Whiteson:
Neuroevolutionary reinforcement learning for generalized control of simulated helicopters. Evol. Intell. 4(4): 219-241 (2011) - [j9]Harm van Seijen, Shimon Whiteson, Hado van Hasselt, Marco A. Wiering:
Exploiting Best-Match Equations for Efficient Reinforcement Learning. J. Mach. Learn. Res. 12: 2045-2094 (2011) - [j8]Shimon Whiteson, Michael L. Littman:
Introduction to the special issue on empirical evaluations in reinforcement learning. Mach. Learn. 84(1-2): 1-6 (2011) - [c29]Shimon Whiteson, Brian Tanner, Matthew E. Taylor, Peter Stone:
Protecting against evaluation overfitting in empirical reinforcement learning. ADPRL 2011: 120-127 - [c28]Katja Hofmann, Shimon Whiteson, Maarten de Rijke:
A probabilistic method for inferring preferences from clicks. CIKM 2011: 249-258 - [c27]Katja Hofmann, Shimon Whiteson, Maarten de Rijke:
Balancing Exploration and Exploitation in Learning to Rank Online. ECIR 2011: 251-263 - [c26]Matthijs Snel, Shimon Whiteson:
Multi-Task Reinforcement Learning: Shaping and Feature Selection. EWRL 2011: 237-248 - [c25]Steijn Kistemaker, Shimon Whiteson:
Critical factors in the performance of novelty search. GECCO 2011: 965-972 - [c24]Matthijs Snel, Shimon Whiteson, Yasuo Kuniyoshi:
Robust central pattern generators for embodied hierarchical reinforcement learning. ICDL-EPIROB 2011: 1-6 - [c23]Katja Hofmann, Shimon Whiteson, Maarten de Rijke:
Adapting Rankers Online. IRFC 2011: 1-2 - [i1]Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan:
Exploiting Agent and Type Independence in Collaborative Graphical Bayesian Games. CoRR abs/1108.0404 (2011) - 2010
- [b1]Shimon Whiteson:
Adaptive Representations for Reinforcement Learning. Studies in Computational Intelligence 291, Springer 2010, ISBN 978-3-642-13931-4, pp. 1-104 - [j7]Shimon Whiteson, Matthew E. Taylor, Peter Stone:
Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning. Auton. Agents Multi Agent Syst. 21(1): 1-35 (2010) - [j6]Shimon Whiteson, Brian Tanner, Adam White:
Report on the 2008 Reinforcement Learning Competition. AI Mag. 31(2): 81-94 (2010) - [c22]Matthijs Snel, Shimon Whiteson:
Multi-task evolutionary shaping without pre-specified representations. GECCO 2010: 1031-1038 - [p2]Harm van Seijen, Shimon Whiteson, Leon J. H. M. Kester:
Switching between Representations in Reinforcement Learning. Interactive Collaborative Information Systems 2010: 65-84 - [p1]Bram Bakker, Shimon Whiteson, Leon J. H. M. Kester, Frans C. A. Groen:
Traffic Light Control by Multiagent Reinforcement Learning Systems. Interactive Collaborative Information Systems 2010: 475-510
2000 – 2009
- 2009
- [j5]Shimon Whiteson, Daniel Whiteson:
Machine learning for event selection in high energy physics. Eng. Appl. Artif. Intell. 22(8): 1203-1217 (2009) - [c21]Harm van Seijen, Hado van Hasselt, Shimon Whiteson, Marco A. Wiering:
A theoretical and empirical analysis of Expected Sarsa. ADPRL 2009: 177-184 - [c20]Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan:
Lossless clustering of histories in decentralized POMDPs. AAMAS (1) 2009: 577-584 - [c19]Corrado Grappiolo, Shimon Whiteson, Gregor Pavlin, Bram Bakker:
Integrating distributed Bayesian inference and reinforcement learning for sensor management. FUSION 2009: 93-101 - [c18]Rogier Koppejan, Shimon Whiteson:
Neuroevolutionary reinforcement learning for generalized helicopter control. GECCO 2009: 145-152 - [c17]Mark Kroon, Shimon Whiteson:
Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs. ICMLA 2009: 324-330 - [c16]Harm van Seijen, Shimon Whiteson:
Postponed Updates for Temporal-Difference Reinforcement Learning. ISDA 2009: 665-672 - 2008
- [c15]Frans A. Oliehoek, Matthijs T. J. Spaan, Shimon Whiteson, Nikos Vlassis:
Exploiting locality of interaction in factored Dec-POMDPs. AAMAS (1) 2008: 517-524 - [c14]Lior Kuyer, Shimon Whiteson, Bram Bakker, Nikos Vlassis:
Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs. ECML/PKDD (1) 2008: 656-671 - 2007
- [j4]Shimon Whiteson, Matthew E. Taylor, Peter Stone:
Empirical Studies in Action Selection with Reinforcement Learning. Adapt. Behav. 15(1): 33-50 (2007) - [c13]Matthew E. Taylor, Shimon Whiteson, Peter Stone:
Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison. AAAI 2007: 1675-1678 - [c12]Shimon Whiteson, Daniel Whiteson:
Stochastic Optimization for Collision Selection in High Energy Physics. AAAI 2007: 1819-1825 - [c11]Matthew E. Taylor, Shimon Whiteson, Peter Stone:
Transfer via inter-task mappings in policy search reinforcement learning. AAMAS 2007: 37 - 2006
- [j3]Shimon Whiteson, Peter Stone:
Evolutionary Function Approximation for Reinforcement Learning. J. Mach. Learn. Res. 7: 877-917 (2006) - [c10]Shimon Whiteson, Peter Stone:
Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning. AAAI 2006: 518-523 - [c9]Matthew E. Taylor, Shimon Whiteson, Peter Stone:
Comparing evolutionary and temporal difference methods in a reinforcement learning domain. GECCO 2006: 1321-1328 - [c8]Shimon Whiteson, Peter Stone:
On-line evolutionary computation for reinforcement learning in stochastic domains. GECCO 2006: 1577-1584 - 2005
- [j2]Shimon Whiteson, Nate Kohl, Risto Miikkulainen, Peter Stone:
Evolving Soccer Keepaway Players Through Task Decomposition. Mach. Learn. 59(1-2): 5-30 (2005) - [c7]Shimon Whiteson:
Improving Reinforcement Learning Function Approximators via Neuroevolution. AAAI 2005: 1666-1667 - [c6]Shimon Whiteson:
Improving reinforcement learning function approximators via neuroevolution. AAMAS 2005: 1386 - [c5]Shimon Whiteson, Peter Stone, Kenneth O. Stanley, Risto Miikkulainen, Nate Kohl:
Automatic feature selection in neuroevolution. GECCO 2005: 1225-1232 - 2004
- [j1]Shimon Whiteson, Peter Stone:
Adaptive job routing and scheduling. Eng. Appl. Artif. Intell. 17(7): 855-869 (2004) - [c4]Shimon Whiteson, Peter Stone:
Towards Autonomic Computing: Adaptive Job Routing and Scheduling. AAAI 2004: 916-922 - [c3]Shimon Whiteson, Peter Stone:
Towards Autonomic Computing: Adaptive Network Routing and Scheduling. ICAC 2004: 286-287 - 2003
- [c2]Shimon Whiteson, Peter Stone:
Concurrent layered learning. AAMAS 2003: 193-200 - [c1]Shimon Whiteson, Nate Kohl, Risto Miikkulainen, Peter Stone:
Evolving Keepaway Soccer Players through Task Decomposition. GECCO 2003: 356-368
last updated on 2024-12-08 02:23 CET by the dblp team
all metadata released as open data under CC0 1.0 license