Olivier Pietquin
Person information
- affiliation: Google DeepMind
- affiliation: University Lille 1, France
2020 – today
- 2024
- [c152]Kai Cui, Gökçe Dayanikli, Mathieu Laurière, Matthieu Geist, Olivier Pietquin, Heinz Koeppl:
Learning Discrete-Time Major-Minor Mean Field Games. AAAI 2024: 9616-9625 - [c151]Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker:
Back to Basics: Revisiting REINFORCE-Style Optimization for Learning from Human Feedback in LLMs. ACL (1) 2024: 12248-12267 - [c150]Mathieu Rita, Florian Strub, Rahma Chaabouni, Paul Michel, Emmanuel Dupoux, Olivier Pietquin:
Countering Reward Over-Optimization in LLM with Demonstration-Guided Reinforcement Learning. ACL (Findings) 2024: 12447-12472 - [c149]Zida Wu, Mathieu Laurière, Samuel Jia Cong Chua, Matthieu Geist, Olivier Pietquin, Ankur Mehta:
Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning. AAMAS 2024: 2561-2563 - [c148]Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Bill Wu, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist:
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. EMNLP 2024: 21353-21370 - [c147]Geoffrey Cideron, Sertan Girgin, Mauro Verzetti, Damien Vincent, Matej Kastelic, Zalán Borsos, Brian McWilliams, Victor Ungureanu, Olivier Bachem, Olivier Pietquin, Matthieu Geist, Léonard Hussenot, Neil Zeghidour, Andrea Agostinelli:
MusicRL: Aligning Music Generation to Human Preferences. ICML 2024 - [i86]Geoffrey Cideron, Sertan Girgin, Mauro Verzetti, Damien Vincent, Matej Kastelic, Zalán Borsos, Brian McWilliams, Victor Ungureanu, Olivier Bachem, Olivier Pietquin, Matthieu Geist, Léonard Hussenot, Neil Zeghidour, Andrea Agostinelli:
MusicRL: Aligning Music Generation to Human Preferences. CoRR abs/2402.04229 (2024) - [i85]Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker:
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs. CoRR abs/2402.14740 (2024) - [i84]Zida Wu, Mathieu Laurière, Samuel Jia Cong Chua, Matthieu Geist, Olivier Pietquin, Ankur Mehta:
Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning. CoRR abs/2403.03552 (2024) - [i83]Mathieu Rita, Paul Michel, Rahma Chaabouni, Olivier Pietquin, Emmanuel Dupoux, Florian Strub:
Language Evolution with Deep Learning. CoRR abs/2403.11958 (2024) - [i82]Mathieu Rita, Florian Strub, Rahma Chaabouni, Paul Michel, Emmanuel Dupoux, Olivier Pietquin:
Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning. CoRR abs/2404.19409 (2024) - [i81]Eugene Choi, Arash Ahmadian, Matthieu Geist, Olivier Pietquin, Mohammad Gheshlaghi Azar:
Self-Improving Robust Preference Optimization. CoRR abs/2406.01660 (2024) - [i80]Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist:
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. CoRR abs/2406.19185 (2024) - [i79]Nathan Grinsztajn, Yannis Flet-Berliac, Mohammad Gheshlaghi Azar, Florian Strub, Bill Wu, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Olivier Pietquin, Matthieu Geist:
Averaging log-likelihoods in direct alignment. CoRR abs/2406.19188 (2024) - [i78]Marius Miron, Sara Keen, Jen-Yu Liu, Benjamin Hoffman, Masato Hagiwara, Olivier Pietquin, Felix Effenberger, Maddie Cusimano:
Biodenoising: animal vocalization denoising without access to clean data. CoRR abs/2410.03427 (2024) - 2023
- [j16]Eugene Kharitonov, Damien Vincent, Zalán Borsos, Raphaël Marinier, Sertan Girgin, Olivier Pietquin, Matt Sharifi, Marco Tagliasacchi, Neil Zeghidour:
Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision. Trans. Assoc. Comput. Linguistics 11: 1703-1718 (2023) - [j15]Zalán Borsos, Raphaël Marinier, Damien Vincent, Eugene Kharitonov, Olivier Pietquin, Matthew Sharifi, Dominik Roblek, Olivier Teboul, David Grangier, Marco Tagliasacchi, Neil Zeghidour:
AudioLM: A Language Modeling Approach to Audio Generation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2523-2533 (2023) - [c146]Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos Garea, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor:
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback. ACL (1) 2023: 6252-6272 - [c145]Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML 2023: 17135-17175 - [c144]Giorgia Ramponi, Pavel Kolev, Olivier Pietquin, Niao He, Mathieu Laurière, Matthieu Geist:
On Imitation in Mean-field Games. NeurIPS 2023 - [i77]Chris Donahue, Antoine Caillon, Adam Roberts, Ethan Manilow, Philippe Esling, Andrea Agostinelli, Mauro Verzetti, Ian Simon, Olivier Pietquin, Neil Zeghidour, Jesse H. Engel:
SingSong: Generating musical accompaniments from singing. CoRR abs/2301.12662 (2023) - [i76]Eugene Kharitonov, Damien Vincent, Zalán Borsos, Raphaël Marinier, Sertan Girgin, Olivier Pietquin, Matthew Sharifi, Marco Tagliasacchi, Neil Zeghidour:
Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision. CoRR abs/2302.03540 (2023) - [i75]Geoffrey Cideron, Baruch Tabanpour, Sebastian Curi, Sertan Girgin, Léonard Hussenot, Gabriel Dulac-Arnold, Matthieu Geist, Olivier Pietquin, Robert Dadashi:
Get Back Here: Robust Imitation by Return-to-Distribution Planning. CoRR abs/2305.01400 (2023) - [i74]Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023) - [i73]Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor:
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback. CoRR abs/2306.00186 (2023) - [i72]Giorgia Ramponi, Pavel Kolev, Olivier Pietquin, Niao He, Mathieu Laurière, Matthieu Geist:
On Imitation in Mean-field Games. CoRR abs/2306.14799 (2023) - [i71]Kai Cui, Gökçe Dayanikli, Mathieu Laurière, Matthieu Geist, Olivier Pietquin, Heinz Koeppl:
Learning Discrete-Time Major-Minor Mean Field Games. CoRR abs/2312.10787 (2023) - 2022
- [c143]Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, Matthieu Geist:
Offline Reinforcement Learning as Anti-exploration. AAAI 2022: 8106-8114 - [c142]Sarah Perrin, Mathieu Laurière, Julien Pérolat, Romuald Élie, Matthieu Geist, Olivier Pietquin:
Generalization in Mean Field Games by Learning Master Policies. AAAI 2022: 9413-9421 - [c141]Nino Vieillard, Marcin Andrychowicz, Anton Raichuk, Olivier Pietquin, Matthieu Geist:
Implicitly Regularized RL with Implicit Q-values. AISTATS 2022: 1380-1402 - [c140]Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin:
Concave Utility Reinforcement Learning: The Mean-field Game Viewpoint. AAMAS 2022: 489-497 - [c139]Alexis Jacq, Johan Ferret, Olivier Pietquin, Matthieu Geist:
Lazy-MDPs: Towards Interpretable RL by Learning When to Act. AAMAS 2022: 669-677 - [c138]Paul Muller, Mark Rowland, Romuald Elie, Georgios Piliouras, Julien Pérolat, Mathieu Laurière, Raphaël Marinier, Olivier Pietquin, Karl Tuyls:
Learning Equilibria in Mean-Field Games: Introducing Mean-Field PSRO. AAMAS 2022: 926-934 - [c137]Julien Pérolat, Sarah Perrin, Romuald Elie, Mathieu Laurière, Georgios Piliouras, Matthieu Geist, Karl Tuyls, Olivier Pietquin:
Scaling Mean Field Games by Online Mirror Descent. AAMAS 2022: 1028-1037 - [c136]Theophile Cabannes, Mathieu Laurière, Julien Pérolat, Raphaël Marinier, Sertan Girgin, Sarah Perrin, Olivier Pietquin, Alexandre M. Bayen, Eric Goubault, Romuald Elie:
Solving N-Player Dynamic Routing Games with Congestion: A Mean-Field Approach. AAMAS 2022: 1557-1559 - [c135]Mathieu Rita, Florian Strub, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux:
On the role of population heterogeneity in emergent communication. ICLR 2022 - [c134]Robert Dadashi, Léonard Hussenot, Damien Vincent, Sertan Girgin, Anton Raichuk, Matthieu Geist, Olivier Pietquin:
Continuous Control with Action Quantization from Demonstrations. ICML 2022: 4537-4557 - [c133]Mathieu Laurière, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Theophile Cabannes, Georgios Piliouras, Julien Pérolat, Romuald Elie, Olivier Pietquin, Matthieu Geist:
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games. ICML 2022: 12078-12095 - [c132]Alice Martin, Guillaume Quispe, Charles Ollion, Sylvain Le Corff, Florian Strub, Olivier Pietquin:
Learning Natural Language Generation with Truncated Reinforcement Learning. NAACL-HLT 2022: 12-37 - [c131]Mathieu Rita, Corentin Tallec, Paul Michel, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux, Florian Strub:
Emergent Communication: Generalization and Overfitting in Lewis Games. NeurIPS 2022 - [i70]Alexis Jacq, Johan Ferret, Olivier Pietquin, Matthieu Geist:
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act. CoRR abs/2203.08542 (2022) - [i69]Mathieu Laurière, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Theophile Cabannes, Georgios Piliouras, Julien Pérolat, Romuald Élie, Olivier Pietquin, Matthieu Geist:
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games. CoRR abs/2203.11973 (2022) - [i68]Mathieu Rita, Florian Strub, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux:
On the role of population heterogeneity in emergent communication. CoRR abs/2204.12982 (2022) - [i67]Mathieu Laurière, Sarah Perrin, Matthieu Geist, Olivier Pietquin:
Learning Mean Field Games: A Survey. CoRR abs/2205.12944 (2022) - [i66]Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári:
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022) - [i65]Paul Muller, Romuald Elie, Mark Rowland, Mathieu Laurière, Julien Pérolat, Sarah Perrin, Matthieu Geist, Georgios Piliouras, Olivier Pietquin, Karl Tuyls:
Learning Correlated Equilibria in Mean-Field Games. CoRR abs/2208.10138 (2022) - [i64]Zalán Borsos, Raphaël Marinier, Damien Vincent, Eugene Kharitonov, Olivier Pietquin, Matthew Sharifi, Olivier Teboul, David Grangier, Marco Tagliasacchi, Neil Zeghidour:
AudioLM: a Language Modeling Approach to Audio Generation. CoRR abs/2209.03143 (2022) - [i63]Geoffrey Cideron, Sertan Girgin, Anton Raichuk, Olivier Pietquin, Olivier Bachem, Léonard Hussenot:
vec2text with Round-Trip Translations. CoRR abs/2209.06792 (2022) - [i62]Mathieu Rita, Corentin Tallec, Paul Michel, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux, Florian Strub:
Emergent Communication: Generalization and Overfitting in Lewis Games. CoRR abs/2209.15342 (2022) - [i61]Alexis Jacq, Manu Orsini, Gabriel Dulac-Arnold, Olivier Pietquin, Matthieu Geist, Olivier Bachem:
C3PO: Learning to Achieve Arbitrary Goals via Massively Entropic Pretraining. CoRR abs/2211.03521 (2022) - 2021
- [c130]Johan Ferret, Olivier Pietquin, Matthieu Geist:
Self-Imitation Advantage Learning. AAMAS 2021: 501-509 - [c129]Léonard Hussenot, Robert Dadashi, Matthieu Geist, Olivier Pietquin:
Show Me the Way: Intrinsic Motivation from Demonstrations. AAMAS 2021: 620-628 - [c128]Aaqib Saeed, David Grangier, Olivier Pietquin, Neil Zeghidour:
Learning From Heterogeneous EEG Signals with Differentiable Channel Reordering. ICASSP 2021: 1255-1259 - [c127]Marcin Andrychowicz, Anton Raichuk, Piotr Stanczyk, Manu Orsini, Sertan Girgin, Raphaël Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem:
What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study. ICLR 2021 - [c126]Robert Dadashi, Léonard Hussenot, Matthieu Geist, Olivier Pietquin:
Primal Wasserstein Imitation Learning. ICLR 2021 - [c125]Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
Adversarially Guided Actor-Critic. ICLR 2021 - [c124]Robert Dadashi, Shideh Rezaeifar, Nino Vieillard, Léonard Hussenot, Olivier Pietquin, Matthieu Geist:
Offline Reinforcement Learning with Pseudometric Learning. ICML 2021: 2307-2318 - [c123]Léonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Sabela Ramos, Nikola Momchev, Sertan Girgin, Raphaël Marinier, Lukasz Stafiniak, Manu Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin:
Hyperparameter Selection for Imitation Learning. ICML 2021: 4511-4522 - [c122]Sarah Perrin, Mathieu Laurière, Julien Pérolat, Matthieu Geist, Romuald Élie, Olivier Pietquin:
Mean Field Games Flock! The Reinforcement Learning Way. IJCAI 2021: 356-362 - [c121]Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin:
Don't Do What Doesn't Matter: Intrinsic Motivation with Action Usefulness. IJCAI 2021: 2950-2956 - [c120]Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. NeurIPS 2021: 1898-1911 - [c119]Manu Orsini, Anton Raichuk, Léonard Hussenot, Damien Vincent, Robert Dadashi, Sertan Girgin, Matthieu Geist, Olivier Bachem, Olivier Pietquin, Marcin Andrychowicz:
What Matters for Adversarial Imitation Learning? NeurIPS 2021: 14656-14668 - [i60]Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
Adversarially Guided Actor-Critic. CoRR abs/2102.04376 (2021) - [i59]Julien Pérolat, Sarah Perrin, Romuald Elie, Mathieu Laurière, Georgios Piliouras, Matthieu Geist, Karl Tuyls, Olivier Pietquin:
Scaling up Mean Field Games with Online Mirror Descent. CoRR abs/2103.00623 (2021) - [i58]Robert Dadashi, Shideh Rezaeifar, Nino Vieillard, Léonard Hussenot, Olivier Pietquin, Matthieu Geist:
Offline Reinforcement Learning with Pseudometric Learning. CoRR abs/2103.01948 (2021) - [i57]Sarah Perrin, Mathieu Laurière, Julien Pérolat, Matthieu Geist, Romuald Élie, Olivier Pietquin:
Mean Field Games Flock! The Reinforcement Learning Way. CoRR abs/2105.07933 (2021) - [i56]Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin:
Don't Do What Doesn't Matter: Intrinsic Motivation with Action Usefulness. CoRR abs/2105.09992 (2021) - [i55]Léonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Lukasz Stafiniak, Sertan Girgin, Raphaël Marinier, Nikola Momchev, Sabela Ramos, Manu Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin:
Hyperparameter Selection for Imitation Learning. CoRR abs/2105.12034 (2021) - [i54]Manu Orsini, Anton Raichuk, Léonard Hussenot, Damien Vincent, Robert Dadashi, Sertan Girgin, Matthieu Geist, Olivier Bachem, Olivier Pietquin, Marcin Andrychowicz:
What Matters for Adversarial Imitation Learning? CoRR abs/2106.00672 (2021) - [i53]Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin:
Concave Utility Reinforcement Learning: the Mean-field Game viewpoint. CoRR abs/2106.03787 (2021) - [i52]Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. CoRR abs/2106.04480 (2021) - [i51]Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, Matthieu Geist:
Offline Reinforcement Learning as Anti-Exploration. CoRR abs/2106.06431 (2021) - [i50]Nino Vieillard, Marcin Andrychowicz, Anton Raichuk, Olivier Pietquin, Matthieu Geist:
Implicitly Regularized RL with Implicit Q-Values. CoRR abs/2108.07041 (2021) - [i49]Alice Martin, Guillaume Quispe, Charles Ollion, Sylvain Le Corff, Florian Strub, Olivier Pietquin:
Learning Natural Language Generation from Scratch. CoRR abs/2109.09371 (2021) - [i48]Sarah Perrin, Mathieu Laurière, Julien Pérolat, Romuald Élie, Matthieu Geist, Olivier Pietquin:
Generalization in Mean Field Games by Learning Master Policies. CoRR abs/2109.09717 (2021) - [i47]Robert Dadashi, Léonard Hussenot, Damien Vincent, Sertan Girgin, Anton Raichuk, Matthieu Geist, Olivier Pietquin:
Continuous Control with Action Quantization from Demonstrations. CoRR abs/2110.10149 (2021) - [i46]Theophile Cabannes, Mathieu Laurière, Julien Pérolat, Raphaël Marinier, Sertan Girgin, Sarah Perrin, Olivier Pietquin, Alexandre M. Bayen, Éric Goubault, Romuald Elie:
Solving N-player dynamic routing games with congestion: a mean field approach. CoRR abs/2110.11943 (2021) - [i45]Sabela Ramos, Sertan Girgin, Léonard Hussenot, Damien Vincent, Hanna Yakubovich, Daniel Toyama, Anita Gergely, Piotr Stanczyk, Raphaël Marinier, Jeremiah Harmsen, Olivier Pietquin, Nikola Momchev:
RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning. CoRR abs/2111.02767 (2021) - [i44]Paul Muller, Mark Rowland, Romuald Elie, Georgios Piliouras, Julien Pérolat, Mathieu Laurière, Raphaël Marinier, Olivier Pietquin, Karl Tuyls:
Learning Equilibria in Mean-Field Games: Introducing Mean-Field PSRO. CoRR abs/2111.08350 (2021) - 2020
- [c118]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
Deep Conservative Policy Iteration. AAAI 2020: 6070-6077 - [c117]Romuald Elie, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Olivier Pietquin:
On the Convergence of Model Free Learning in Mean Field Games. AAAI 2020: 7143-7150 - [c116]Alexis Jacq, Julien Pérolat, Matthieu Geist, Olivier Pietquin:
Foolproof Cooperative Learning. ACML 2020: 401-416 - [c115]Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist:
Momentum in Reinforcement Learning. AISTATS 2020: 2529-2538 - [c114]Léonard Hussenot, Matthieu Geist, Olivier Pietquin:
CopyCAT: Taking Control of Neural Policies with Constant Attacks. AAMAS 2020: 548-556 - [c113]Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron C. Courville:
Supervised Seeded Iterated Learning for Interactive Language Learning. EMNLP (1) 2020: 3962-3970 - [c112]Yuchen Lu, Soumye Singhal, Florian Strub, Aaron C. Courville, Olivier Pietquin:
Countering Language Drift with Seeded Iterated Learning. ICML 2020: 6437-6447 - [c111]Johan Ferret, Raphaël Marinier, Matthieu Geist, Olivier Pietquin:
Self-Attentional Credit Assignment for Transfer in Reinforcement Learning. IJCAI 2020: 2655-2661 - [c110]Mathieu Seurin, Philippe Preux, Olivier Pietquin:
"I'm Sorry Dave, I'm Afraid I Can't Do That" Deep Q-Learning from Forbidden Actions. IJCNN 2020: 1-8 - [c109]Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin:
A Machine of Few Words: Interactive Speaker Recognition with Reinforcement Learning. INTERSPEECH 2020: 4323-4327 - [c108]Sarah Perrin, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Romuald Elie, Olivier Pietquin:
Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications. NeurIPS 2020 - [c107]Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning. NeurIPS 2020 - [c106]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
Munchausen Reinforcement Learning. NeurIPS 2020 - [c105]Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin:
HIGhER: Improving instruction following with Hindsight Generation for Experience Replay. SSCI 2020: 225-232 - [p3]Olivier Buffet, Olivier Pietquin, Paul Weng:
Reinforcement Learning. A Guided Tour of Artificial Intelligence Research (1) (I) 2020: 389-414 - [e1]Olivier Pietquin, Smaranda Muresan, Vivian Chen, Casey Kennington, David Vandyke, Nina Dethlefs, Koji Inoue, Erik Ekstedt, Stefan Ultes:
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGdial 2020, 1st virtual meeting, July 1-3, 2020. Association for Computational Linguistics 2020, ISBN 978-1-952148-02-6 [contents] - [i43]Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron C. Courville:
Countering Language Drift with Seeded Iterated Learning. CoRR abs/2003.12694 (2020) - [i42]Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of Regularization in RL. CoRR abs/2003.14089 (2020) - [i41]Olivier Buffet, Olivier Pietquin, Paul Weng:
Reinforcement Learning. CoRR abs/2005.14419 (2020) - [i40]Robert Dadashi, Léonard Hussenot, Matthieu Geist, Olivier Pietquin:
Primal Wasserstein Imitation Learning. CoRR abs/2006.04678 (2020) - [i39]Marcin Andrychowicz, Anton Raichuk, Piotr Stanczyk, Manu Orsini, Sertan Girgin, Raphaël Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem:
What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study. CoRR abs/2006.05990 (2020) - [i38]Léonard Hussenot, Robert Dadashi, Matthieu Geist, Olivier Pietquin:
Show me the Way: Intrinsic Motivation from Demonstrations. CoRR abs/2006.12917 (2020) - [i37]Sarah Perrin, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Romuald Elie, Olivier Pietquin:
Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications. CoRR abs/2007.03458 (2020) - [i36]Alice Martin, Charles Ollion, Florian Strub, Sylvain Le Corff, Olivier Pietquin:
The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction. CoRR abs/2007.08620 (2020) - [i35]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
Munchausen Reinforcement Learning. CoRR abs/2007.14430 (2020) - [i34]Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin:
A Machine of Few Words - Interactive Speaker Recognition with Reinforcement Learning. CoRR abs/2008.03127 (2020) - [i33]Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron C. Courville:
Supervised Seeded Iterated Learning for Interactive Language Learning. CoRR abs/2010.02975 (2020) - [i32]Aaqib Saeed, David Grangier, Olivier Pietquin, Neil Zeghidour:
Learning from Heterogeneous EEG Signals with Differentiable Channel Reordering. CoRR abs/2010.13694 (2020) - [i31]Johan Ferret, Olivier Pietquin, Matthieu Geist:
Self-Imitation Advantage Learning. CoRR abs/2012.11989 (2020)
2010 – 2019
- 2019
- [c104]Diana Borsa, Nicolas Heess, Bilal Piot, Siqi Liu, Leonard Hasenclever, Rémi Munos, Olivier Pietquin:
Observational Learning by Reinforcement Learning. AAMAS 2019: 1117-1124 - [c103]Matthieu Geist, Bruno Scherrer, Olivier Pietquin:
A Theory of Regularized Markov Decision Processes. ICML 2019: 2160-2169 - [c102]Alexis Jacq, Matthieu Geist, Ana Paiva, Olivier Pietquin:
Learning from a Learner. ICML 2019: 2990-2999 - [c101]Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin:
Budgeted Reinforcement Learning in Continuous State Space. NeurIPS 2019: 9295-9305 - [i30]Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin:
Self-Educated Language Agent with Hindsight Experience Replay for Instruction Following. ViGIL@NeurIPS 2019 - [i29]Matthieu Geist, Bruno Scherrer, Olivier Pietquin:
A Theory of Regularized Markov Decision Processes. CoRR abs/1901.11275 (2019) - [i28]Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin:
Scaling up budgeted reinforcement learning. CoRR abs/1903.01004 (2019) - [i27]Léonard Hussenot, Matthieu Geist, Olivier Pietquin:
Targeted Attacks on Deep Reinforcement Learning Agents through Adversarial Observations. CoRR abs/1905.12282 (2019) - [i26]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
Deep Conservative Policy Iteration. CoRR abs/1906.09784 (2019) - [i25]Alexis Jacq, Julien Pérolat, Matthieu Geist, Olivier Pietquin:
Foolproof Cooperative Learning. CoRR abs/1906.09831 (2019) - [i24]Lucas Beyer, Damien Vincent, Olivier Teboul, Sylvain Gelly, Matthieu Geist, Olivier Pietquin:
MULEX: Disentangling Exploitation from Exploration in Deep RL. CoRR abs/1907.00868 (2019) - [i23]Romuald Elie, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Olivier Pietquin:
Approximate Fictitious Play for Mean Field Games. CoRR abs/1907.02633 (2019) - [i22]Johan Ferret, Raphaël Marinier, Matthieu Geist, Olivier Pietquin:
Credit Assignment as a Proxy for Transfer in Reinforcement Learning. CoRR abs/1907.08027 (2019) - [i21]Mathieu Seurin, Philippe Preux, Olivier Pietquin:
"I'm sorry Dave, I'm afraid I can't do that" Deep Q-learning from forbidden action. CoRR abs/1910.02078 (2019) - [i20]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
On Connections between Constrained Optimization and Reinforcement Learning. CoRR abs/1910.08476 (2019) - [i19]Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist:
Momentum in Reinforcement Learning. CoRR abs/1910.09322 (2019) - [i18]Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin:
Self-Educated Language Agent With Hindsight Experience Replay For Instruction Following. CoRR abs/1910.09451 (2019) - 2018
- [c100]Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Ian Osband, Gabriel Dulac-Arnold, John P. Agapiou, Joel Z. Leibo, Audrunas Gruslys:
Deep Q-learning From Demonstrations. AAAI 2018: 3223-3230 - [c99]Julien Pérolat, Bilal Piot, Olivier Pietquin:
Actor-Critic Fictitious Play in Simultaneous Move Multistage Games. AISTATS 2018: 919-928 - [c98]Merwan Barlier, Romain Laroche, Olivier Pietquin:
Training Dialogue Systems With Human Advice. AAMAS 2018: 999-1007 - [c97]Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron C. Courville, Olivier Pietquin:
Visual Reasoning with Multi-hop Feature Modulation. ECCV (5) 2018: 808-831 - [c96]Alexandre Berard, Laurent Besacier, Ali Can Kocabiyikoglu, Olivier Pietquin:
End-to-End Automatic Speech Translation of Audiobooks. ICASSP 2018: 6224-6228 - [c95]Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Matteo Hessel, Ian Osband, Alex Graves, Volodymyr Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg:
Noisy Networks For Exploration. ICLR (Poster) 2018 - [p2]Gil Keren, Amr El-Desoky Mousa, Olivier Pietquin, Stefanos Zafeiriou, Björn W. Schuller:
Deep learning for multisensorial and multimodal interaction. The Handbook of Multimodal-Multisensor Interfaces, Volume 2 (2) 2018: 99-128 - [i17]Alexandre Bérard, Laurent Besacier, Ali Can Kocabiyikoglu, Olivier Pietquin:
End-to-End Automatic Speech Translation of Audiobooks. CoRR abs/1802.04200 (2018) - [i16]Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Vecerík, Matteo Hessel, Rémi Munos, Olivier Pietquin:
Observe and Look Further: Achieving Consistent Performance on Atari. CoRR abs/1805.11593 (2018) - [i15]Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron C. Courville, Olivier Pietquin:
Visual Reasoning with Multi-hop Feature Modulation. CoRR abs/1808.04446 (2018) - [i14]Julien Pérolat, Mateusz Malinowski, Bilal Piot, Olivier Pietquin:
Playing the Game of Universal Adversarial Perturbations. CoRR abs/1809.07802 (2018) - 2017
- [j14]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning. IEEE Trans. Neural Networks Learn. Syst. 28(8): 1814-1826 (2017) - [c94]Julien Pérolat, Florian Strub, Bilal Piot, Olivier Pietquin:
Learning Nash Equilibrium for General-Sum Markov Games from Batch Data. AISTATS 2017: 232-241 - [c93]Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron C. Courville:
GuessWhat?! Visual Object Discovery through Multi-modal Dialogue. CVPR 2017: 4466-4475 - [c92]Florian Strub, Harm de Vries, Jérémie Mary, Bilal Piot, Aaron C. Courville, Olivier Pietquin:
End-to-end optimization of goal-driven and visually grounded dialogue systems. IJCAI 2017: 2765-2771 - [c91]Matthieu Geist, Bilal Piot, Olivier Pietquin:
Is the Bellman residual a bad proxy? NIPS 2017: 3205-3214 - [c90]Harm de Vries, Florian Strub, Jérémie Mary, Hugo Larochelle, Olivier Pietquin, Aaron C. Courville:
Modulating early visual processing by language. NIPS 2017: 6594-6604 - [c89]Alexandre Berard, Laurent Besacier, Olivier Pietquin:
LIG-CRIStAL Submission for the WMT 2017 Automatic Post-Editing Task. WMT 2017: 623-629 - [i13]Florian Strub, Harm de Vries, Jérémie Mary, Bilal Piot, Aaron C. Courville, Olivier Pietquin:
End-to-end optimization of goal-driven and visually grounded dialogue systems. CoRR abs/1703.05423 (2017) - [i12]Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John P. Agapiou, Joel Z. Leibo, Audrunas Gruslys:
Learning from Demonstrations for Real World Reinforcement Learning. CoRR abs/1704.03732 (2017) - [i11]Diana Borsa, Bilal Piot, Rémi Munos, Olivier Pietquin:
Observational Learning by Reinforcement Learning. CoRR abs/1706.06617 (2017) - [i10]Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg:
Noisy Networks for Exploration. CoRR abs/1706.10295 (2017) - [i9]Harm de Vries, Florian Strub, Jérémie Mary, Hugo Larochelle, Olivier Pietquin, Aaron C. Courville:
Modulating early visual processing by language. CoRR abs/1707.00683 (2017) - [i8]Alexandre Berard, Olivier Pietquin, Laurent Besacier:
LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task. CoRR abs/1707.05118 (2017) - [i7]Matej Vecerík, Todd Hester, Jonathan Scholz, Fumin Wang, Olivier Pietquin, Bilal Piot, Nicolas Heess, Thomas Rothörl, Thomas Lampe, Martin A. Riedmiller:
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards. CoRR abs/1707.08817 (2017) - 2016
- [c88]Julien Pérolat, Bilal Piot, Bruno Scherrer, Olivier Pietquin:
On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games. AISTATS 2016: 893-901 - [c87]Layla El Asri, Bilal Piot, Matthieu Geist, Romain Laroche, Olivier Pietquin:
Score-based Inverse Reinforcement Learning. AAMAS 2016: 457-465 - [c86]Hadrien Glaude, Olivier Pietquin:
PAC learning of Probabilistic Automaton based on the Method of Moments. ICML 2016: 820-829 - [c85]Julien Pérolat, Bilal Piot, Matthieu Geist, Bruno Scherrer, Olivier Pietquin:
Softened Approximate Policy Iteration for Markov Games. ICML 2016: 1860-1868 - [c84]Merwan Barlier, Romain Laroche, Olivier Pietquin:
A Stochastic Model for Computer-Aided Human-Human Dialogue. INTERSPEECH 2016: 2051-2055 - [c83]Layla El Asri, Romain Laroche, Olivier Pietquin:
Compact and Interpretable Dialogue State Representation with Genetic Sparse Distributed Memory. IWSDS 2016: 39-51 - [c82]Alexandre Berard, Christophe Servan, Olivier Pietquin, Laurent Besacier:
MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP. LREC 2016 - [c81]Merwan Barlier, Romain Laroche, Olivier Pietquin:
Learning dialogue dynamics with the method of moments. SLT 2016: 98-105 - [p1]Stéphane Dupont, Hüseyin Çakmak, William Curran, Thierry Dutoit, Jennifer Hofmann, Gary McKeown, Olivier Pietquin, Tracey Platt, Willibald Ruch, Jérôme Urbain:
Laughter Research: A Review of the ILHAIRE Project. Toward Robotic Socially Believable Behaving Systems (I) 2016: 147-181 - [i6]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Difference of Convex Functions Programming Applied to Control with Expert Data. CoRR abs/1606.01128 (2016) - [i5]Matthieu Geist, Bilal Piot, Olivier Pietquin:
Should one minimize the expected Bellman residual or maximize the mean value? CoRR abs/1606.07636 (2016) - [i4]Julien Pérolat, Florian Strub, Bilal Piot, Olivier Pietquin:
Learning Nash Equilibrium for General-Sum Markov Games from Batch Data. CoRR abs/1606.08718 (2016) - [i3]Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron C. Courville:
GuessWhat?! Visual object discovery through multi-modal dialogue. CoRR abs/1611.08481 (2016) - [i2]Alexandre Berard, Olivier Pietquin, Christophe Servan, Laurent Besacier:
Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation. CoRR abs/1612.01744 (2016) - 2015
- [j13]Timothé Collet, Olivier Pietquin:
Optimism in Active Learning. Comput. Intell. Neurosci. 2015: 973696:1-973696:17 (2015) - [c80]Hadrien Glaude, Cyrille Enderli, Olivier Pietquin:
Spectral learning with non negative probabilities for finite state automaton. ASRU 2015: 71-77 - [c79]Bilal Piot, Olivier Pietquin, Matthieu Geist:
Imitation Learning Applied to Embodied Conversational Agents. MLIS@ICML 2015: 1-5 - [c78]Heriberto Cuayáhuitl, Nina Dethlefs, Lutz Frommberger, Martijn van Otterlo, Olivier Pietquin:
Proceedings of the 4th Workshop on Machine Learning for Interactive Systems (MLIS-2015). MLIS@ICML 2015: 4 - [c77]Julien Pérolat, Bruno Scherrer, Bilal Piot, Olivier Pietquin:
Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games. ICML 2015: 1321-1329 - [c76]Hadrien Glaude, Cyrille Enderli, Olivier Pietquin:
Non-negative Spectral Learning for Linear Sequential Systems. ICONIP (2) 2015: 143-151 - [c75]Timothé Collet, Olivier Pietquin:
Optimism in Active Learning with Gaussian Processes. ICONIP (2) 2015: 152-160 - [c74]Thibaut Munzer, Bilal Piot, Matthieu Geist, Olivier Pietquin, Manuel Lopes:
Inverse Reinforcement Learning in Relational Domains. IJCAI 2015: 3735-3741 - [c73]Hadrien Glaude, Cyrille Enderli, Jean-François Grandin, Olivier Pietquin:
Learning of scanning strategies for electronic support using predictive state representations. MLSP 2015: 1-6 - [c72]Merwan Barlier, Julien Pérolat, Romain Laroche, Olivier Pietquin:
Human-Machine Dialogue as a Stochastic Game. SIGDIAL Conference 2015: 2-11 - [c71]Timothé Collet, Olivier Pietquin:
Bayesian Credible Intervals for Online and Active Learning of Classification Trees. SSCI 2015: 571-578 - 2014
- [c70]Timothé Collet, Olivier Pietquin:
Active learning for classification: An optimistic approach. ADPRL 2014: 1-8 - [c69]Hadrien Glaude, Olivier Pietquin, Cyrille Enderli:
Subspace identification for predictive state representation by nuclear norm minimization. ADPRL 2014: 1-8 - [c68]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Boosted and reward-regularized classification for apprenticeship learning. AAMAS 2014: 1249-1256 - [c67]Layla El Asri, Hatim Khouzaimi, Romain Laroche, Olivier Pietquin:
Ordinal regression for interaction quality prediction. ICASSP 2014: 3221-3225 - [c66]Bilal Piot, Olivier Pietquin, Matthieu Geist:
Predicting when to laugh with structured classification. INTERSPEECH 2014: 1786-1790 - [c65]Layla El Asri, Rémi Lemonnier, Romain Laroche, Olivier Pietquin, Hatim Khouzaimi:
NASTIA: Negotiating Appointment Setting Interface. LREC 2014: 266-271 - [c64]Layla El Asri, Romain Laroche, Olivier Pietquin:
DINASTI: Dialogues with a Negotiating Appointment Setting Interface. LREC 2014: 272-278 - [c63]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Difference of Convex Functions Programming for Reinforcement Learning. NIPS 2014: 2519-2527 - [c62]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Boosted Bellman Residual Minimization Handling Expert Demonstrations. ECML/PKDD (2) 2014: 549-564 - [i1]Matthieu Geist, Olivier Pietquin:
Kalman Temporal Differences. CoRR abs/1406.3270 (2014) - 2013
- [j12]Olivier Pietquin, Helen F. Hastie:
A survey on metrics for the evaluation of user simulations. Knowl. Eng. Rev. 28(1): 59-73 (2013) - [j11]Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin:
Classification structurée pour l'apprentissage par renforcement inverse (Structured classification for inverse reinforcement learning) [in French]. Rev. d'Intelligence Artif. 27(2): 155-169 (2013) - [j10]Matthieu Geist, Olivier Pietquin:
Algorithmic Survey of Parametric Value Function Approximation. IEEE Trans. Neural Networks Learn. Syst. 24(6): 845-867 (2013) - [c61]Radoslaw Niewiadomski, Jennifer Hofmann, Jérôme Urbain, Tracey Platt, Johannes Wagner, Bilal Piot, Hüseyin Çakmak, Sathish Pammi, Tobias Baur, Stéphane Dupont, Matthieu Geist, Florian Lingenfelser, Gary McKeown, Olivier Pietquin, Willibald Ruch:
Laugh-aware virtual agent and its impact on user amusement. AAMAS 2013: 619-626 - [c60]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Random projections: A remedy for overfitting issues in time series prediction with echo state networks. ICASSP 2013: 3253-3257 - [c59]Maurizio Mancini, Laurent Ach, Emeline Bantegnie, Tobias Baur, Nadia Berthouze, Debajyoti Datta, Yu Ding, Stéphane Dupont, Harry J. Griffin, Florian Lingenfelser, Radoslaw Niewiadomski, Catherine Pelachaud, Olivier Pietquin, Bilal Piot, Jérôme Urbain, Gualtiero Volpe, Johannes Wagner:
Laugh When You're Winning. eNTERFACE 2013: 50-79 - [c58]Olivier Pietquin:
Inverse reinforcement learning for interactive systems. MLIS@IJCAI 2013: 71-75 - [c57]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Particle swarm optimisation of spoken dialogue system strategies. INTERSPEECH 2013: 470-474 - [c56]Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin:
A Cascaded Supervised Learning Approach to Inverse Reinforcement Learning. ECML/PKDD (1) 2013: 1-16 - [c55]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Learning from Demonstrations: Is It Worth Estimating a Reward Function? ECML/PKDD (1) 2013: 17-32 - [c54]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Model-free POMDP optimisation of tutoring systems with echo-state networks. SIGDIAL Conference 2013: 102-106 - [c53]Layla El Asri, Romain Laroche, Olivier Pietquin:
Reward Shaping for Statistical Optimisation of Dialogue Management. SLSP 2013: 93-101 - 2012
- [j9]Jason D. Williams, Kai Yu, Brahim Chaib-draa, Oliver Lemon, Roberto Pieraccini, Olivier Pietquin, Pascal Poupart, Steve J. Young:
Introduction to the Issue on Advances in Spoken Dialogue Systems and Mobile Interface. IEEE J. Sel. Top. Signal Process. 6(8): 889-890 (2012) - [j8]Lucie Daubigney, Matthieu Geist, Senthilkumar Chandramohan, Olivier Pietquin:
A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimization. IEEE J. Sel. Top. Signal Process. 6(8): 891-902 (2012) - [c52]Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin:
Behavior Specific User Simulation in Spoken Dialogue Systems. ITG Conference on Speech Communication 2012: 1-4 - [c51]Olivier Pietquin, Fabio Tango:
A Reinforcement Learning Approach to Optimize the longitudinal Behavior of a Partial Autonomous Driving Assistance System. ECAI 2012: 987-992 - [c50]Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin:
Clustering behaviors of Spoken Dialogue Systems users. ICASSP 2012: 4981-4984 - [c49]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Off-policy learning in large-scale POMDP-based dialogue systems. ICASSP 2012: 4989-4992 - [c48]Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin:
Co-adaptation in Spoken Dialogue Systems. IWSDS 2012: 343-353 - [c47]Olivier Pietquin:
Statistical User Simulation for Spoken Dialogue Systems: What for, Which Data, Which Future? SDCTD@NAACL-HLT 2012: 9-10 - [c46]Edouard Klein, Matthieu Geist, Bilal Piot, Olivier Pietquin:
Inverse Reinforcement Learning through Structured Classification. NIPS 2012: 1016-1024 - [c45]Layla El Asri, Romain Laroche, Olivier Pietquin:
Reward Function Learning for Dialogue Management. STAIRS 2012: 95-106 - [c44]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Optimisation d'un tuteur intelligent à partir d'un jeu de données fixé (Optimization of a tutoring system from a fixed set of data) [in French]. JEP-TALN-RECITAL 2012 2012: 241-248 - 2011
- [b1]Olivier Pietquin:
De l'Apprentissage Statistique pour le Contrôle Optimal et le Traitement du Signal (On Statistical Learning for Optimal Control and Signal Processing) [in French]. Paul Sabatier University, Toulouse, France, 2011 - [j7]Beatrice Chevaillier, Damien Mandry, Jean-Luc Collette, Michel Claudon, Marie-Agnès Galloy, Olivier Pietquin:
Functional Segmentation of Renal DCE-MRI Sequences Using Vector Quantization Algorithms. Neural Process. Lett. 34(1): 71-85 (2011) - [j6]Oliver Lemon, Olivier Pietquin:
Introduction to special issue on machine learning for adaptivity in spoken dialogue systems. ACM Trans. Speech Lang. Process. 7(3): 3:1-3:3 (2011) - [j5]Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan, Hervé Frezza-Buet:
Sample-efficient batch reinforcement learning for dialogue management optimization. ACM Trans. Speech Lang. Process. 7(3): 7:1-7:21 (2011) - [c43]Matthieu Geist, Olivier Pietquin:
Parametric value function approximation: A unified view. ADPRL 2011: 9-16 - [c42]Jérémy Fix, Matthieu Geist, Olivier Pietquin, Hervé Frezza-Buet:
Dynamic neural field optimization using the unscented Kalman filter. CCMB 2011: 74-80 - [c41]Olivier Pietquin, Fabio Tango, Raghav Aras:
Batch reinforcement learning for optimizing longitudinal driving assistance strategies. CIVTS 2011: 73-79 - [c40]Lucie Daubigney, Olivier Pietquin:
Single-trial P300 detection with Kalman filtering and SVMs. ESANN 2011 - [c39]Edouard Klein, Matthieu Geist, Olivier Pietquin:
Batch, Off-Policy and Model-Free Apprenticeship Learning. EWRL 2011: 285-296 - [c38]Fabio Tango, Luca Minin, Raghav Aras, Olivier Pietquin:
Automation Effects on Driver's Behaviour When Integrating a PADAS and a Distraction Classifier. HCI (17) 2011: 503-512 - [c37]Hadrien Glaude, Fadi Akrimi, Matthieu Geist, Olivier Pietquin:
A Non-parametric Approach to Approximate Dynamic Programming. ICMLA (1) 2011: 317-322 - [c36]Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan:
Sample Efficient On-Line Learning of Optimal Dialogue Policies with Kalman Temporal Differences. IJCAI 2011: 1878-1883 - [c35]Stéphane Rossignol, Olivier Pietquin, Michel Ianotto:
Training a BN-based user model for dialogue simulation with missing data. IJCNLP 2011: 598-604 - [c34]Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin:
User Simulation in Dialogue Systems Using Inverse Reinforcement Learning. INTERSPEECH 2011: 1025-1028 - [c33]Lucie Daubigney, Milica Gasic, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin, Steve J. Young:
Uncertainty Management for On-Line Optimisation of a POMDP-Based Large-Scale Spoken Dialogue System. INTERSPEECH 2011: 1301-1304 - [c32]Olivier Pietquin, Lucie Daubigney, Matthieu Geist:
Optimization of a tutoring system from a fixed set of data. SLaTE 2011: 97-100 - [c31]Matthieu Geist, Olivier Pietquin:
Managing Uncertainty within KTD. Active Learning and Experimental Design @ AISTATS 2011: 157-168 - 2010
- [j4]Matthieu Geist, Olivier Pietquin:
Kalman Temporal Differences. J. Artif. Intell. Res. 39: 483-532 (2010) - [j3]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Différences temporelles de Kalman. Cas déterministe (Kalman temporal differences: the deterministic case) [in French]. Rev. d'Intelligence Artif. 24(4): 423-443 (2010) - [j2]Julien Oster, Olivier Pietquin, Michel Kraemer, Jacques Felblinger:
Nonlinear Bayesian Filtering for Denoising of Electrocardiograms Acquired in a Magnetic Resonance Environment. IEEE Trans. Biomed. Eng. 57(7): 1628-1638 (2010) - [c30]Jean-Louis Gutzwiller, Hervé Frezza-Buet, Olivier Pietquin:
Online speaker diarization with a size-monitored growing neural gas algorithm. ESANN 2010 - [c29]Beatrice Chevaillier, Jean-Luc Collette, Damien Mandry, Michel Claudon, Olivier Pietquin:
Objective assessment of renal DCE-MRI image segmentation. EUSIPCO 2010: 1214-1218 - [c28]Julien Oster, Olivier Pietquin, Michel Kraemer, Jacques Felblinger:
Bayesian framework for artifact reduction on ECG IN MRI. ICASSP 2010: 489-492 - [c27]Matthieu Geist, Olivier Pietquin:
Statistically linearized least-squares temporal differences. ICUMT 2010: 450-457 - [c26]Matthieu Geist, Olivier Pietquin:
Eligibility traces through colored noises. ICUMT 2010: 458-465 - [c25]Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin:
Optimizing spoken dialogue management with fitted value iteration. INTERSPEECH 2010: 86-89 - [c24]Stéphane Rossignol, Olivier Pietquin:
Single-speaker/multi-speaker co-channel speech classification. INTERSPEECH 2010: 2322-2325 - [c23]Senthilkumar Chandramohan, Olivier Pietquin:
User and Noise Adaptive Dialogue Management Using Hybrid System Actions. IWSDS 2010: 13-24 - [c22]Stéphane Rossignol, Olivier Pietquin, Michel Ianotto:
Simulation of the Grounding Process in Spoken Dialog Systems with Bayesian Networks. IWSDS 2010: 110-121 - [c21]Matthieu Geist, Olivier Pietquin:
Revisiting Natural Actor-Critics with Value Function Approximation. MDAI 2010: 207-218 - [c20]Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin:
Sparse Approximate Dynamic Programming for Dialog Management. SIGDIAL Conference 2010: 107-115
2000 – 2009
- 2009
- [c19]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Kalman Temporal Differences: The deterministic case. ADPRL 2009: 185-192 - [c18]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Kernelizing Vector Quantization Algorithms. ESANN 2009 - [c17]Julien Oster, Olivier Pietquin, Roger Abächerli, Michel Kraemer, Jacques Felblinger:
A specific QRS detector for electrocardiography during MRI: Using wavelets and local regularity characterization. ICASSP 2009: 341-344 - [c16]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Tracking in Reinforcement Learning. ICONIP (1) 2009: 502-511 - 2008
- [c15]Beatrice Chevaillier, Damien Mandry, Yannick Ponvianne, Jean-Luc Collette, Michel Claudon, Olivier Pietquin:
Functional semi-automated segmentation of renal DCE-MRI sequences using a Growing Neural Gas algorithm. EUSIPCO 2008: 1-5 - [c14]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Bayesian Reward Filtering. EWRL 2008: 96-109 - [c13]Julien Oster, Olivier Pietquin, Gilles Bosser:
Adaptive RR prediction for cardiac MRI. ICASSP 2008: 513-516 - [c12]Beatrice Chevaillier, Yannick Ponvianne, Jean-Luc Collette, Damien Mandry, Michel Claudon, Olivier Pietquin:
Functional semi-automated segmentation of renal DCE-MRI sequences. ICASSP 2008: 525-528 - 2007
- [c11]Olivier Pietquin:
Learning to Ground in Spoken Dialogue Systems. ICASSP (4) 2007: 165-168 - [c10]Oliver Lemon, Olivier Pietquin:
Machine learning for spoken dialogue systems. INTERSPEECH 2007: 2685-2688 - 2006
- [j1]Olivier Pietquin, Thierry Dutoit:
A probabilistic framework for dialog simulation and optimal strategy learning. IEEE Trans. Speech Audio Process. 14(2): 589-599 (2006) - [c9]Olivier Pietquin:
Machine Learning for Spoken Dialogue Management: An Experiment with Speech-Based Database Querying. AIMSA 2006: 172-180 - [c8]Olivier Pietquin, Thierry Dutoit:
Dynamic Bayesian Networks for NLU Simulation with Applications to Dialog Optimal Strategy Learning. ICASSP (1) 2006: 49-52 - [c7]Olivier Pietquin:
Consistent Goal-Directed User Model for Realisitc Man-Machine Task-Oriented Spoken Dialogue Simulation. ICME 2006: 425-428 - 2005
- [c6]Olivier Pietquin:
A Probabilistic Description of Man-Machine Spoken Communication. ICME 2005: 410-413 - [c5]Olivier Pietquin, Richard Beaufort:
Comparing ASR modeling methods for spoken dialogue simulation and optimal strategy learning. INTERSPEECH 2005: 861-864 - [c4]Olivier Pietquin:
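Réseau bayesien pour un modèle d'utilisateur et un module de compréhension pour l'optimisation des systèmes de dialogues (Bayesian network for a user model and an understanding module for the optimization of dialogue systems) [in French]. TALN (Articles courts) 2005: 481-486 - 2004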
- [c3]Olivier Pietquin:
Une description probabiliste de la communication parlée entre homme et machine (A probabilistic description of spoken communication between human and machine) [in French]. IHM 2004: 247-250 - 2003
- [c2]Olivier Pietquin, Thierry Dutoit:
Aided design of finite-state dialogue management systems. ICME 2003: 545-548 - 2002
- [c1]Olivier Pietquin, Steve Renals:
ASR system modeling for automatic evaluation and optimization of dialogue systems. ICASSP 2002: 45-48
last updated on 2024-11-15 20:35 CET by the dblp team
all metadata released as open data under CC0 1.0 license