Yash Chandak
2020 – today
- 2024
- [c22] Shiv Shankar, Ritwik Sinha, Yash Chandak, Saayan Mitra, Madalina Fiterau: A/B testing under Interference with Partial Network Information. AISTATS 2024: 19-27
- [c21] Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Bill Wu, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist: Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. EMNLP 2024: 21353-21370
- [c20] Yash Chandak, Shiv Shankar, Vasilis Syrgkanis, Emma Brunskill: Adaptive Instrument Design for Indirect Experiments. ICLR 2024
- [c19] Amelia Leon, Allen Nie, Yash Chandak, Emma Brunskill: Estimating the Causal Treatment Effect of Unproductive Persistence. LAK 2024: 843-849
- [i30] Shiv Shankar, Ritwik Sinha, Yash Chandak, Saayan Mitra, Madalina Fiterau: A/B testing under Interference with Partial Network Information. CoRR abs/2404.10547 (2024)
- [i29] Allen Nie, Yash Chandak, Christina J. Yuan, Anirudhan Badrinath, Yannis Flet-Berliac, Emma Brunskill: OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators. CoRR abs/2405.17708 (2024)
- [i28] Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist: Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. CoRR abs/2406.19185 (2024)
- [i27] Nathan Grinsztajn, Yannis Flet-Berliac, Mohammad Gheshlaghi Azar, Florian Strub, Bill Wu, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Olivier Pietquin, Matthieu Geist: Averaging log-likelihoods in direct alignment. CoRR abs/2406.19188 (2024)
- [i26] Hyunji Alex Nam, Yash Chandak, Emma Brunskill: Short-Long Policy Evaluation with Novel Actions. CoRR abs/2407.03674 (2024)
- [i25] Allen Nie, Yash Chandak, Miroslav Suzara, Ali Malik, Juliette Woodrow, Matt Peng, Mehran Sahami, Emma Brunskill, Chris Piech: The GPT Surprise: Offering Large Language Model Chat in a Massive Coding Class Reduced Engagement but Increased Adopters Exam Performances. CoRR abs/2407.09975 (2024)
- 2023
- [c18] Vincent Liu, Yash Chandak, Philip S. Thomas, Martha White: Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments. AISTATS 2023: 5474-5492
- [c17] Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Rémi Munos, Will Dabney, Diana L. Borsa: Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition. ICML 2023: 4009-4034
- [c16] Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko: Understanding Self-Predictive Learning for Reinforcement Learning. ICML 2023: 33632-33656
- [c15] Jonathan Lee, Annie Xie, Aldo Pacchiano, Yash Chandak, Chelsea Finn, Ofir Nachum, Emma Brunskill: Supervised Pretraining Can Learn In-Context Reinforcement Learning. NeurIPS 2023
- [c14] Dhawal Gupta, Yash Chandak, Scott M. Jordan, Philip S. Thomas, Bruno C. da Silva: Behavior Alignment via Reward Function Optimization. NeurIPS 2023
- [i24] Yash Chandak, Shiv Shankar, Nathaniel D. Bastian, Bruno Castro da Silva, Emma Brunskill, Philip S. Thomas: Off-Policy Evaluation for Action-Dependent Non-Stationary Environments. CoRR abs/2301.10330 (2023)
- [i23] Yash Chandak, Shiv Shankar, Venkata Gandikota, Philip S. Thomas, Arya Mazumdar: Optimization using Parallel Gradient Evaluations on Multiple Parameters. CoRR abs/2302.03161 (2023)
- [i22] Vincent Liu, Yash Chandak, Philip S. Thomas, Martha White: Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments. CoRR abs/2302.11725 (2023)
- [i21] Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Rémi Munos, Will Dabney, Diana L. Borsa: Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition. CoRR abs/2305.00654 (2023)
- [i20] James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas: Coagent Networks: Generalized and Scaled. CoRR abs/2305.09838 (2023)
- [i19] Jonathan N. Lee, Annie Xie, Aldo Pacchiano, Yash Chandak, Chelsea Finn, Ofir Nachum, Emma Brunskill: Supervised Pretraining Can Learn In-Context Reinforcement Learning. CoRR abs/2306.14892 (2023)
- [i18] Dhawal Gupta, Yash Chandak, Scott M. Jordan, Philip S. Thomas, Bruno Castro da Silva: Behavior Alignment via Reward Function Optimization. CoRR abs/2310.19007 (2023)
- [i17] Yash Chandak, Shiv Shankar, Vasilis Syrgkanis, Emma Brunskill: Adaptive Instrument Design for Indirect Experiments. CoRR abs/2312.02438 (2023)
- 2022
- [j1] Priyesh Vijayan, Yash Chandak, Mitesh M. Khapra, Srinivasan Parthasarathy, Balaraman Ravindran: Scaling Graph Propagation Kernels for Predictive Learning. Frontiers Big Data 5: 616617 (2022)
- [c13] Weihao Tan, David Koleczek, Siddhant Pradhan, Nicholas Perello, Vivek Chettiar, Vishal Rohra, Aaslesha Rajaram, Soundararajan Srinivasan, H. M. Sajjad Hossain, Yash Chandak: On Optimizing Interventions in Shared Autonomy. AAAI 2022: 5341-5349
- [c12] Yash Chandak, Shiv Shankar, Nathaniel D. Bastian, Bruno C. da Silva, Emma Brunskill, Philip S. Thomas: Off-Policy Evaluation for Action-Dependent Non-stationary Environments. NeurIPS 2022
- [c11] Tong Mu, Yash Chandak, Tatsunori B. Hashimoto, Emma Brunskill: Factored DRO: Factored Distributionally Robust Policies for Contextual Bandits. NeurIPS 2022
- [i16] Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko: Understanding Self-Predictive Learning for Reinforcement Learning. CoRR abs/2212.03319 (2022)
- 2021
- [c10] Yash Chandak, Shiv Shankar, Philip S. Thomas: High-Confidence Off-Policy (or Counterfactual) Variance Estimation. AAAI 2021: 6939-6947
- [c9] James E. Kostas, Yash Chandak, Scott M. Jordan, Georgios Theocharous, Philip S. Thomas: High Confidence Generalization for Reinforcement Learning. ICML 2021: 5764-5773
- [c8] Christina J. Yuan, Yash Chandak, Stephen Giguere, Philip S. Thomas, Scott Niekum: SOPE: Spectrum of Off-Policy Estimators. NeurIPS 2021: 18958-18969
- [c7] Yash Chandak, Scott Niekum, Bruno C. da Silva, Erik G. Learned-Miller, Emma Brunskill, Philip S. Thomas: Universal Off-Policy Evaluation. NeurIPS 2021: 27475-27490
- [i15] Yash Chandak, Shiv Shankar, Philip S. Thomas: High-Confidence Off-Policy (or Counterfactual) Variance Estimation. CoRR abs/2101.09847 (2021)
- [i14] Yash Chandak, Scott Niekum, Bruno Castro da Silva, Erik G. Learned-Miller, Emma Brunskill, Philip S. Thomas: Universal Off-Policy Evaluation. CoRR abs/2104.12820 (2021)
- [i13] Christina J. Yuan, Yash Chandak, Stephen Giguere, Philip S. Thomas, Scott Niekum: SOPE: Spectrum of Off-Policy Estimators. CoRR abs/2111.03936 (2021)
- [i12] Weihao Tan, David Koleczek, Siddhant Pradhan, Nicholas Perello, Vivek Chettiar, Vishal Rohra, Aaslesha Rajaram, Soundararajan Srinivasan, H. M. Sajjad Hossain, Yash Chandak: On Optimizing Interventions in Shared Autonomy. CoRR abs/2112.09169 (2021)
- 2020
- [c6] Yash Chandak, Georgios Theocharous, Chris Nota, Philip S. Thomas: Lifelong Learning with a Changing Action Set. AAAI 2020: 3373-3380
- [c5] Yash Chandak, Georgios Theocharous, Blossom Metevier, Philip S. Thomas: Reinforcement Learning When All Actions Are Not Always Available. AAAI 2020: 3381-3388
- [c4] Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas: Optimizing for the Future in Non-Stationary MDPs. ICML 2020: 1414-1425
- [c3] Scott M. Jordan, Yash Chandak, Daniel Cohen, Mengxue Zhang, Philip S. Thomas: Evaluating the Performance of Reinforcement Learning Algorithms. ICML 2020: 4962-4973
- [c2] Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas: Towards Safe Policy Improvement for Non-Stationary MDPs. NeurIPS 2020
- [i11] Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas: Optimizing for the Future in Non-Stationary MDPs. CoRR abs/2005.08158 (2020)
- [i10] Scott M. Jordan, Yash Chandak, Daniel Cohen, Mengxue Zhang, Philip S. Thomas: Evaluating the Performance of Reinforcement Learning Algorithms. CoRR abs/2006.16958 (2020)
- [i9] Georgios Theocharous, Yash Chandak, Philip S. Thomas, Frits de Nijs: Reinforcement Learning for Strategic Recommendations. CoRR abs/2009.07346 (2020)
- [i8] Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas: Towards Safe Policy Improvement for Non-Stationary MDPs. CoRR abs/2010.12645 (2020)
2010 – 2019
- 2019
- [c1] Yash Chandak, Georgios Theocharous, James E. Kostas, Scott M. Jordan, Philip S. Thomas: Learning Action Representations for Reinforcement Learning. ICML 2019: 941-950
- [i7] Yash Chandak, Georgios Theocharous, James E. Kostas, Scott M. Jordan, Philip S. Thomas: Learning Action Representations for Reinforcement Learning. CoRR abs/1902.00183 (2019)
- [i6] Yash Chandak, Georgios Theocharous, Chris Nota, Philip S. Thomas: Lifelong Learning with a Changing Action Set. CoRR abs/1906.01770 (2019)
- [i5] Yash Chandak, Georgios Theocharous, Blossom Metevier, Philip S. Thomas: Reinforcement Learning When All Actions are Not Always Available. CoRR abs/1906.01772 (2019)
- [i4] Philip S. Thomas, Scott M. Jordan, Yash Chandak, Chris Nota, James E. Kostas: Classical Policy Gradient: Preserving Bellman's Principle of Optimality. CoRR abs/1906.03063 (2019)
- 2018
- [i3] Priyesh Vijayan, Yash Chandak, Mitesh M. Khapra, Balaraman Ravindran: HOPF: Higher Order Propagation Framework for Deep Collective Classification. CoRR abs/1805.12421 (2018)
- [i2] Priyesh Vijayan, Yash Chandak, Mitesh M. Khapra, Balaraman Ravindran: Fusion Graph Convolutional Networks. CoRR abs/1805.12528 (2018)
- 2015
- [i1] Andreas Veit, Michael J. Wilber, Rajan Vaish, Serge J. Belongie, James Davis, Vishal Anand, Anshu Aviral, Prithvijit Chakrabarty, Yash Chandak, Sidharth Chaturvedi, Chinmaya Devaraj, Ankit Dhall, Utkarsh Dwivedi, Sanket Gupte, Sharath N. Sridhar, Karthik Paga, Anuj Pahuja, Aditya Raisinghani, Ayush Sharma, Shweta Sharma, Darpana Sinha, Nisarg Thakkar, K. Bala Vignesh, Utkarsh Verma, Kanniganti Abhishek, Amod Agrawal, Arya Aishwarya, Aurgho Bhattacharjee, Sarveshwaran Dhanasekar, Venkata Karthik Gullapalli, Shuchita Gupta, Chandana G, Kinjal Jain, Simran Kapur, Meghana Kasula, Shashi Kumar, Parth Kundaliya, Utkarsh Mathur, Alankrit Mishra, Aayush Mudgal, Aditya Nadimpalli, Munakala Sree Nihit, Akanksha Periwal, Ayush Sagar, Ayush Shah, Vikas Sharma, Yashovardhan Sharma, Faizal Siddiqui, Virender Singh, Abhinav S., Pradyumna Tambwekar, Rashida Taskin, Ankit Tripathi, Anurag D. Yadav: On Optimizing Human-Machine Task Assignments. CoRR abs/1509.07543 (2015)
last updated on 2024-11-15 20:36 CET by the dblp team
all metadata released as open data under CC0 1.0 license