Tengyu Xu
2020 – today
- 2024
- [j3] Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan: Faster algorithm and sharper analysis for constrained Markov decision process. Oper. Res. Lett. 54: 107107 (2024)
- [j2] Tengyu Xu, Yue Wang, Shaofeng Zou, Yingbin Liang: Provably Efficient Offline Reinforcement Learning With Trajectory-Wise Reward. IEEE Trans. Inf. Theory 70(9): 6481-6518 (2024)
- [i20] Tengyu Xu, Eryk Helenowski, Karthik Abinav Sankararaman, Di Jin, Kaiyan Peng, Eric Han, Shaoliang Nie, Chen Zhu, Hejia Zhang, Wenxuan Zhou, Zhouhao Zeng, Yun He, Karishma Mandyam, Arya Talabzadeh, Madian Khabsa, Gabriel Cohen, Yuandong Tian, Hao Ma, Sinong Wang, Han Fang: The Perfect Blend: Redefining RLHF with Mixture of Judges. CoRR abs/2409.20370 (2024)
- [i19] Yun He, Di Jin, Chaoqi Wang, Chloe Bi, Karishma Mandyam, Hejia Zhang, Chen Zhu, Ning Li, Tengyu Xu, Hongjiang Lv, Shruti Bhosale, Chenguang Zhu, Karthik Abinav Sankararaman, Eryk Helenowski, Melanie Kambadur, Aditya Tayade, Hao Ma, Han Fang, Sinong Wang: Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following. CoRR abs/2410.15553 (2024)
- 2023
- [j1] Xiumin Shang, Tengyu Xu, Ioannis Karamouzas, Marcelo Kallmann: Constraint-based multi-agent reinforcement learning for collaborative tasks. Comput. Animat. Virtual Worlds 34(3-4) (2023)
- 2022
- [c14] Ziwei Guan, Tengyu Xu, Yingbin Liang: PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method. ICLR 2022
- [c13] Sen Lin, Jialin Wan, Tengyu Xu, Yingbin Liang, Junshan Zhang: Model-Based Offline Meta-Reinforcement Learning with Regularization. ICLR 2022
- [c12] Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang: A Unifying Framework of Off-Policy General Value Function Evaluation. NeurIPS 2022
- [c11] Huaqing Xiong, Tengyu Xu, Lin Zhao, Yingbin Liang, Wei Zhang: Deterministic policy gradient: Convergence analysis. UAI 2022: 2159-2169
- [i18] Sen Lin, Jialin Wan, Tengyu Xu, Yingbin Liang, Junshan Zhang: Model-Based Offline Meta-Reinforcement Learning with Regularization. CoRR abs/2202.02929 (2022)
- [i17] Tengyu Xu, Yingbin Liang: Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward. CoRR abs/2206.06426 (2022)
- 2021
- [c10] Huaqing Xiong, Tengyu Xu, Yingbin Liang, Wei Zhang: Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling. AAAI 2021: 10460-10468
- [c9] Tengyu Xu, Yingbin Liang: Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms. AISTATS 2021: 811-819
- [c8] Ziwei Guan, Tengyu Xu, Yingbin Liang: When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence. AISTATS 2021: 1117-1125
- [c7] Ziyi Chen, Yi Zhou, Tengyu Xu, Yingbin Liang: Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry. ICLR 2021
- [c6] Tengyu Xu, Yingbin Liang, Guanghui Lan: CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee. ICML 2021: 11480-11491
- [c5] Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang: Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality. ICML 2021: 11581-11591
- [i16] Ziyi Chen, Yi Zhou, Tengyu Xu, Yingbin Liang: Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry. CoRR abs/2102.04653 (2021)
- [i15] Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang: Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality. CoRR abs/2102.11866 (2021)
- [i14] Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang: A Unified Off-Policy Evaluation Approach for General Value Function. CoRR abs/2107.02711 (2021)
- [i13] Ziwei Guan, Tengyu Xu, Yingbin Liang: PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method. CoRR abs/2110.06906 (2021)
- [i12] Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan: Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process. CoRR abs/2110.10351 (2021)
- 2020
- [c4] Tengyu Xu, Zhe Wang, Yi Zhou, Yingbin Liang: Reanalysis of Variance Reduced Temporal Difference Learning. ICLR 2020
- [c3] Tengyu Xu, Zhe Wang, Yingbin Liang: Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms. NeurIPS 2020
- [i11] Tengyu Xu, Zhe Wang, Yi Zhou, Yingbin Liang: Reanalysis of Variance Reduced Temporal Difference Learning. CoRR abs/2001.01898 (2020)
- [i10] Huaqing Xiong, Tengyu Xu, Yingbin Liang, Wei Zhang: Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling. CoRR abs/2002.06286 (2020)
- [i9] Tengyu Xu, Zhe Wang, Yingbin Liang: Improving Sample Complexity Bounds for Actor-Critic Algorithms. CoRR abs/2004.12956 (2020)
- [i8] Tengyu Xu, Zhe Wang, Yingbin Liang: Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms. CoRR abs/2005.03557 (2020)
- [i7] Tengyu Xu, Zhe Wang, Yingbin Liang, H. Vincent Poor: Enhanced First and Zeroth Order Variance Reduced Algorithms for Min-Max Optimization. CoRR abs/2006.09361 (2020)
- [i6] Ziwei Guan, Tengyu Xu, Yingbin Liang: When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence. CoRR abs/2006.13506 (2020)
- [i5] Tengyu Xu, Yingbin Liang: Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms. CoRR abs/2011.05053 (2020)
- [i4] Tengyu Xu, Yingbin Liang, Guanghui Lan: A Primal Approach to Constrained Policy Optimization: Global Optimality and Finite-Time Analysis. CoRR abs/2011.05869 (2020)
2010 – 2019
- 2019
- [c2] Shaofeng Zou, Tengyu Xu, Yingbin Liang: Finite-Sample Analysis for SARSA with Linear Function Approximation. NeurIPS 2019: 8665-8675
- [c1] Tengyu Xu, Shaofeng Zou, Yingbin Liang: Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples. NeurIPS 2019: 10633-10643
- [i3] Shaofeng Zou, Tengyu Xu, Yingbin Liang: Finite-Sample Analysis for SARSA and Q-Learning with Linear Function Approximation. CoRR abs/1902.02234 (2019)
- [i2] Tengyu Xu, Shaofeng Zou, Yingbin Liang: Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples. CoRR abs/1909.11907 (2019)
- 2018
- [i1] Tengyu Xu, Yi Zhou, Kaiyi Ji, Yingbin Liang: Convergence of SGD in Learning ReLU Models with Separable Data. CoRR abs/1806.04339 (2018)
last updated on 2024-11-27 21:25 CET by the dblp team
all metadata released as open data under CC0 1.0 license