Ziniu Li
Journal Articles
- 2024
- [j3] Youlin Fan, Bo Jiu, Wenqiang Pu, Ziniu Li, Kang Li, Hongwei Liu:
Sensing Jamming Strategy From Limited Observations: An Imitation Learning Perspective. IEEE Trans. Signal Process. 72: 4098-4114 (2024)
- 2022
- [j2] Tian Xu, Ziniu Li, Yang Yu:
Error Bounds of Imitating Policies and Environments for Reinforcement Learning. IEEE Trans. Pattern Anal. Mach. Intell. 44(10): 6968-6980 (2022)
- 2020
- [j1] Xinjian Huang, Ziniu Li, Zhiyuan Liu, Bin Xiang, Yingsan Geng, Jianhua Wang:
Solving the Inverse Design Problem of Electrical Fuse With Machine Learning. IEEE Access 8: 74137-74144 (2020)
Conference and Workshop Papers
- 2024
- [c8] Heshen Zhan, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun:
Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization. EMNLP (Findings) 2024: 14825-14838
- [c7] Ziniu Li, Tian Xu, Yang Yu:
When is RL better than DPO in RLHF? A Representation and Optimization Perspective. Tiny Papers @ ICLR 2024
- [c6] Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo:
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models. ICML 2024
- 2023
- [c5] Ziniu Li, Tian Xu, Zeyu Qin, Yang Yu, Zhi-Quan Luo:
Imitation Learning from Imperfection: Theoretical Justifications and Algorithms. NeurIPS 2023
- [c4] Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo:
Provably Efficient Adversarial Imitation Learning with Unknown Transitions. UAI 2023: 2367-2378
- 2022
- [c3] Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo:
HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning. ICLR 2022
- 2020
- [c2] Ziniu Li, Xiong-Hui Chen:
Efficient Exploration by Novelty-Pursuit. DAI 2020: 85-102
- [c1] Tian Xu, Ziniu Li, Yang Yu:
Error Bounds of Imitating Policies and Environments. NeurIPS 2020
Informal and Other Publications
- 2024
- [i16] Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo:
Why Transformers Need Adam: A Hessian Perspective. CoRR abs/2402.16788 (2024)
- [i15] Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily J. Getzen, Cong Fang, Qi Long, Weijie J. Su:
On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization. CoRR abs/2405.16455 (2024)
- [i14] Chengxing Jia, Pengyuan Wang, Ziniu Li, Yi-Chen Li, Zhilong Zhang, Nan Tang, Yang Yu:
BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation. CoRR abs/2405.17039 (2024)
- [i13] Yushun Zhang, Congliang Chen, Ziniu Li, Tian Ding, Chenwei Wu, Yinyu Ye, Zhi-Quan Luo, Ruoyu Sun:
Adam-mini: Use Fewer Learning Rates To Gain More. CoRR abs/2406.16793 (2024)
- [i12] Ziniu Li, Congliang Chen, Tian Xu, Zeyu Qin, Jiancong Xiao, Ruoyu Sun, Zhi-Quan Luo:
Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity. CoRR abs/2408.16673 (2024)
- 2023
- [i11] Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo:
Theoretical Analysis of Offline Imitation With Supplementary Dataset. CoRR abs/2301.11687 (2023)
- [i10] Ziniu Li, Ke Xu, Liu Liu, Lanqing Li, Deheng Ye, Peilin Zhao:
Deploying Offline Reinforcement Learning with Human Feedback. CoRR abs/2303.07046 (2023)
- [i9] Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo:
Provably Efficient Adversarial Imitation Learning with Unknown Transitions. CoRR abs/2306.06563 (2023)
- [i8] Ziniu Li, Tian Xu, Yushun Zhang, Yang Yu, Ruoyu Sun, Zhi-Quan Luo:
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models. CoRR abs/2310.10505 (2023)
- [i7] Ziniu Li, Tian Xu, Yang Yu:
Policy Optimization in RLHF: The Impact of Out-of-preference Data. CoRR abs/2312.10584 (2023)
- 2022
- [i6] Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo:
Rethinking ValueDice: Does It Really Improve Performance? CoRR abs/2202.02468 (2022)
- [i5] Ziniu Li, Tian Xu, Yang Yu:
A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle. CoRR abs/2203.11489 (2022)
- [i4] Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo:
Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis. CoRR abs/2208.01899 (2022)
- 2021
- [i3] Tian Xu, Ziniu Li, Yang Yu:
Nearly Minimax Optimal Adversarial Imitation Learning with Known and Unknown Transitions. CoRR abs/2106.10424 (2021)
- 2020
- [i2] Tian Xu, Ziniu Li, Yang Yu:
Error Bounds of Imitating Policies and Environments. CoRR abs/2010.11876 (2020)
- 2019
- [i1] Tian Xu, Ziniu Li, Yang Yu:
On Value Discrepancy of Imitation Learning. CoRR abs/1911.07027 (2019)
last updated on 2024-11-15 20:39 CET by the dblp team
all metadata released as open data under CC0 1.0 license