


Siwei Wang 0002
Person information
- affiliation: Tsinghua University, Beijing, China
Other persons with the same name
- Siwei Wang — disambiguation page
- Siwei Wang 0001 — College of Computer, National University of Defense Technology, Changsha, China (and 1 more)
- Siwei Wang 0003 — University of Chicago, Chicago, IL, USA
- Siwei Wang 0004 — Xiamen University, Xiamen, China
- Siwei Wang 0005 — Bytedance, USA
- Siwei Wang 0006 — Peking University, Beijing, China
- Siwei Wang 0007 — Guangdong Open University, Guangzhou, China
- Siwei Wang 0008 — Argonne National Laboratory, Lemont, IL, USA
- Siwei Wang 0009 — Dalian Minzu University, Dalian, China
- Siwei Wang 0010 — Hunan University, College of Finance and Statistics, China
- Siwei Wang 0011 — Zhejiang Normal University, College of Mathematics and Computer Science, Jinhua, China
2020 – today
- 2024
- [c18]Junning Shao, Siwei Wang, Zhixuan Fang:
Balanced and Incentivized Learning with Limited Shared Information in Multi-agent Multi-armed Bandit. AAMAS 2024: 2459-2461
- [c17]Yu Chen, Yihan Du, Pihe Hu, Siwei Wang, Desheng Wu, Longbo Huang:
Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback. ICLR 2024
- [c16]Xutong Liu, Siwei Wang, Jinhang Zuo, Han Zhong, Xuchuang Wang, Zhiyong Wang, Shuai Li, Mohammad Hajiesmaili, John C. S. Lui, Wei Chen:
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond. ICML 2024
- [c15]Yu Chen, Xiangcheng Zhang, Siwei Wang, Longbo Huang:
Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation. ICML 2024
- [c14]Siwei Wang, Yifei Shen, Shi Feng, Haoran Sun, Shang-Hua Teng, Wei Chen:
ALPINE: Unveiling The Planning Capability of Autoregressive Learning in Language Models. NeurIPS 2024
- [i19]Yu Chen, Xiangcheng Zhang, Siwei Wang, Longbo Huang:
Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation. CoRR abs/2402.18159 (2024)
- [i18]Siwei Wang, Yifei Shen, Shi Feng, Haoran Sun, Shang-Hua Teng, Wei Chen:
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models. CoRR abs/2405.09220 (2024)
- [i17]Haoran Sun, Yurong Chen, Siwei Wang, Wei Chen, Xiaotie Deng:
Mechanism Design for LLM Fine-tuning with Multiple Reward Models. CoRR abs/2405.16276 (2024)
- [i16]Xutong Liu, Siwei Wang, Jinhang Zuo, Han Zhong, Xuchuang Wang, Zhiyong Wang, Shuai Li, Mohammad Hajiesmaili, John C. S. Lui, Wei Chen:
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond. CoRR abs/2406.01386 (2024)
- [i15]Seockbean Song, Youngsik Yoon, Siwei Wang, Wei Chen, Jungseul Ok:
Combinatorial Rising Bandit. CoRR abs/2412.00798 (2024)
- 2023
- [c13]Yihan Du, Siwei Wang, Longbo Huang:
Provably Efficient Risk-Sensitive Reinforcement Learning: Iterated CVaR and Worst Path. ICLR 2023
- [c12]Xutong Liu, Jinhang Zuo, Siwei Wang, John C. S. Lui, Mohammad Hajiesmaili, Adam Wierman, Wei Chen:
Contextual Combinatorial Bandits with Probabilistically Triggered Arms. ICML 2023: 22559-22593
- [i14]Xutong Liu, Jinhang Zuo, Siwei Wang, John C. S. Lui, Mohammad H. Hajiesmaili, Adam Wierman, Wei Chen:
Contextual Combinatorial Bandits with Probabilistically Triggered Arms. CoRR abs/2303.17110 (2023)
- [i13]Jing Dong, Jingyu Wu, Siwei Wang, Baoxiang Wang, Wei Chen:
Taming the Exponential Action Set: Sublinear Regret and Fast Convergence to Nash Equilibrium in Online Congestion Games. CoRR abs/2306.13673 (2023)
- 2022
- [j1]Siwei Wang, Wei Chen:
The pure exploration problem with general reward functions depending on full distributions. Mach. Learn. 111(9): 3279-3306 (2022)
- [c11]Siwei Wang, Jun Zhu:
Thompson Sampling for (Combinatorial) Pure Exploration. ICML 2022: 23470-23483
- [c10]Qingsong Liu, Weihang Xu, Siwei Wang, Zhixuan Fang:
Combinatorial Bandits with Linear Constraints: Beyond Knapsacks and Fairness. NeurIPS 2022
- [c9]Xutong Liu, Jinhang Zuo, Siwei Wang, Carlee Joe-Wong, John C. S. Lui, Wei Chen:
Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms. NeurIPS 2022
- [c8]Yirui Zhang, Siwei Wang, Zhixuan Fang:
Matching in Multi-arm Bandit with Collision. NeurIPS 2022
- [i12]Yihan Du, Siwei Wang, Longbo Huang:
Risk-Sensitive Reinforcement Learning: Iterated CVaR and the Worst Path. CoRR abs/2206.02678 (2022)
- [i11]Siwei Wang, Jun Zhu:
Thompson Sampling for (Combinatorial) Pure Exploration. CoRR abs/2206.09150 (2022)
- [i10]Qihan Guo, Siwei Wang, Jun Zhu:
Regret Analysis for Hierarchical Experts Bandit Problem. CoRR abs/2208.05622 (2022)
- [i9]Xutong Liu, Jinhang Zuo, Siwei Wang, Carlee Joe-Wong, John C. S. Lui, Wei Chen:
Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms. CoRR abs/2208.14837 (2022)
- [i8]Yihan Du, Siwei Wang, Longbo Huang:
Dueling Bandits: From Two-dueling to Multi-dueling. CoRR abs/2211.10293 (2022)
- 2021
- [c7]Yihan Du, Siwei Wang, Longbo Huang:
A One-Size-Fits-All Solution to Conservative Bandit Problems. AAAI 2021: 7254-7261
- [c6]Siwei Wang, Haoyun Wang, Longbo Huang:
Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback. AAAI 2021: 10210-10217
- [c5]Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang:
Continuous Mean-Covariance Bandits. NeurIPS 2021: 875-886
- [i7]Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang:
Continuous Mean-Covariance Bandits. CoRR abs/2102.12090 (2021)
- [i6]Siwei Wang, Wei Chen:
Pure Exploration Bandit Problem with General Reward Functions Depending on Full Distributions. CoRR abs/2105.03598 (2021)
- 2020
- [c4]Yihan Du, Siwei Wang, Longbo Huang:
Dueling Bandits: From Two-dueling to Multi-dueling. AAMAS 2020: 348-356
- [c3]Siwei Wang, Longbo Huang, John C. S. Lui:
Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits. NeurIPS 2020
- [i5]Siwei Wang, Longbo Huang, John C. S. Lui:
Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits. CoRR abs/2011.02664 (2020)
- [i4]Siwei Wang, Haoyun Wang, Longbo Huang:
Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback. CoRR abs/2012.07048 (2020)
- [i3]Yihan Du, Siwei Wang, Longbo Huang:
A One-Size-Fits-All Solution to Conservative Bandit Problems. CoRR abs/2012.07341 (2020)
2010 – 2019
- 2018
- [c2]Siwei Wang, Wei Chen:
Thompson Sampling for Combinatorial Semi-Bandits. ICML 2018: 5101-5109
- [c1]Siwei Wang, Longbo Huang:
Multi-armed Bandits with Compensation. NeurIPS 2018: 5119-5128
- [i2]Siwei Wang, Wei Chen:
Thompson Sampling for Combinatorial Semi-Bandits. CoRR abs/1803.04623 (2018)
- [i1]Siwei Wang, Longbo Huang:
Multi-armed Bandits with Compensation. CoRR abs/1811.01715 (2018)
last updated on 2025-02-15 01:19 CET by the dblp team
all metadata released as open data under CC0 1.0 license