"Softmax policy gradient methods can take exponential time to converge."

Gen Li et al. (2023)

DOI: 10.1007/S10107-022-01920-6

access: open

type: Journal Article

metadata version: 2024-10-15