"A maximum-entropy approach to off-policy evaluation in average-reward MDPs."

Nevena Lazic et al. (2020)

DOI:

access: open

type: Informal or Other Publication

metadata version: 2024-08-29