Paper

  • Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Andrea Zanette, Ruiqi Zhang. Advances in Neural Information Processing Systems 36, pp. 59953-59995.
  • Policy Gradient for Rectangular Robust Markov Decision Processes

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Esther Derman, Matthieu Geist, Navdeep Kumar, Kfir Y. Levy, Shie Mannor. Advances in Neural Information Processing Systems 36, pp. 59477-59501.
  • Policy Gradient with Serial Markov Chain Reasoning

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Oya Celiktutan, Edoardo Cetin. Advances in Neural Information Processing Systems 35, pp. 8824-8839.
  • Policy Improvement using Language Feedback Models

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Marc-Alexandre Côté, Dipendra Misra, Xingdi Yuan, Victor Zhong. Advances in Neural Information Processing Systems 37, pp. 43730-43758.
  • Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Xiong-Hui Chen, Yali Du, Meng Fang, Shengyi Jiang, Jun Wang, Ziyan Wang, Yang Yu. Advances in Neural Information Processing Systems 37, pp. 18940-18987.
  • Policy Mirror Descent with Lookahead

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Anas Barakat, Kimon Protopapas. Advances in Neural Information Processing Systems 37, pp. 26443-26481.
  • Policy Optimization for Continuous Reinforcement Learning

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Wenpin Tang, David Yao, Hanyang Zhao. Advances in Neural Information Processing Systems 36, pp. 13637-13663.
  • Policy Optimization for Markov Games: Unified Framework and Faster Convergence

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Yu Bai, Na Li, Qinghua Liu, Huan Wang, Caiming Xiong, Runyu Zhang. Advances in Neural Information Processing Systems 35, pp. 21886-21899.
  • Policy Optimization for Robust Average Reward MDPs

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Sihong He, Fei Miao, Zhongchang Sun, Shaofeng Zou. Advances in Neural Information Processing Systems 37, pp. 17348-17372.
  • Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Pierre-Luc Bacon, Marc Bellemare, Pierluca D'Oro, Nate Rahn, Harley Wiltzer. Advances in Neural Information Processing Systems 36, pp. 30618-30640.
  • Policy Optimization with Advantage Regularization for Long-Term Fairness in Decision Systems

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Sicun Gao, Min Kyung Lee, Zhizhen Qin, Eric Yu. Advances in Neural Information Processing Systems 35, pp. 8211-8213.
  • Policy Optimization with Linear Temporal Logic Constraints

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Swarat Chaudhuri, Hoang Le, Cameron Voloshin, Yisong Yue. Advances in Neural Information Processing Systems 35, pp. 17690-17702.