Paper

  • Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Ruiqi Zhang, Andrea Zanette. Advances in Neural Information Processing Systems 36, pp. 59953-59995.
  • Policy Gradient for Rectangular Robust Markov Decision Processes

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Navdeep Kumar, Esther Derman, Matthieu Geist, Kfir Y. Levy, Shie Mannor. Advances in Neural Information Processing Systems 36, pp. 59477-59501.
  • Policy Gradient with Serial Markov Chain Reasoning

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Edoardo Cetin, Oya Celiktutan. Advances in Neural Information Processing Systems 35, pp. 8824-8839.
  • Policy Improvement using Language Feedback Models

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Victor Zhong, Dipendra Misra, Xingdi Yuan, Marc-Alexandre Côté. Advances in Neural Information Processing Systems 37, pp. 43730-43758.
  • Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Xiong-Hui Chen, Ziyan Wang, Yali Du, Shengyi Jiang, Meng Fang, Yang Yu, Jun Wang. Advances in Neural Information Processing Systems 37, pp. 18940-18987.
  • Policy Mirror Descent with Lookahead

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Kimon Protopapas, Anas Barakat. Advances in Neural Information Processing Systems 37, pp. 26443-26481.
  • Policy Optimization for Continuous Reinforcement Learning

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Hanyang Zhao, Wenpin Tang, David Yao. Advances in Neural Information Processing Systems 36, pp. 13637-13663.
  • Policy Optimization for Markov Games: Unified Framework and Faster Convergence

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Runyu Zhang, Qinghua Liu, Huan Wang, Caiming Xiong, Na Li, Yu Bai. Advances in Neural Information Processing Systems 35, pp. 21886-21899.
  • Policy Optimization for Robust Average Reward MDPs

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Zhongchang Sun, Sihong He, Fei Miao, Shaofeng Zou. Advances in Neural Information Processing Systems 37, pp. 17348-17372.
  • Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Nate Rahn, Pierluca D'Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc Bellemare. Advances in Neural Information Processing Systems 36, pp. 30618-30640.
  • Policy Optimization with Advantage Regularization for Long-Term Fairness in Decision Systems

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Eric Yu, Zhizhen Qin, Min Kyung Lee, Sicun Gao. Advances in Neural Information Processing Systems 35, pp. 8211-8213.
  • Policy Optimization with Linear Temporal Logic Constraints

    Neural Information Processing Systems Foundation, Inc. (NeurIPS)

    Cameron Voloshin, Hoang Le, Swarat Chaudhuri, Yisong Yue. Advances in Neural Information Processing Systems 35, pp. 17690-17702.