Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer

Item #: 079017-4399
