Hummer: Towards Limited Competitive Preference Dataset
safety
Li Jiang*, Yusen Wu*, Junwu Xiong, Jingqing Ruan, Yichuan Ding, Qingpei Guo, Zujie Wen, Jun Zhou, Xiaotie Deng
TL;DR: Reduces conflicting preference signals across alignment objectives via a low-competition preference dataset, improving multi-attribute RLHF stability.