NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
Published as an arXiv preprint, 2025
Incentive training for language models using a verifier-free reinforcement learning approach.
Recommended citation: W Liu, S Qi, X Wang, C Qian, Y Du, Y He. (2025). "NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning." arXiv preprint arXiv:2505.16022.