NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
W Liu, S Qi, X Wang, C Qian, Y Du, Y He. (2025). "NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning." arXiv preprint arXiv:2505.16022.