RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and a Simple Solution
Published:
Recommended citation: Kaiyuan Li, Jing-Cheng Pang and Yang Yu. RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and a Simple Solution. ICML 2026 FoGen Workshop.
