The Synergy Dilemma of Long-CoT SFT and RL

Created on December 19, 2025

Our paper “The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs” has been accepted to Transactions on Machine Learning Research (TMLR).