The Synergy Dilemma of Long-CoT SFT and RL
Our paper “The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs” has been accepted to Transactions on Machine Learning Research (TMLR).