news
| Date | News |
|---|---|
| Apr 07, 2026 | Our paper “One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment” has been accepted to SIGIR 2026. |
| Jan 08, 2026 | We release our survey on Agent-as-a-Judge. |
| Dec 19, 2025 | Our paper “The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs” has been accepted to Transactions on Machine Learning Research (TMLR). |