Large Language Models

KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation
KORGym offers over fifty games in either textual or visual formats and supports interactive, multi-turn evaluation of LLM reasoning with reinforcement learning scenarios.
DAPO: An Open-source LLM Reinforcement Learning System At Scale
We introduce DAPO, a Decoupled Clip and Dynamic sAmpling Policy Optimization algorithm, and fully open-source a state-of-the-art large-scale RL system that achieves 50 points on AIME 2024 using the Qwen2.5-32B base model.