Jiangjie Chen (陈江捷) is a researcher at ByteDance Seed Team. In 2024, he earned his Ph.D. at Fudan University in the School of Computer Science, Shanghai, China. His current interested research topics are mostly around building reasoning models and autonomous agents:
Ph.D. in CS, 2019 - 2024
Fudan University
B.S. in CS (honors), 2014 - 2019
Fudan University
Jul. 2025: Our paper Past Meets Present won an Outstanding Paper Award at ACL 2025!
Jul. 2025: Check out our work on spatial cognition! Can LLMs Learn to Map the World from Local Descriptions? We demonstrate that LLMs can develop spatial awareness from fragmented local observations, successfully learning spatial perception and navigation in urban environments!
Jul. 2025: Check out ARIA! We propose a novel approach for training language agents with intention-driven reward aggregation. By projecting natural language actions into a lower-dimensional intention space, ARIA reduces reward variance and improves policy optimization, achieving 9.95% average performance gains across four downstream tasks.
Jul. 2025: Check out MemAgent! We introduce a novel approach to handling extremely long documents in language models through a multi-conv RL-based memory agent. MemAgent extends from 8K context to 3.5M QA tasks with <5% performance loss and achieves 95%+ on 512K RULER test!
May. 2025: Check out KORGym! We introduce a dynamic game platform offering over fifty games in textual or visual formats for interactive, multi-turn LLM reasoning evaluation with reinforcement learning scenarios. Our platform reveals consistent reasoning patterns within model families and demonstrates superior performance of closed-source models.
May. 2025: Check out Enigmata! We propose a comprehensive suite of puzzles for improving logical reasoning of reasoning models, tailored for RLVR training. We find that not only do such puzzles drastically improve puzzle reasoning of LLMs, but also improve SoTA models such as Seed1.5-Thinking on challenging reasoning tasks such as AIME and GPQA! This is a free lunch for SoTA models, since Enigmata is synthetic and can be generated at scale!
May. 2025: I was awarded with Nomination Award for Outstanding Doctoral Dissertation of Shanghai Computer Society!
May. 2025: Our papers DEEPER and HistoryAnalogy are accepted to ACL 2025!
May. 2025: Our paper CoSER is accepted to ICML 2025! Check out this comprehensive resource for role-playing agents!
Apr. 2025: Presenting Seed-Thinking-v1.5 from ByteDance Seed Team, a cutting-edge reasoning model that’s incredible in math, code, science, and logical reasoning!
Mar. 2025: DAPO is out! A new critic-free RL algorithm that directly trains a pre-trained base model to SoTA performance on AIME 2024 without any SFT.
Mar. 2025: Four papers accepted to NAACL 2025: SelfGoal, EvoAgent, EasyTool and Barrier in Language Agent Planning.
Oct. 2024: Three papers accepted to NeurIPS 2024 Workshop on Open-World Agents: EvoAgent, SelfGoal and AucArena. See you in Vancouver!
Sep. 2024: Our survey paper on role-playing agents is accepted to TMLR!
Sep. 2024: We have three accepted papers in EMNLP 2024! Two main papers are Segment+ on long-context processing with short-context models, and CROSS on role-playing evaluation, and one finding paper DetectBench on benchmarking detective reasoning.
Jul. 2024: Our work on Irrelevant Evidence got accepted in COLM 2024!
Jul. 2024: I have graduated from Fudan University, and will officially join ByteDance Seed Team as a Full-time researcher.
MemAgent introduces a multi-conv RL-based memory agent that enables language models to handle extremely long documents, extending from 8K to 3.5M tokens with minimal performance degradation.
We introduce Enigmata, the first comprehensive suite tailored for improving LLMs with puzzle reasoning skills.
We introduce Seed-Thinking-v1.5, a Mixture-of-Experts (MoE) model with a relatively small size, featuring 20B activated and 200B total parameters, capable of reasoning through thinking before responding, resulting in improved performance on a widerange of benchmarks.
We introduce DAPO, a Decoupled Clip and Dynamic sAmpling Policy Optimization algorithm, and fully open-source a state-of-the-art large-scale RL system that achieves 50 points on AIME 2024 using Qwen2.5-32B base model.
We introduce TravelAgent, an LLM-powered travel planning system that generates rational, comprehensive, and personalized itineraries through four modules, demonstrating effectiveness in dynamic scenarios.
We introduced TravelPlanner, a benchmark for assessing language agents’ planning abilities, showing that even advanced models like GPT-4 face difficulties with complex tasks.
We present the first comprehensive and controlled investigation into the behavior of large language models when encountering knowledge conflicts.