Jiangjie Chen (陈江捷) is a researcher on the ByteDance Seed Team. In 2024, he earned his Ph.D. from the School of Computer Science at Fudan University, Shanghai, China. His research interests center on autonomous generative agents, including (but not limited to):
Ph.D. in CS, 2019 - 2024
Fudan University
B.S. in CS (honors), 2014 - 2019
Fudan University
Oct. 2024: Three papers accepted to NeurIPS 2024 Workshop on Open-World Agents: EvoAgent, SelfGoal and AucArena. See you in Vancouver!
Sep. 2024: Our survey paper on role-playing agents is accepted to TMLR!
Sep. 2024: We have three papers accepted at EMNLP 2024! Two main-conference papers: Segment+ on long-context processing with short-context models, and CROSS on role-playing evaluation; and one Findings paper: DetectBench, on benchmarking detective reasoning.
Jul. 2024: Our work on Irrelevant Evidence got accepted in COLM 2024!
Jul. 2024: I have graduated from Fudan University and will officially join the ByteDance Seed Team as a full-time researcher.
Jun. 2024: How can we automatically extend a specialized agent into a multi-agent system to improve task-solving capability? We propose EvoAgent, a generic method that automatically extends expert agents into multi-agent systems via evolutionary algorithms. EvoAgent generalizes to any LLM-based agent framework and significantly enhances the task-solving capabilities of LLM-based agents!
Jun. 2024: Want your agents to win an auction for you? But does your agent know what a vague, high-level goal like “winning an auction” actually means? Check out SelfGoal! We propose an automatic approach that enhances language agents’ ability to achieve high-level goals with limited instructions and delayed feedback by adaptively breaking goals down into practical subgoals. Really excited about automating agents to tackle high-level tasks with minimal human instruction!
Jun. 2024: TravelPlanner got a Spotlight recommendation at ICML 2024!
May 2024: Just defended my thesis, officially a Dr. :)
May 2024: Four papers are accepted to the main conference of ACL 2024! They are: TimeArena, AnalogyKB, InCharacter and GumbelSoft! See you in Bangkok :)
May 2024: Our TravelPlanner got accepted to ICML 2024!
Apr. 2024: The first survey on role-playing language agents (RPLAs) is out! Dive into our comprehensive survey of RPLA technologies, their applications, and the exciting potential for human-AI coexistence. Understanding role-playing paves the way for both personalized assistants and multi-agent societies.
Apr. 2024: Check out SurveyAgent! This system stands out by offering a unified platform that supports researchers through the various stages of the literature review process, facilitated by a conversational interface that prioritizes user interaction and personalization. Access it via the homepage and have fun!
Apr. 2024: In our new work, we extend our previous study of knowledge conflict during RAG and find that LLMs are not robust to various types of irrelevant evidence in the context, which can be lethal for RAG applications!
Mar. 2024: Agent Group Chat is out! In this paper, we build a multi-agent simulation that studies the impact of language on collective human behavior, using diverse scenarios such as inheritance disputes and philosophical debates. The simulation reveals that agents, when given complex language abilities and diverse personalities, can exhibit emergent behaviors that mirror human unpredictability. TL;DR: Let’s scale up the number and diversity of agents and their interactions!
Mar. 2024: TravelPlanner and EasyTool are accepted to the LLM Agent Workshop @ICLR 2024 :)!
Feb. 2024: Check out TimeArena! In TimeArena, we build a simulated textual environment where language agents must complete multiple tasks in the shortest time, a realistic setting that mirrors the human world and poses a great challenge for SoTA LLMs. Check out the project page for more details!
Feb. 2024: Check out GumbelSoft, a watermark for LLMs that allows for diversified text generation while preserving detectability!
Feb. 2024: Check out InCharacter! Self-assessments of RPAs are inherently flawed, as they depend heavily on the LLM’s own understanding of personality. Instead, our work interviews characters using 14 different psychological scales, providing a more objective description of LLMs’ role-playing abilities. Check out the project demo!
Feb. 2024: TravelPlanner is out! We evaluate the planning ability of LLM agents in a real-world setup: travel planning, which is notable for the challenge of integrating many types of realistic constraints, like “tickets sold out” or “limited budget.” It turns out GPT-4 is bad at this! Check out our paper for more information.
Jan. 2024: Our paper got accepted to ICLR 2024 as Spotlight! A huge honor, see you in Vienna, Austria!
Jan. 2024: New pre-print! We propose EasyTool, which improves your agents’ tool usage with minimal effort: by streamlining tool documentation into tool instructions 🔨. Check out this tweet!
Dec. 2023: Our paper IdiomKB on idiomatic translation got accepted to AAAI 2024! We propose a multilingual idiom knowledge base (IdiomKB), developed using LLMs, that helps smaller models produce better idiomatic translations by retrieving idioms’ figurative meanings.
Dec. 2023: Attending EMNLP 2023 in Singapore, open to any chat!
Nov. 2023: Gave a talk at CMU LTI Li Lab, titled: “Say, Think, Act: Towards Human-like Autonomous Language Agents”.
Oct. 2023: Check out our newest pre-print Auction Arena! We explore the intriguing domain of how LLMs navigate the complex and dynamic environment of auctions. We introduce AucArena, a novel simulation environment tailored for assessing LLMs within the unpredictable yet strategic world of auctions. Play with arena demo and see if you can beat AI!
Oct. 2023: Our paper SCAR got accepted to EMNLP 2023 Findings! A nice addition to the analogical reasoning domain! See you in Singapore :).
July 2023: Our paper CoScript received an Outstanding Paper Award at ACL 2023!
June 2023: Coming to Seattle for a summer internship at Allen Institute for AI, working with the great Aristo Team!
May 2023: A pre-print on the knowledge conflicts of large language models! See the tweet. It turns out ChatGPT and GPT-4 somehow stick to their own beliefs, are receptive (even gullible) to longer, better-formatted, more popular evidence, and follow the herd… all kinds of human-like, dangerous behaviors!
May 2023: Check out two pre-prints on Analogical Reasoning, which extend E-KAR! AnalogyKB is a million-scale analogy KB derived from existing KGs, to enable machines to achieve analogical reasoning skills. SCAR is a new challenge for evaluating LLMs’ structure abduction ability for scientific analogies, which is essential for human-like analogical reasoning.
May 2023: Got two papers about LLMs accepted to the main conference of ACL 2023! The first analyzes why LLMs fail to generate negative knowledge despite being able to recognize it. The other is CoScript, which studies how to generate plans under constraints with LLMs. See you in Toronto (hopefully :/)!
We introduce TravelAgent, an LLM-powered travel planning system that generates rational, comprehensive, and personalized itineraries through four modules, demonstrating effectiveness in dynamic scenarios.
We introduce SelfGoal, an automatic approach that enhances language agents’ capabilities to achieve high-level goals with limited instructions and delayed feedback by adaptively breaking down goals into practical subgoals.
TimeArena enhances LLMs with temporal dynamics for better multitasking, showing advanced models like GPT-4 still trail behind human temporal awareness.
We introduce TravelPlanner, a benchmark for assessing language agents’ planning abilities, showing that even advanced models like GPT-4 struggle with complex planning tasks.
We propose AucArena to test LLMs in auctions, showing that they can strategize but with variable success, indicating potential for enhancement.
We present the first comprehensive and controlled investigation into the behavior of large language models when encountering knowledge conflicts.
We propose an over-generate-then-filter approach to improve large language models (LLMs) on constrained language planning, and use it to distill a novel constrained language planning dataset, CoScript.