Jiangjie Chen (陈江捷) is a final-year Ph.D. candidate in the School of Computer Science at Fudan University, Shanghai, China. His research interests center on autonomous generative agents, including (but not limited to):
(Download my resumé. Could be outdated. 😶)
Ph.D. in CS, 2019 - 2024 (estimated)
Fudan University
B.S. in CS (honors), 2014 - 2019
Fudan University
Apr. 2024: Check out SurveyAgent! This system stands out by offering a unified platform that supports researchers through the various stages of the literature review process, facilitated by a conversational interface that prioritizes user interaction and personalization! Access it via the homepage and have fun!
Apr. 2024: In our new work, we extend our previous work on knowledge conflict during RAG, and find that LLMs are not robust to various types of irrelevant evidence in the context, which can be lethal to RAG applications!
Mar. 2024: Agent Group Chat is out! In this paper, we build a multi-agent simulation that studies the impact of language on collective human behavior, using diverse scenarios such as inheritance disputes and philosophical debates. The simulation reveals that agents, when given complex language abilities and diverse personalities, can exhibit emergent behaviors that mirror human unpredictability. TL;DR: Let's scale up the number and diversity of agents and their interactions!
Mar. 2024: TravelPlanner and EasyTool are accepted to the LLM Agent Workshop @ICLR 2024 :)!
Feb. 2024: Check out TimeArena! In TimeArena, we built a simulated textual environment where language agents must complete multiple tasks in the shortest time, which is a very realistic setting of the human world and a great challenge for SoTA LLMs. Check out the project page for more details!
Feb. 2024: Check out GumbelSoft, a watermark for LLMs that allows for diversified text generation while remaining detectable!
Feb. 2024: Check out InCharacter! Self-assessments of RPAs are inherently flawed, as they depend heavily on the LLM's own understanding of personality. Instead, our work revolves around interviewing characters with 14 different psychological scales, providing a more objective description of LLMs' role-play abilities. Check out the project demo!
Feb. 2024: TravelPlanner is out! We evaluate the planning ability of LLM agents using a real-world setup: travel planning, which is notable for the challenge of integrating many types of realistic constraints, such as "tickets sold out" or "limited budget" - and GPT-4 is bad at this! Check out our paper for more information.
Jan. 2024: Our paper got accepted to ICLR 2024 as Spotlight! A huge honor, see you in Vienna, Austria!
Jan. 2024: New pre-print! We propose EasyTool, which improves your agents' tool usage with minimal effort: by streamlining tool documents into tool instructions 🔨. Check out this tweet!
Dec. 2023: Our paper on idiomatic translation got accepted to AAAI 2024! We propose a multilingual idiom knowledge base (IdiomKB), developed using LLMs, that facilitates better idiomatic translation by smaller models through retrieving idioms' figurative meanings.
Dec. 2023: Socializing at EMNLP 2023 in Singapore - open to any chat!
Nov. 2023: Gave a talk at CMU LTI Li Lab, titled: “Say, Think, Act: Towards Human-like Autonomous Language Agents”.
Oct. 2023: Check out our newest pre-print Auction Arena! We explore the intriguing domain of how LLMs navigate the complex and dynamic environment of auctions. We introduce AucArena, a novel simulation environment tailored for assessing LLMs within the unpredictable yet strategic world of auctions. Play with arena demo and see if you can beat AI!
Oct. 2023: Our paper SCAR got accepted to EMNLP 2023 Findings! A nice addition to the analogical reasoning domain! See you in Singapore :).
July 2023: Our paper CoScript got an Outstanding Paper Award in ACL 2023!
June 2023: Coming to Seattle for a summer internship at Allen Institute for AI, working with the great Aristo Team!
May 2023: A pre-print on the knowledge conflict of large language models! See the tweet. It turns out that ChatGPT and GPT-4 somehow stick to their own beliefs, are gullible to longer, better-formatted, more popular evidence, and follow the herd... all kinds of human-like, dangerous behaviors!
May 2023: Check out two pre-prints on analogical reasoning, which extend E-KAR! AnalogyKB is a million-scale analogy KB derived from existing KGs, enabling machines to acquire analogical reasoning skills. SCAR is a new challenge for evaluating LLMs' structure abduction ability on scientific analogies, which is essential for human-like analogical reasoning.
May 2023: Got two papers about LLMs accepted to the main conference of ACL 2023! The first paper analyzes why LLMs fail to generate negative knowledge while being able to recognize it. The other is CoScript, which studies how to generate plans under constraints with LLMs. See you in Toronto (hopefully :/)!
We propose a novel conversational AI system that enhances researchers’ literature review processes by providing personalized knowledge management, literature recommendations, and query answering through a unified platform.
TimeArena enhances LLMs with temporal dynamics for better multitasking, showing advanced models like GPT-4 still trail behind human temporal awareness.
We introduced TravelPlanner, a benchmark for assessing language agents’ planning abilities, showing that even advanced models like GPT-4 face difficulties with complex tasks.
We propose EASYTOOL, a method that simplifies tool documentation into concise instructions, improving tool use by language models.
We propose AucArena to test LLMs in auctions, showing that they can strategize, but with variable success, indicating potential for enhancement.
We present the first comprehensive and controlled investigation into the behavior of large language models when encountering knowledge conflicts.
We propose an over-generate-then-filter approach to improve large language models (LLMs) on constrained language planning, and use it to distill a novel constrained language planning dataset, CoScript.