Jiangjie Chen
Agent
TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation
TimeArena enhances LLMs with temporal dynamics for better multitasking, showing advanced models like GPT-4 still trail behind human temporal awareness.
Yikai Zhang, Siyu Yuan, Caiyu Hu, Kyle Richardson, Yanghua Xiao, Jiangjie Chen
PDF
Cite
Project
InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
We propose InCharacter, a method using psychological scales to evaluate the personality fidelity of role-playing agents (RPAs) powered by large language models.
Xintao Wang, Yunze Xiao, Jen-Tse Huang, Siyu Yuan, Rui Xu, Haoran Guo, Quan Tu, Yaying Fei, Ziang Leng, Wei Wang, Jiangjie Chen, Cheng Li, Yanghua Xiao
PDF
Cite
Code
Demo
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
We introduced TravelPlanner, a benchmark for assessing language agents’ planning abilities, showing that even advanced models like GPT-4 face difficulties with complex tasks.
Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, Yu Su
PDF
Cite
Dataset
Code
Demo
Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena
We propose AucArena to test LLMs in auctions, showing that they can strategize but with variable success, which indicates room for improvement.
Jiangjie Chen, Siyu Yuan, Rong Ye, Bodhisattwa Prasad Majumder, Kyle Richardson
PDF
Cite
Demo