PhD Student
Shanghai Jiao Tong University · X-LANCE Lab
Researching multimodal agents with an emphasis on GUI interaction, domain adaptation, and real-environment evaluation.
I am interested in the moment when agent reasoning has to meet practical interfaces: messy software, professional tools, shifting workflows, and the need for reliable evaluation.
My research sits at the intersection of multimodal reasoning, real-environment evaluation, and agent interaction systems. I am especially interested in how agent capabilities can scale from software interfaces toward more general task competence in the real world.
Across recent projects, I have worked on plug-and-play improvement methods for GUI agents, real-environment benchmarks, large-scale trajectory data construction, multi-app desktop automation, and reinforcement learning pipelines for agent behavior.
My current work focuses on making agent systems more dependable in real environments by combining method design, benchmark construction, and systems-oriented implementation.
Researching multimodal agents with an emphasis on GUI interaction, domain adaptation, and real-environment evaluation.
Worked on multi-app desktop agents, early-stage reinforcement learning pipelines, and practical software interaction systems.
Contributed to large-scale trajectory data construction and mobile GUI evaluation benchmarks.
Built a strong foundation in systems, algorithms, architecture, and machine learning while moving into agent research.