Publications

Papers, collaborations, and evolving research directions.

This page keeps public-facing publication information concise. Works still under review carry venue-neutral status labels until their outcomes can be shared.

Under Review First Author 2026

ASIL: Replacing Screenshot-and-Click with Structured State and Semantic Actions

Rui Xie, Lu Chen

An agent-native interface that lets software-operating agents work from structured software state and code-executable semantic actions instead of screenshot observation and low-level GUI events.

Under Review First Author 2026

GUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play Annotation

Rui Xie, Zhi Gao, Chenrui Shi, Zirui Shang, Lu Chen, Qing Li

A plug-and-play framework that retrieves task-relevant tutorial videos and distills transferable planning and grounding knowledge for domain-specific GUI agents.

Under Review Co-first Author 2026

MatToolBench: A Real-Environment Benchmark for Evaluating Multimodal Agents on Professional Materials Science Software

Mei Wu, Rui Xie, Runyu Zhang, Lu Chen, Bo Chen, Kai Yu, Xin Chen

A real-environment benchmark that evaluates multimodal agents on professional materials science workflows spanning GUI tools, code execution, and cross-tool coordination.

AAAI 2026 Contributing Author 2026

TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for Generalized GUI Agents

Bofei Zhang, Zirui Shang, Zhi Gao, Wang Zhang, Rui Xie, Xiaojian Ma, Tao Yuan, Xinxiao Wu, Song-Chun Zhu, Qing Li

A large-scale data construction effort that turns multimodal web tutorials into GUI trajectories for generalized agent training and evaluation.

Under Review Third Author 2025

Mobile-Env: Building Qualified Evaluation Benchmarks for LLM-GUI Interaction

Danyang Zhang, Zhennan Shen, Rui Xie, Situo Zhang, Tianbao Xie, Zihan Zhao, Siyuan Chen, Lu Chen, Hongshen Xu, Ruisheng Cao, Kai Yu

A benchmark effort for evaluating LLM-based GUI interaction in mobile environments, with isolated tasks, simulator support, and behavior analysis.