GUIDE
A plug-and-play framework for reducing planning and grounding bias in domain-specific GUI agents by retrieving and distilling task-relevant tutorial knowledge.
These projects cover methods for improving GUI agents, domain-specific benchmarks, multi-app automation, data construction, and reinforcement learning pipelines.
A real-environment benchmark for multimodal agents working with professional materials science software, covering GUI operation, code execution, and cross-tool workflows.
A large-scale trajectory data construction effort that transforms multimodal web tutorials into agent training data across operating systems and application types.
A multi-application desktop agent for macOS workflows, including browser use, calendar management, and messaging coordination.
Early-stage reinforcement learning work for GUI agents, including task generation and training-method adaptation.
A benchmark effort for evaluating LLM-based GUI interaction in mobile environments with isolated tasks, replay infrastructure, and behavior analysis.