Homepage - Yuquan Xie

Yuquan Xie

Harbin Institute of Technology, Shenzhen

xieyuquan20016(at)gmail.com Google Scholar GitHub

Education

Harbin Institute of Technology, Shenzhen

M.S. in Computer Science and Technology

Sep. 2023 - present

Central South University

B.S. in Computer Science and Technology

Sep. 2019 - Jul. 2023

Honors & Awards

Special Grand Prize Scholarship for Postgraduate Students

2023

News

2024

Our paper Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks is accepted to NeurIPS 2024!

Sep 26

Selected Publications

Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks

Zaijing Li, Yuquan Xie, Rui Shao, Gongwei Chen, Dongmei Jiang, Liqiang Nie

Conference on Neural Information Processing Systems (NeurIPS) 2024 Poster

Building a general-purpose agent is a long-standing vision in the field of artificial intelligence. Existing agents have made remarkable progress in many domains, yet they still struggle to complete long-horizon tasks in an open world. We attribute this to the lack of necessary world knowledge and multimodal experience that can guide agents through a variety of long-horizon tasks. In this paper, we propose a Hybrid Multimodal Memory module to address the above challenges. It 1) transforms knowledge into a Hierarchical Directed Knowledge Graph that allows agents to explicitly represent and learn world knowledge, and 2) summarises historical information into an Abstracted Multimodal Experience Pool that provides agents with rich references for in-context learning. On top of the Hybrid Multimodal Memory module, a multimodal agent, Optimus-1, is constructed with a dedicated Knowledge-guided Planner and Experience-Driven Reflector, contributing to better planning and reflection in the face of long-horizon tasks in Minecraft. Extensive experimental results show that Optimus-1 significantly outperforms all existing agents on challenging long-horizon task benchmarks and exhibits near human-level performance on many tasks. In addition, we introduce various Multimodal Large Language Models (MLLMs) as the backbone of Optimus-1. Experimental results show that Optimus-1 exhibits strong generalization with the help of the Hybrid Multimodal Memory module, outperforming the GPT-4V baseline on many tasks.

[Paper] [Code]
