Shaofeng Yin (殷绍峰)

I'm a junior undergraduate student at Peking University, majoring in artificial intelligence. I'm currently visiting Berkeley AI Research, where I've had a wonderful time working with Haven, Jiaxin, and Zora.

My early research focused on generalization in computer vision. As vision-language models (VLMs) have grown increasingly powerful, I've become deeply interested in the study of visual agents.

In life, I have a deep appreciation for art, literature, and philosophy, and I'm committed to volunteer teaching programs. I used to be passionate about competitive programming and earned the title of Codeforces Master at the age of 15, but I later stepped away due to burnout.

Email  /  Scholar  /  LinkedIn  /  GitHub

profile photo

📚 Selected Publications

ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
Shaofeng Yin, Ting Lei, Yang Liu
ICCV, 2025
project page (coming soon!) / arXiv (coming soon!)

Recent benchmarks reveal significant gaps in real-world tool-use proficiency, particularly in functionally diverse multimodal settings that require multi-step reasoning. To bridge these gaps, we propose ToolEngine, a data generation pipeline that employs depth-first search (DFS) with a dynamic in-context example matching mechanism to simulate human-like tool-use reasoning.

Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection
Ting Lei, Shaofeng Yin, Yuxin Peng, Yang Liu
ECCV, 2024
project page / arXiv

In this paper, we introduce a novel framework for zero-shot HOI detection using Conditional Multi-Modal Prompts (CMMP). This approach enhances the generalization of large foundation models, such as CLIP, when fine-tuned for HOI detection.

Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection
Ting Lei, Shaofeng Yin, Yang Liu
CVPR, 2024
project page / arXiv

In this paper, we introduce a novel end-to-end open-vocabulary HOI detection framework with conditional multi-level decoding and fine-grained semantic enhancement (CMD-SE), harnessing the potential of vision-language models (VLMs).

🏆 Selected Awards

2025: First Prize in the CVPR International CulturalVQA Benchmark Challenge
2025: SenseTime Scholarship (30 recipients per year in China)
2024: National Scholarship (highest honor for undergraduates in China)

📸 Selected Photography

Photography 0
Photography 1
Photography 2
Photography 3
Photography 4
Photography 5

The template is stolen from Jon Barron.