Shaofeng Yin (殷绍峰)
I'm a junior undergraduate student at Peking University, majoring in artificial intelligence. I'm currently visiting Berkeley AI Research and have had a wonderful time working with Haven, Jiaxin and Zora.
My early research focused on generalization in computer vision. As vision-language models (VLMs) have grown increasingly powerful, I've become deeply interested in the study of visual agents.
In life, I have a deep appreciation for art, literature, and philosophy. I'm also strongly committed to volunteer teaching programs. I used to be passionate about algorithm competitions and earned the title of Codeforces Master at the age of 15, but I later stepped away from them due to burnout.
Email /
Scholar /
LinkedIn /
GitHub
ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
Shaofeng Yin,
Ting Lei,
Yang Liu
ICCV, 2025
project page (coming soon!) /
arXiv (coming soon!)
Recent benchmarks reveal significant gaps in real-world tool-use proficiency, particularly in functionally diverse multimodal settings requiring multi-step reasoning. To bridge this gap, we propose ToolEngine, a novel data generation pipeline that employs Depth-First Search (DFS) with a dynamic in-context example matching mechanism to simulate human-like tool-use reasoning.
Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection
Ting Lei,
Shaofeng Yin,
Yuxin Peng,
Yang Liu
ECCV, 2024
project page /
arXiv
In this paper, we introduce a novel framework for zero-shot HOI detection using Conditional Multi-Modal Prompts, namely CMMP. This approach enhances the generalization of large foundation models, such as CLIP, when fine-tuned for HOI detection.
Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection
Ting Lei,
Shaofeng Yin,
Yang Liu
CVPR, 2024
project page /
arXiv
In this paper, we introduce a novel end-to-end open-vocabulary HOI detection framework with conditional multi-level decoding and fine-grained semantic enhancement (CMD-SE), harnessing the potential of Visual-Language Models (VLMs).