Pengfei Song 宋鹏飞

📍 Nanjing, China | 南京

I am a Master's student at Southeast University, majoring in Intelligent Science and Technology. My research focuses on computer vision, multimodal learning, and large language models, particularly on test-time adaptation, semantic segmentation, and vision-language models.

I have interned at Microsoft (AI Agent & LLM) and Huawei (AI Infra & Multimodal LLM Training), gaining hands-on experience in model training, deployment, and AI coding workflows.

Education

Southeast University — M.S. in Intelligent Science and Technology | Sept 2024 – Jun 2027 (Expected)
School of Computer Science and Engineering. GPA: 87/100 (Top 8%)
Southeast University — B.S. in Applied Mathematics | Sept 2020 – Jun 2024
School of Mathematics. GPA: 88/100 (Top 15%)
UC Berkeley — Extension Program | Aug 2022 – Dec 2022
Fully funded by a China Scholarship Council (CSC) state-sponsored scholarship (Top 5%)

Research Interests

My research interests focus on efficient and robust visual understanding, particularly test-time adaptation, semantic segmentation, and vision-language models.

Experience

Huawei — AI Infra / Multimodal LLM Training Intern | Nov 2025 – Apr 2026
Worked on adapting a 13B multimodal LLM for on-device AI assistants and domain-specific Q&A. Built end-to-end SFT pipelines with PyTorch, Transformers, and LoRA/QLoRA, and validated inference on Ascend NPUs. Contributed to model compression (RQ-VAE quantization, BF16, distillation) and to vLLM deployment, including custom kernel development in OpenAI Triton.
Microsoft — AI Large Model Intern | Sept 2025 – Nov 2025
Designed and deployed an agent for automated ticket processing, built on real enterprise workflow data from MS Teams and using NVIDIA H100 GPUs. Implemented multi-step task orchestration, tool integration, state management, and Azure deployment. Proficient in Claude Code, GitHub Copilot, and AI-assisted coding workflows.

Publications

DiCLIP: Diffusion Model Enhances CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation (Accepted)
Zhiwei Yang, Pengfei Song, Yucong Meng, Kexue Fu, Shuo Wang, Zhijian Song
IEEE Transactions on Image Processing (TIP), 2025
ASC for Training-Free Adaptation of VLMs (Under Review)
Pengfei Song, et al.
CCF-A Conference, 2025
Human Semantic Segmentation using Millimeter Wave Radar Point Clouds (Accepted)
Pengfei Song, et al.
IEEE CSCWD, 2023

Research Projects

Cross-Modal Semantic Enhanced Test-Time Adaptation Framework | Nov 2024 – Apr 2025
Proposed CSE, which combines semantic prior knowledge with a caching mechanism to achieve efficient test-time adaptation without backpropagation, improving computational efficiency by over 10× compared with existing training-based methods. Designed cross-modal semantic enhancement that compensates for the limitations of visual features using dictionary-based semantic priors. Proposed top-k label exploration and outlier rejection strategies, achieving a 2.93% improvement over the baseline on ImageNet and its variants. Served as first author; one paper is under review at a CCF-A conference.
Human Semantic Segmentation from mmWave Radar Sparse Point Clouds | Jun 2023 – Aug 2024
Compared with cameras and LiDAR, mmWave radar offers stronger privacy protection. Built a semantic segmentation framework with spatio-temporal feature extraction modules for radar point clouds. Served as primary contributor; one first-author paper was accepted at IEEE CSCWD 2023. Further proposed DiCLIP, a weakly supervised semantic segmentation (WSSS) method that uses diffusion models to enhance CLIP's dense representations; the paper was accepted by TIP in 2025.

Honors & Awards

Skills