Education
Southeast University — M.S. in Intelligent Science and Technology | Sept 2024 – Jun 2027 (Expected)
School of Computer Science and Engineering. GPA: 87/100 (Top 8%)
Southeast University — B.S. in Applied Mathematics | Sept 2020 – Jun 2024
School of Mathematics. GPA: 88/100 (Top 15%)
UC Berkeley — Extension Program | Aug 2022 – Dec 2022
Fully funded by a China Scholarship Council (CSC) national public-dispatch scholarship (Top 5%)
Research Interests
My research interests focus on efficient and robust visual understanding, particularly:
- Test-Time Adaptation (TTA) — Training-free adaptation of vision-language models with semantic prior knowledge
- Weakly Supervised Semantic Segmentation (WSSS) — Leveraging diffusion models to enhance CLIP dense representations
- Multimodal Learning — Cross-modal semantic enhancement and vision-language model alignment
- LLM Agent Systems — Multi-step task planning, tool integration, and intelligent workflow automation
- Model Compression & Inference Optimization — RQ-VAE quantization, BF16 mixed precision, distillation, vLLM deployment
Experience
Huawei — AI Infra / Multimodal LLM Training Intern | Nov 2025 – Apr 2026
Worked on adapting a 13B-parameter multimodal LLM for on-device AI assistants and domain-specific Q&A.
Built end-to-end SFT pipelines with PyTorch, Transformers, and LoRA/QLoRA; performed inference validation on Ascend NPU.
Participated in model compression (RQ-VAE quantization, BF16 mixed precision, distillation) and vLLM deployment, including custom kernel development in OpenAI Triton.
Microsoft — AI Large Model Intern | Sept 2025 – Nov 2025
Designed and deployed an intelligent agent for automated ticket processing, built on real MS Teams enterprise workflow data and running on NVIDIA H100 GPUs. Implemented multi-step task orchestration, tool integration, state management, and Azure deployment.
Applied Claude Code, GitHub Copilot, and AI-assisted coding workflows throughout development.
Publications
DiCLIP: Diffusion Model Enhances CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation (Accepted)
Zhiwei Yang, Pengfei Song, Yucong Meng, Kexue Fu, Shuo Wang, Zhijian Song
IEEE Transactions on Image Processing (TIP), 2025
ASC for Training-Free Adaptation of VLMs (Under Review)
Pengfei Song, et al.
CCF-A Conference, 2025
Human Semantic Segmentation using Millimeter Wave Radar Point Clouds (Accepted)
Pengfei Song, et al.
IEEE CSCWD, 2023
Research Projects
Cross-Modal Semantic Enhanced Test-Time Adaptation Framework | Nov 2024 – Apr 2025
Proposed CSE, combining semantic prior knowledge with a caching mechanism to achieve efficient test-time adaptation without backpropagation, improving computational efficiency by over 10× compared to existing training-based methods.
Designed cross-modal semantic enhancement to compensate for visual-feature limitations via dictionary semantic priors.
Proposed top-k label exploration and outlier-rejection strategies, achieving a 2.93% improvement over the baseline on ImageNet and its variants.
First author; one CCF-A paper under review.
Human Semantic Segmentation from mmWave Radar Sparse Point Clouds | Jun 2023 – Aug 2024
Built a semantic segmentation framework with spatio-temporal feature-extraction modules for sparse mmWave radar point clouds, which offer inherent privacy advantages over cameras and LiDAR.
Primary contributor; one first-author paper accepted at IEEE CSCWD 2023.
Further proposed DiCLIP, a novel WSSS method that leverages diffusion models to enhance CLIP's dense representations; one TIP paper accepted in 2025.
Honors & Awards
- National Scholarship — 2025
- 14th Huawei Cup Graduate Mathematical Modeling Competition — Second Prize (Top 5%), 2024
- Mathematical Contest in Modeling (MCM/ICM) — Meritorious Winner (First Prize), 2023
- China Scholarship Council (CSC) Scholarship — Top 5%, 2022
- Zhishan Excellence Scholarship — Top 10%, 2022 (Two consecutive terms)
Skills
- Deep Learning: PyTorch, Transformers, Hugging Face, LoRA/QLoRA, Distributed Training
- LLM & Agent: Claude Code, GitHub Copilot, vLLM, RQ-VAE, Model Distillation
- Hardware: NVIDIA H100, Ascend NPU, OpenAI Triton kernel development
- Programming: Python, C/C++, Shell, Git, Azure Deployment
- Languages: CET-4 (603), CET-6 (616), IELTS 6.5