|
Dengyang Jiang
Hi there! I'm Dengyang Jiang, currently an undergraduate student at Northwestern Polytechnical University, supervised by Prof. Lei Zhang.
My research interests encompass deep learning and computer vision. I am happy to communicate and collaborate with anyone interested in these fields.
Feel free to contact me via Email or Xiaohongshu (RedNote).
|
|
Education Experience
School of Automation, Northwestern Polytechnical University, 2022-2026
B.S.E., supervised by Prof. Lei Zhang, in the research team led by Prof. Yanning Zhang.
|
Work Experience
Shanghai Artificial Intelligence Laboratory, 2025.07-present
Research Intern, mentored by Dr. Bo Zhang; also working with Dr. Peng Gao.
SGIT AI Lab, State Grid Corporation of China, 2024.06-2025.06
Research Intern, mentored by Prof. Mengmeng Wang; also working with Dr. Jingdong Wang.
|
Research Interests
- Visual Generation with Diffusion Models: pre-training acceleration, reinforcement learning post-training, step distillation, image editing, synthetic datasets.
- Visual Representation and Perception: self-supervised learning, segmentation, affordance grounding, detection.
- Unified Modeling: unified multimodal understanding and generation models, vision generalists, the intersection and synergy of representation learning and generative modeling.
|
Projects
|
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding
Alpha VLLM Team, Shanghai AI Laboratory
An open-source foundational model with fully discrete diffusion modeling for seamless multi-modal generation and understanding.
|
|
Publications/Preprints (* Equal Contribution) (Google Scholar)
|
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
Dengyang Jiang, Mengmeng Wang, Liuzhuozheng Li, Lei Zhang, Haoyu Wang, Wei Wei, Guang Dai, Yanning Zhang, Jingdong Wang
Preprint, 2025
Self-representation alignment for enhancing representation learning and generation performance of diffusion transformers.
|
|
AutoMLGen: Navigating Fine-Grained Optimization for Coding Agents
Shangheng Du, Xiangchao Yan, Dengyang Jiang*, Jiakang Yuan, Yusong Hu, Xin Li, Liang He, Bo Zhang, Lei Bai
Preprint, 2025
An LLM-based agent that achieves leading machine learning engineering capability by combining a curated ML knowledge base with the proposed Monte Carlo Graph Search.
|
|
Deforming Videos to Masks: Flow Matching for Referring Video Segmentation
Zanyi Wang, Dengyang Jiang*, Liuzhuozheng Li, Sizhe Dang, Chengzu Li, Harry Yang, Guang Dai, Mengmeng Wang, Jingdong Wang
Preprint, 2025
Reformulating RVOS as a continuous, text-conditioned flow from video to mask and achieving leading performance.
|
|
AffordanceSAM: Segment Anything Once More in Affordance Grounding
Dengyang Jiang, Zanyi Wang, Hengzhuang Li, Sizhe Dang, Teli Ma, Wei Wei, Guang Dai, Lei Zhang, Mengmeng Wang
Preprint, 2025
Transferring SAM to the affordance grounding task and showing robust performance for both seen and unseen actions.
|
|
Low-Biased General Annotated Dataset Generation
Dengyang Jiang*, Haoyu Wang, Lei Zhang, Wei Wei, Guang Dai, Mengmeng Wang, Jingdong Wang, Yanning Zhang
CVPR, 2025
A framework for generating low-biased, general annotated datasets (e.g., ImageNet-like) that helps obtain more generalized visual backbones.
|
|
|