Pu Zhao

Research Assistant Professor

Northeastern University

Biography

My research focuses on Inclusive AI, that is, making AI more open, robust, widely deployable, and efficient at inference:

DNN Efficiency for CNNs, Diffusion Models, and LLMs
- Pruning, architecture search, quantization, token reduction, etc.
- Achieve real-time inference for DNN models on edge devices
DNN Robustness
- Explore various DNN attack methods
- Investigate defense methods against various DNN attacks
LLM Openness
- Train fully open-source LLMs
- Adopt various LLM post-training techniques, such as SFT, DPO, and GRPO
Optimization Methods
- Bi-level optimization, ADMM, Bayesian optimization, second-order optimization, zeroth-order optimization, min-max optimization, etc.

Interests

Artificial Intelligence
Adversarial Robustness
Model Pruning & Quantization
Token Reduction
Real-Time Deployment
Optimization Methods
Diffusion Models and LLMs

Education

PhD in Computer Engineering, 2021
Northeastern University
MEng in Electrical Engineering, 2016
Shanghai Jiao Tong University
BSc in Electrical Engineering, 2013
Shanghai Jiao Tong University

Featured Publications

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

Diffusion Transformers have emerged as the preeminent models for a wide array of generative tasks, demonstrating superior performance …

Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Yanyu Li, Yifan Gong, Kai Zhang, Hao Tan, Jason Kuen, Henghui Ding, Zhihao Shu, Wei Niu, Pu Zhao, Yanzhi Wang, Jiuxiang Gu

Numerical Pruning for Efficient Autoregressive Models

Transformers have emerged as the leading architecture in deep learning, proving to be versatile and highly effective across diverse …

Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Jing Liu, Ruiyi Zhang, Ryan A. Rossi, Hao Tan, Tong Yu, Xiang Chen, Yufan Zhou, Tong Sun, Pu Zhao, Yanzhi Wang, Jiuxiang Gu

Exploring Token Pruning in Vision State Space Models

State Space Models (SSMs) have the advantage of keeping linear computational complexity compared to attention modules in transformers, …

Zheng Zhan, Zhenglun Kong, Yifan Gong, Yushu Wu, Zichong Meng, Hangyu Zheng, Xuan Shen, Stratis Ioannidis, Wei Niu, Pu Zhao, Yanzhi Wang

Fast and Memory-Efficient Video Diffusion Using Streamlined Inference

The rapid progress in artificial intelligence-generated content (AIGC), especially with diffusion models, has significantly advanced …

Zheng Zhan, Yushu Wu, Yifan Gong, Zichong Meng, Zhenglun Kong, Changdi Yang, Geng Yuan, Pu Zhao, Wei Niu, Yanzhi Wang

Rethinking Token Reduction for State Space Models

Recent advancements in State Space Models (SSMs) have attracted significant interest, particularly in models optimized for parallel …

Zheng Zhan, Yushu Wu, Zhenglun Kong, Changdi Yang, Yifan Gong, Xuan Shen, Xue Lin, Pu Zhao, Yanzhi Wang

DiffClass: Diffusion-Based Class Incremental Learning

Class Incremental Learning (CIL) is challenging due to catastrophic forgetting. On top of that, exemplar-free CIL is even more …

Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang

InstructGIE: Towards Generalizable Image Editing

Recent advances in image editing have been driven by the development of denoising diffusion models, marking a significant leap forward …

Zichong Meng, Changdi Yang, Jun Liu, Hao Tang, Pu Zhao, Yanzhi Wang

Condense: A Framework for Device and Frequency Adaptive Neural Network Models on the Edge

Yifan Gong, Pu Zhao, Zheng Zhan, Yushu Wu, Chao Wu, Zhenglun Kong, Minghai Qin, Caiwen Ding, Yanzhi Wang

All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

During the deployment of deep neural networks (DNNs) on edge devices, many research efforts are devoted to the limited hardware …

Yifan Gong, Zheng Zhan, Pu Zhao, Yushu Wu, Chao Wu, Caiwen Ding, Weiwen Jiang, Minghai Qin, Yanzhi Wang

CoCoPIE: Enabling Real-Time AI on Off-the-Shelf Mobile Devices via Compression-Compilation Co-Design

Assuming hardware is the major constraint for enabling real-time mobile intelligence, the industry has mainly dedicated their efforts …

Hui Guan, Shaoshan Liu, Xiaolong Ma, Wei Niu, Bin Ren, Xipeng Shen, Yanzhi Wang, Pu Zhao

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications. …

Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren

Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

Mode connectivity provides novel geometric insights on analyzing loss landscapes and enables building high-accuracy pathways between …

Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, Xue Lin

Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent

Despite the great achievements of the modern deep neural networks (DNNs), the vulnerability/robustness of state-of-the-art DNNs raises …

Pu Zhao, Pin-Yu Chen, Siyue Wang, Xue Lin

Structured Adversarial Attack: Towards General Implementation and Better Interpretability

When generating adversarial examples to attack deep neural networks (DNNs), Lp norm of the added perturbation is usually used to …

Kaidi Xu, Sijia Liu, Pu Zhao, Pin-Yu Chen, Huan Zhang, Quanfu Fan, Deniz Erdogmus, Yanzhi Wang, Xue Lin

Recent Publications

Lin Zhao, Yushu Wu, Xinru Jiang, Jianyang Gu, Yanzhi Wang, Xiaolin Xu, Pu Zhao, Xue Lin (2025). Taming Diffusion for Dataset Distillation with High Representativeness. In ICML 2025.

Xuan Shen, Weize Ma, Jing Liu, Changdi Yang, Rui Ding, Quanyi Wang, Henghui Ding, Wei Niu, Yanzhi Wang, Pu Zhao, Jun Lin, Jiuxiang Gu (2025). QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge. In CVPR 2025.

Xuan Shen, Hangyu Zheng, Yifan Gong, Zhenglun Kong, Changdi Yang, Zheng Zhan, Yushu Wu, Xue Lin, Yanzhi Wang, Pu Zhao, Wei Niu (2025). Sparse Learning for State Space Models on Mobile. In ICLR 2025.

Jun Liu, Zhenglun Kong, Peiyan Dong, Xuan Shen, Pu Zhao, Hao Tang, Geng Yuan, Wei Niu, Wenbin Zhang, Xue Lin, Dong Huang, Yanzhi Wang (2025). RoRA: Efficient Fine-Tuning of LLM with Reliability Optimization for Rank Adaptation. In ICASSP 2025.

Jun Liu, Zhenglun Kong, Pu Zhao, Changdi Yang, Hao Tang, Xuan Shen, Geng Yuan, Wei Niu, Wenbin Zhang, Xue Lin, Dong Huang, Yanzhi Wang (2025). Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment. In AAAI 2025.

Pinrui Yu, Zhenglun Kong, Pu Zhao, Peiyan Dong, Hao Tang, Fei Sun, Xue Lin, Yanzhi Wang (2025). Q-TempFusion: Quantization-Aware Temporal Multi-Sensor Fusion on Bird's-Eye View Representation. In WACV 2025.

Xuan Shen, Pu Zhao, Yifan Gong, Zhenglun Kong, Zheng Zhan, Yushu Wu, Ming Lin, Chao Wu, Xue Lin, Yanzhi Wang (2024). Search for Efficient Large Language Models. In NeurIPS 2024.

Pu Zhao, Fei Sun, Xuan Shen, Pinrui Yu, Zhenglun Kong, Yanzhi Wang, Xue Lin (2024). Pruning Foundation Models for High Accuracy without Retraining. In EMNLP 2024 Findings.

Jun Liu, Zhenglun Kong, Pu Zhao, Weihao Zeng, Hao Tang, Xuan Shen, Changdi Yang, Wenbin Zhang, Geng Yuan, Wei Niu, Xue Lin, Yanzhi Wang (2024). TSLA: A Task-Specific Learning Adaptation for Semantic Segmentation on Autonomous Vehicles Platform. In TCAD.

Xuan Shen, Zhaoyang Han, Lei Lu, Zhenglun Kong, Peiyan Dong, Zhengang Li, Yanyue Xie, Chao Wu, Miriam Leeser, Pu Zhao, Xue Lin, Yanzhi Wang (2024). HotaQ: Hardware Oriented Token Adaptive Quantization for Large Language Models. In TCAD.

Demos

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

A demonstration in the IJCAI-PRICAI 2020 demo track of real-time on-mobile inference for different DNN applications.

Last updated on May 5, 2025

Real-Time Inference on Mobile for Various DNN Applications

A demonstration in the ECCV 2020 demonstration track of real-time inference for various DNN applications on mobile.

CoCoPIE: A Framework of Compression-Compilation Co-design towards Ultra-high Energy Efficiency and Real-Time DNN Inference on Mobile Devices

First Prize in the ISLPED 2020 Design Contest with the CoCoPIE demo.

Accomplishments

Third Place in U.S. Department of Transportation’s Inclusive Design Challenge

U.S. Department of Transportation Aug 2022

Third Place in the second phase of the national 2022 Inclusive Design Challenge from the U.S. Department of Transportation.

First Place in ISLPED 2020 Design Contest

ISLPED Aug 2020

CoCoPIE: A Framework of Compression-Compilation Co-design Towards Ultra-High Energy Efficiency and Real Time DNN Inference on Mobile Devices

Best Paper Nomination in ICCAD 2018

ICCAD Mar 2018

Defensive Dropout for Hardening Deep Neural Networks under Adversarial Attacks

First Prize at 26th National Physics Contest for College Students

Shanghai Jiao Tong University Dec 2009

Experience

Research Assistant Professor

Northeastern University

Sep 2021 – Present Boston, MA

Research Internship

Google

May 2020 – Sep 2020 California

Explored image search with text assistance.

Research Fall Internship-Graduate

IBM Thomas J. Watson Research Center

Dec 2018 – Mar 2019 New York

Investigated the defense performance of model connection against three kinds of attacks: backdoor attacks, fault injection attacks, and adversarial attacks.