Pu Zhao

Research Assistant Professor

Northeastern University

Biography

My research focuses on Inclusive AI, that is, making AI more open, robust, widely deployable, and efficient at inference:

  • DNN Efficiency for CNNs, Diffusion Models, and LLMs
    • Pruning, architecture search, quantization, token reduction, etc.
    • Achieve real-time inference for DNN models on edge devices
  • DNN Robustness
    • Explore various DNN attack methods
    • Investigate defense methods against various DNN attacks
  • LLM Openness
    • Train fully open-source LLMs
    • Apply LLM post-training techniques such as SFT, DPO, and GRPO
  • Optimization Methods
    • Bi-level optimization, ADMM, Bayesian optimization, second-order optimization, zeroth-order optimization, min-max optimization, etc.

Interests

  • Artificial Intelligence
  • Adversarial Robustness
  • Model Pruning & Quantization
  • Token Reduction
  • Real-Time Deployment
  • Optimization Methods
  • Diffusion Models and LLMs

Education

  • PhD in Computer Engineering, 2021

    Northeastern University

  • MEng in Electrical Engineering, 2016

    Shanghai Jiao Tong University

  • BSc in Electrical Engineering, 2013

    Shanghai Jiao Tong University

Recent Publications

(2025). Sparse Learning for State Space Models on Mobile. In ICLR 2025.

(2025). RoRA: Efficient Fine-Tuning of LLM with Reliability Optimization for Rank Adaptation. In ICASSP 2025.

(2025). QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge. In CVPR 2025.

(2025). Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment. In AAAI 2025.

(2025). Q-TempFusion: Quantization-Aware Temporal Multi-Sensor Fusion on Bird's-Eye View Representation. In WACV 2025.

(2024). Search for Efficient Large Language Models. In NeurIPS 2024.

(2024). Pruning Foundation Models for High Accuracy without Retraining. In EMNLP 2024 Findings.

(2024). TSLA: A Task-Specific Learning Adaptation for Semantic Segmentation on Autonomous Vehicles Platform. In TCAD.

(2024). HotaQ: Hardware Oriented Token Adaptive Quantization for Large Language Models. In TCAD.

(2024). Neural architecture search for adversarial robustness via learnable pruning. In Front. HPC.

Accomplishments

Third Place in U.S. Department of Transportation’s Inclusive Design Challenge

Third Place in the second phase of the national 2022 Inclusive Design Challenge from the U.S. Department of Transportation

First Place in ISLPED 2020 Design Contest

CoCoPIE: A Framework of Compression-Compilation Co-design Towards Ultra-High Energy Efficiency and Real Time DNN Inference on Mobile Devices

Best Paper Nomination in ICCAD 2018

Defensive Dropout for Hardening Deep Neural Networks under Adversarial Attacks

National Scholarship for Postgraduate Students

Academic Excellence Scholarship for Postgraduate Students

Academic Excellence Scholarship

First Prize at 26th National Physics Contest for College Students

Experience

Research Assistant Professor

Northeastern University

Sep 2021 – Present, Boston, MA

Research Internship

Google

May 2020 – Sep 2020, California
Explored image search with text assistance.

Research Fall Internship-Graduate

IBM Thomas J. Watson Research Center

Dec 2018 – Mar 2019, New York
Investigated the defense performance of model connection against three kinds of attacks: backdoor attacks, fault injection attacks, and adversarial attacks.

Teaching Assistant

Computer Architecture

Sep 2017 – Dec 2017, Boston
Assisted with teaching computer architecture, including designing the course project and answering questions during office hours.

Research Assistant

Computer Engineering

Sep 2016 – Aug 2021, Boston
Researched adversarial robustness and model efficiency.
