1

FasterVD: On Acceleration of Video Diffusion Models

Equipped with Denoising Diffusion Probabilistic Models, video content generation has gained significant research interest recently. However, diffusion pipelines call for intensive computation and model storage, which poses challenges for their wide …

Lotus: learning-based online thermal and latency variation management for two-stage detectors on edge devices

Two-stage object detectors exhibit high accuracy and precise localization, especially for identifying small objects that are favorable for various edge applications. However, the high computation costs associated with two-stage detection methods …

Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge

With the ever-increasing popularity of edge devices, it is necessary to implement real-time segmentation on the edge for autonomous driving and many other applications. Vision Transformers (ViTs) have shown considerably stronger results for many …

Condense: A Framework for Device and Frequency Adaptive Neural Network Models on the Edge

With the popularity of battery-powered edge computing, an important yet under-explored problem is the supporting of DNNs for diverse edge devices. On the one hand, different edge platforms have various runtime requirements and computation/memory …

Towards Real-Time Segmentation on the Edge

The research in real-time segmentation mainly focuses on desktop GPUs. However, autonomous driving and many other applications rely on real-time segmentation on the edge, and current arts are far from the goal. In addition, recent advances in vision …

Advancing Model Pruning via Bi-level Optimization

The deployment constraints in practical applications necessitate the pruning of large-scale deep learning models, i.e., promoting their weight sparsity. As illustrated by the Lottery Ticket Hypothesis (LTH), pruning also has the potential of …

All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

During the deployment of deep neural networks (DNNs) on edge devices, many research efforts are devoted to the limited hardware resource. However, little attention is paid to the influence of dynamic power management. As edge devices typically only …

Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution

Deep learning-based super-resolution (SR) has gained tremendous popularity in recent years because of its high image quality performance and wide application scenarios. However, prior methods typically suffer from large amounts of computations and …

Learning to Generate Image Source-Agnostic Universal Adversarial Perturbations

Adversarial perturbations are critical for certifying the robustness of deep learning models. A ``universal adversarial perturbation'' (UAP) can simultaneously attack multiple images, and thus offers a more unified threat model, obviating an …

Pruning-as-Search: Effcient Neural Architecture Search via Channel Pruning and Structural Reparameterization

Neural architecture search (NAS) and network pruning are widely studied efficient AI techniques, but not yet perfect. NAS performs exhaustive candidate architecture search, incurring tremendous search cost. Though (structured) pruning can simply …