
Taming Diffusion for Dataset Distillation with High Representativeness

Recent deep learning models demand ever-larger datasets, driving the need for dataset distillation to create compact, cost-efficient datasets while maintaining performance. Owing to the powerful image generation capability of diffusion models, they have been …

QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge

Monocular Depth Estimation (MDE) has emerged as a pivotal task in computer vision, supporting numerous real-world applications. However, deploying accurate depth estimation models on resource-limited edge devices, especially Application-Specific …
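
The snippet ends before the method details. As context for what post-training quantization does, a minimal symmetric per-tensor int8 weight quantizer in PyTorch might look like the sketch below; this is a generic baseline, not QuartDepth's actual scheme, and the function names are illustrative.

```python
import torch

def quantize_weight_int8(w: torch.Tensor):
    """Symmetric per-tensor int8 post-training quantization (generic sketch,
    not QuartDepth's actual scheme). Returns quantized ints and the scale."""
    scale = w.abs().max().clamp(min=1e-8) / 127.0   # map largest magnitude to 127
    q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    # Recover an approximate float tensor for simulated-quantization inference.
    return q.to(torch.float32) * scale

w = torch.randn(64, 64)
q, s = quantize_weight_int8(w)
w_hat = dequantize(q, s)
print((w - w_hat).abs().max())  # quantization error bounded by ~scale/2
```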

Sparse Learning for State Space Models on Mobile

Transformer models have been widely investigated across domains, as their long-range dependency handling and global contextual awareness drive popular AI applications such as ChatGPT, Gemini, and Alexa. State Space …
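
The "State Space" reference here is the standard linear recurrence x_t = A x_{t-1} + B u_t, y_t = C x_t, which scales linearly in sequence length. A toy sequential scan follows (illustrative dimensions; the paper's sparse mobile kernels are not shown in this snippet):

```python
import torch

def ssm_scan(A, B, C, u):
    """Sequential scan of a discretized linear state space model:
    x_t = A @ x_{t-1} + B @ u_t,  y_t = C @ x_t.
    O(L) in sequence length L, unlike O(L^2) self-attention."""
    L, _ = u.shape
    x = torch.zeros(A.shape[0])
    ys = []
    for t in range(L):
        x = A @ x + B @ u[t]
        ys.append(C @ x)
    return torch.stack(ys)

d_state, d_in, d_out, L = 16, 4, 4, 32
A = torch.eye(d_state) * 0.9            # toy stable transition matrix
B = torch.randn(d_state, d_in) * 0.1
C = torch.randn(d_out, d_state) * 0.1
y = ssm_scan(A, B, C, torch.randn(L, d_in))
print(y.shape)  # torch.Size([32, 4])
```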

RoRA: Efficient Fine-Tuning of LLM with Reliability Optimization for Rank Adaptation

Fine-tuning helps large language models (LLMs) recover degraded information and enhance task performance. Although Low-Rank Adaptation (LoRA) is widely used and effective for fine-tuning, we have observed that its scaling factor can limit or even …
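
The scaling factor in question is LoRA's standard alpha/r multiplier on the low-rank update. A minimal LoRA linear layer showing where that factor enters; this is vanilla LoRA, not RoRA's proposed reliability optimization:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight plus low-rank update: W x + (alpha / r) * B A x.
    The alpha/r scaling factor is the quantity RoRA argues can limit
    performance; this sketch shows standard LoRA, not RoRA's fix."""
    def __init__(self, d_in, d_out, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)        # pretrained weight stays frozen
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))  # zero-init: no change at start
        self.scaling = alpha / r                      # the LoRA scaling factor

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(64, 64)
print(layer(torch.randn(2, 64)).shape)  # torch.Size([2, 64])
```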

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

Diffusion Transformers have emerged as the preeminent models for a wide array of generative tasks, demonstrating superior performance across various applications. These promising results come at the cost of slow inference, as each …
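
The abstract is cut off before the mechanism, but one common way to exploit redundancy across diffusion timesteps is to reuse a layer's cached output when its input barely changes between steps. A hypothetical sketch of that caching idea (the similarity threshold and wrapper are assumptions, not LazyDiT's learned criterion):

```python
import torch

class LazyLayerWrapper:
    """Wraps a transformer layer and reuses its cached output when the new
    input is close to the previous one (cosine-similarity heuristic).
    Illustrative only; LazyDiT's actual skipping criterion is learned."""
    def __init__(self, layer, threshold=0.995):
        self.layer = layer
        self.threshold = threshold
        self.prev_in = None
        self.prev_out = None

    def __call__(self, x):
        if self.prev_in is not None:
            sim = torch.cosine_similarity(
                x.flatten(), self.prev_in.flatten(), dim=0)
            if sim > self.threshold:        # input barely changed since last step
                return self.prev_out        # skip compute, reuse cached output
        out = self.layer(x)
        self.prev_in, self.prev_out = x.detach(), out.detach()
        return out

wrapped = LazyLayerWrapper(torch.nn.Linear(32, 32))
x = torch.randn(4, 32)
print(torch.allclose(wrapped(x), wrapped(x + 1e-6)))  # second call hits the cache
```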

Numerical Pruning for Efficient Autoregressive Models

Transformers have emerged as the leading architecture in deep learning, proving to be versatile and highly effective across diverse domains beyond language and image processing. However, their impressive performance often incurs high computational …

Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment

Structured pruning for large language models (LLMs) has garnered significant academic interest due to its ability to efficiently compress and accelerate LLMs by eliminating redundant weight groups at a coarse granularity. Current structured …
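
For reference, a common coarse-grained baseline scores each weight group (here, the output rows of a linear layer) by its L2 norm and keeps the top fraction; the paper's hybrid-grained importance assessment is presumably more refined than this sketch:

```python
import torch
import torch.nn as nn

def prune_rows_by_l2(linear: nn.Linear, keep_ratio: float) -> nn.Linear:
    """Structured pruning baseline: score each output row by its L2 norm and
    keep the top fraction, yielding a smaller dense layer. The paper's
    hybrid-grained importance assessment is more involved than this."""
    scores = linear.weight.norm(p=2, dim=1)              # one score per output unit
    k = max(1, int(keep_ratio * linear.out_features))
    keep = torch.topk(scores, k).indices.sort().values   # preserve row order
    pruned = nn.Linear(linear.in_features, k, bias=linear.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(linear.weight[keep])
        if linear.bias is not None:
            pruned.bias.copy_(linear.bias[keep])
    return pruned

layer = nn.Linear(128, 64)
small = prune_rows_by_l2(layer, keep_ratio=0.5)
print(small.weight.shape)  # torch.Size([32, 128])
```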

Q-TempFusion: Quantization-Aware Temporal Multi-Sensor Fusion on Bird's-Eye View Representation

Recent advancements in bird's-eye view (BEV) perception models have highlighted the superior performance of LiDAR-camera fusion systems over single-modality approaches, garnering considerable interest in the field. Despite this progress, the integration …

Exploring Token Pruning in Vision State Space Models

State Space Models (SSMs) maintain linear computational complexity, in contrast to the quadratic attention modules in transformers, and have been applied to vision tasks as a new type of powerful vision foundation model. Inspired by the …
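
As a generic illustration of token pruning (scoring tokens and keeping the top-k, here using token norm as a stand-in for whatever importance score the paper derives):

```python
import torch

def prune_tokens(tokens: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Drop the least important tokens before the next block.
    tokens: (batch, num_tokens, dim). Importance here is the L2 norm of each
    token embedding, a placeholder for the paper's actual criterion."""
    b, n, d = tokens.shape
    k = max(1, int(keep_ratio * n))
    scores = tokens.norm(dim=-1)                       # (batch, num_tokens)
    idx = torch.topk(scores, k, dim=1).indices         # top-k per sample
    idx = idx.sort(dim=1).values                       # keep original token order,
                                                       # which matters for SSM scans
    return torch.gather(tokens, 1, idx.unsqueeze(-1).expand(b, k, d))

x = torch.randn(2, 196, 64)       # e.g. 14x14 patch tokens
print(prune_tokens(x, 0.5).shape) # torch.Size([2, 98, 64])
```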

Fast and Memory-Efficient Video Diffusion Using Streamlined Inference

The rapid progress in artificial intelligence-generated content (AIGC), especially with diffusion models, has significantly advanced the development of high-quality video generation. However, current video diffusion models exhibit demanding computational …