
MC#: Mixture Compressor for Mixture-of-Experts Large Models

Mixture-of-Experts (MoE) has emerged as an effective and efficient scaling mechanism for large language models (LLMs) and vision-language models (VLMs). By expanding a single feed-forward network into multiple expert branches, MoE increases model capacity while keeping the per-token computation cost nearly constant, since only a small subset of experts is activated for each input.
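
To make the routing idea concrete, here is a minimal PyTorch sketch of a generic top-k MoE layer: a learned router scores all experts for each token, and only the k highest-scoring expert FFNs are run and mixed. All class and parameter names are illustrative assumptions, not MC#'s implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Generic sparse MoE layer: each token is processed by its top-k experts."""

    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        # Router assigns a score per expert for each token.
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
            )
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten to a stream of tokens.
        tokens = x.reshape(-1, x.size(-1))
        logits = self.router(tokens)                  # (T, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)    # top-k experts per token
        weights = F.softmax(weights, dim=-1)          # renormalize gate weights
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            # Find tokens that routed to expert e (and in which top-k slot).
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue
            out[token_ids] += (
                weights[token_ids, slot].unsqueeze(-1) * expert(tokens[token_ids])
            )
        return out.reshape_as(x)

layer = MoELayer(d_model=64, d_hidden=256, n_experts=8, k=2)
y = layer(torch.randn(2, 10, 64))  # each token runs only 2 of the 8 expert FFNs
```

This is why MoE capacity scales with the number of experts while per-token FLOPs stay roughly fixed: adding experts grows the parameter count, but each token still passes through only k of them.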