AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Abstract
in terms of memory size and bandwidth, pose significant deployment challenges.
Model quantization techniques