AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Abstract
in terms of memory size and bandwidth, pose significant deployment challenges.
Model quantization techniques