flash_xxx.py files. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/2166
prefix in model constructors by @danieldk in https://github.com/huggingface/text-generation-inference/pull/2191
Weights class by @danieldk in https://github.com/huggingface/text-generation-inference/pull/2194
quantize subcommand by @danieldk in https://github.com/huggingface/text-generation-inference/pull/2120
server quantize: expose groupsize option by @danieldk in https://github.com/huggingface/text-generation-inference/pull/2225
quantize argument in get_weights_col_packed_qkv by @danieldk in https://github.com/huggingface/text-generation-inference/pull/2237
VlmCausalLM by @danieldk in https://github.com/huggingface/text-generation-inference/pull/2258

Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v2.1.1...v2.2.0
Fetched April 7, 2026