v2.2.0

$ npx -y @buildinternet/releases show rel_x4Xuyop6jZiP_W4o-MgAl

Notable changes

  • Llama 3.1 support, including 405B, with FP8 support in many mixed configurations (FP8, AWQ, GPTQ, FP8+FP16).
  • Gemma2 softcap support
  • Deepseek v2 support.
  • Lots of internal reworks/cleanup (allowing for cool features)
  • Lots of AWQ/GPTQ work with marlin kernels (everything should be faster by default)
  • Flash decoding support (set the FLASH_DECODING=1 environment variable; this will probably enable some nice improvements in the future)
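
As a rough illustration of the flash decoding opt-in, the sketch below exports FLASH_DECODING=1 before starting the server. The model id is only an illustrative choice, and the fallback message is an addition for environments where text-generation-launcher is not installed; neither comes from these release notes.

```shell
# Illustrative model id (an assumption, not taken from the release notes).
model="meta-llama/Meta-Llama-3.1-8B-Instruct"

# Opt in to flash decoding for processes started from this shell.
export FLASH_DECODING=1

# Start the server if the launcher is on PATH; otherwise only report the setting.
if command -v text-generation-launcher >/dev/null 2>&1; then
  text-generation-launcher --model-id "$model"
else
  echo "text-generation-launcher not found; FLASH_DECODING=$FLASH_DECODING"
fi
```

Leaving the variable unset keeps the default behavior, so the opt-in can be tried on a single deployment first.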

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v2.1.1...v2.2.0

Fetched April 7, 2026