v2.1.0

Notable changes

  • New model: Gemma 2

  • Multi-LoRA adapters: you can now serve multiple LoRA adapters from the same TGI deployment (https://github.com/huggingface/text-generation-inference/pull/2010).

  • Faster GPTQ inference and Marlin support (up to 2x speedup).

  • Reworked the entire scheduling logic (better block allocation, paving the way for further speedups in future releases).

  • Many ROCm support improvements and bug fixes.

  • Many new contributors! Thank you all for your contributions.
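
To illustrate the multi-LoRA item above, here is a minimal Python sketch of how a client might select an adapter per request. It assumes the server was started with several adapters preloaded (for example through the LORA_ADAPTERS environment variable) and that the adapter is chosen with an "adapter_id" field inside "parameters" of the /generate request; check both against the TGI documentation for your version. The helper names, the adapter IDs, and the URL are hypothetical.

```python
import json
import urllib.request


def build_generate_payload(prompt, adapter_id=None, max_new_tokens=40):
    """Build a /generate request body, optionally targeting one LoRA adapter.

    Assumption: the adapter is selected via parameters["adapter_id"].
    Omitting it sends the request to the base model.
    """
    parameters = {"max_new_tokens": max_new_tokens}
    if adapter_id is not None:
        parameters["adapter_id"] = adapter_id
    return {"inputs": prompt, "parameters": parameters}


def generate(base_url, payload, timeout=30.0):
    """POST the payload to a running TGI server and return the decoded JSON."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Two requests against the same deployment, each routed to a different
# (hypothetical) adapter; a third goes to the base model.
payload_a = build_generate_payload("Hello, who are you?", "my-org/adapter-a")
payload_b = build_generate_payload("Hello, who are you?", "my-org/adapter-b")
payload_base = build_generate_payload("Hello, who are you?")
```

With a server listening locally, a call such as generate("http://127.0.0.1:3000", payload_a) would send the first request. Keeping payload construction separate from the network call makes the adapter routing easy to inspect without a running deployment.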

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v2.0.3...v2.1.0

Fetched April 7, 2026