
v3.0.2

$ npx -y @buildinternet/releases show rel_aC2VldelTfmyZTztrOMRf

TL;DR

New transformers backend with FlashAttention support, running at roughly the same performance as pure TGI. It lets all models that are not officially supported run directly in TGI. Congrats @Cyrilvallez!

New models unlocked: Cohere2, olmo, olmo2, helium.

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v3.0.1...v3.0.2

Fetched April 7, 2026