v0.9.1

$ npx -y @buildinternet/releases show rel_H7HrNYzZq02gKgq7yziW8

Highlights

  • server: add non-flash MPT support
  • server: decrease memory fragmentation

Features

  • server: use latest flash attention
  • router: add argument for hostname
  • docs: add help text for the options in text-generation-benchmark

Fixes

  • makefile: update server/Makefile to include Makefile-vllm
  • server: handle loading from local files for MPT
  • server: avoid errors for very small top_p values
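
The top_p fix above concerns nucleus (top-p) sampling. The release notes do not say how the errors arose, so the following is only a sketch of one plausible failure mode, not the text-generation-inference code: when top_p is tiny, 1 - top_p rounds to 1.0 in floating point, every token falls under the removal threshold, and nothing is left to sample from. The function name top_p_keep_mask and the min_tokens_to_keep parameter are hypothetical names chosen for illustration.

```python
def top_p_keep_mask(probs, top_p, min_tokens_to_keep=1):
    """Return a list of booleans marking which tokens survive top-p filtering.

    Illustrative sketch only; names and structure are assumptions and are not
    taken from text-generation-inference.
    """
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")

    # Visit tokens from least to most probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i])

    remove = [False] * len(probs)
    cumulative = 0.0
    for idx in order:
        cumulative += probs[idx]
        # Drop the low-probability tail whose total mass is at most 1 - top_p.
        # For a very small top_p, 1.0 - top_p evaluates to exactly 1.0, so
        # this condition can hold for every token.
        if cumulative <= 1.0 - top_p:
            remove[idx] = True

    # Guard: always keep the most probable tokens so the result is never empty.
    if min_tokens_to_keep > 0:
        for idx in order[-min_tokens_to_keep:]:
            remove[idx] = False

    return [not r for r in remove]


if __name__ == "__main__":
    probs = [0.25, 0.5, 0.25]
    print(top_p_keep_mask(probs, 1e-20, min_tokens_to_keep=0))
    print(top_p_keep_mask(probs, 1e-20))
```

With the guard disabled (min_tokens_to_keep=0) the first call marks every token as removed, while the second call keeps the single most probable token. A sampler built on this mask would then always have at least one candidate.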

Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v0.9.0...v0.9.1

Fetched April 7, 2026