v0.6.0

$ npx -y @buildinternet/releases show rel_lIkMl27xcB_rAUtUhSNoN

Features

  • server: flash attention past key values optimization (contributed by @njhill)
  • router: remove requests when client closes the connection (co-authored by @njhill)
  • server: support quantization for flash models
  • router: add info route
  • server: optimize token decode
  • server: support flash sharded santacoder
  • security: image signing with cosign
  • security: image analysis with trivy
  • docker: improve image size

Fix

  • server: check cuda capability before importing flash attention
  • server: fix hf_transfer issue with private repositories
  • router: add auth token for private tokenizers
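The first fix above (checking CUDA capability before importing flash attention) follows a common guard pattern, which might be sketched as below. This is an illustration under stated assumptions, not the project's actual code: the helper names and the `(7, 5)` minimum capability are hypothetical, and only `torch.cuda.is_available()` and `torch.cuda.get_device_capability()` are real PyTorch calls.

```python
from typing import Optional, Tuple

# Assumed minimum compute capability (major, minor). Illustrative only;
# the real threshold depends on the flash attention build in use.
MIN_CAPABILITY = (7, 5)


def flash_attention_supported(
    capability: Optional[Tuple[int, int]],
    minimum: Tuple[int, int] = MIN_CAPABILITY,
) -> bool:
    """Return True if the given (major, minor) capability meets the minimum."""
    if capability is None:
        return False
    # Tuples compare element by element, so (8, 0) >= (7, 5) is True.
    return tuple(capability) >= tuple(minimum)


def detect_capability() -> Optional[Tuple[int, int]]:
    """Return the current GPU's compute capability, or None without CUDA."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


def load_flash_attention():
    """Import flash attention only when the hardware check passes."""
    if not flash_attention_supported(detect_capability()):
        return None  # caller falls back to a non-flash model implementation
    try:
        import flash_attn
    except ImportError:
        return None
    return flash_attn
```

Running the check before the import lets a server on unsupported hardware fall back to a standard implementation instead of failing at import time.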

Misc

  • rust: update to 1.69

Fetched April 7, 2026