v0.4.0
Features
- router: support best_of sampling
- router: support left truncation
- server: support typical sampling
- launcher: allow local models
- clients: add text-generation Python client
- launcher: allow parsing num_shard from CUDA_VISIBLE_DEVICES
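
As a rough illustration of two of the items above, the sketch below shows one way a shard count could be derived from CUDA_VISIBLE_DEVICES, and what left truncation of a token sequence means. It is not the project's actual code: the helper names `num_shard_from_env` and `truncate_left` are hypothetical, and it assumes the variable holds a comma-separated device list and that left truncation keeps the most recent tokens.

```python
import os


def num_shard_from_env(env=None):
    """Hypothetical helper: infer a shard count from CUDA_VISIBLE_DEVICES.

    Assumes a comma-separated device list such as "0,1,3".
    Returns None when the variable is unset or empty, so the caller
    can fall back to an explicit setting.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None or not value.strip():
        return None
    # Ignore empty entries such as a trailing comma.
    devices = [d for d in value.split(",") if d.strip()]
    return len(devices)


def truncate_left(token_ids, max_len):
    """Hypothetical helper: drop tokens from the left, keeping the last max_len."""
    if max_len <= 0:
        return []
    return token_ids[-max_len:]
```

For example, `num_shard_from_env({"CUDA_VISIBLE_DEVICES": "0,1,3"})` counts three devices, and `truncate_left([1, 2, 3, 4, 5], 3)` keeps the final three tokens.
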
Fix
- server: do not warp prefill logits
- server: fix formatting issues in generate_stream tokens
- server: fix galactica batch
- server: fix index out of range issue with watermarking
Fetched April 7, 2026