v3.3.7
What's Changed
- misc(gha): expose action cache url and runtime as secrets by @mfuntowicz in https://github.com/huggingface/text-generation-inference/pull/2964
- feat: support max_image_fetch_size to limit by @drbh in https://github.com/huggingface/text-generation-inference/pull/3339
- Maintenance mode by @LysandreJik in https://github.com/huggingface/text-generation-inference/pull/3344
- Maintenance mode by @LysandreJik in https://github.com/huggingface/text-generation-inference/pull/3345
- fix(num_devices): fix num_shard/num device auto compute when NVIDIA_VISIBLE_DEVICES == "all" or "void" by @oOraph in https://github.com/huggingface/text-generation-inference/pull/3346
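
The last entry concerns deriving a device count from `NVIDIA_VISIBLE_DEVICES`. The sketch below is illustrative only and is not the code from the pull request: `count_devices` and its `detect` callback are hypothetical names, and it assumes that `"all"` and `"void"` name no explicit device list, so the count has to come from some other detection step. A naive comma split would treat either value as a single device.

```python
from typing import Callable, Optional

# Assumption: these values do not enumerate devices, so they cannot be counted
# by splitting on commas.
NON_LIST_VALUES = {"all", "void"}


def count_devices(env_value: Optional[str], detect: Callable[[], int]) -> int:
    """Hypothetical helper: derive a device count from NVIDIA_VISIBLE_DEVICES.

    `detect` is a caller-supplied fallback (for example, a driver query) used
    when the variable does not enumerate devices explicitly.
    """
    if env_value is None:
        return detect()
    value = env_value.strip()
    if value == "" or value in NON_LIST_VALUES:
        return detect()
    # Explicit list such as "0,1,2" or a list of GPU UUIDs: count the entries.
    return len([entry for entry in value.split(",") if entry.strip()])
```

For example, `count_devices("0,1", detect)` counts the two listed entries, while `count_devices("all", detect)` defers to whatever `detect` reports.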
Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v3.3.6...v3.3.7

