BREAKING CHANGE: Lots of modifications around tool calling. Tool calling now respects fully OpenAI return results (arguments return type is a string instead of a real JSON object). Lots of improvements around the tool calling and side effects fixed.
Added Gemma 3 support.
tool_calls a vector. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/3075openai to impure shell for integration tests by @danieldk in https://github.com/huggingface/text-generation-inference/pull/3081--max-batch-total-tokens description by @alvarobartt in https://github.com/huggingface/text-generation-inference/pull/3083/v1/chat/completions endpoint by @aW3st in https://github.com/huggingface/text-generation-inference/pull/3000Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v3.1.1...v3.2.0
Fetched April 7, 2026