* max_input_length is bigger than max-batch-tokens by @kozistr in https://github.com/huggingface/text-embeddings-inference/pull/725
* modules.json for Dense modules in local models by @alvarobartt in https://github.com/huggingface/text-embeddings-inference/pull/738
* test_gemma3.rs for EmbeddingGemma by @alvarobartt in https://github.com/huggingface/text-embeddings-inference/pull/718
* HF_TOKEN in ApiBuilder for candle/tests by @alvarobartt in https://github.com/huggingface/text-embeddings-inference/pull/724
* cargo install commands for candle with CUDA by @alvarobartt in https://github.com/huggingface/text-embeddings-inference/pull/719
* version to 1.8.3 by @alvarobartt in https://github.com/huggingface/text-embeddings-inference/pull/745

**Full Changelog**: https://github.com/huggingface/text-embeddings-inference/compare/v1.8.2...v1.8.3