What's Changed
- Use rust-toolchain.toml before rustup on Dockerfile-{cuda,cuda-all} by @alvarobartt in https://github.com/huggingface/text-...
Since Text Embeddings Inference (TEI) v1.7.0, Intel MKL support had been broken due to changes in the candle depend...
Qwen3 did not work correctly on CPU / MPS when sending batched requests in FP16 precision, due to the FP32 minimum value downca...
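The note above is cut off, so the exact cause inside TEI is not shown here. As a general illustration of why an FP32 minimum value can be a problem in FP16, the sketch below uses only the Python standard library. The helpers `to_fp16` and `softmax` are hypothetical names written for this example, and the assumption that the value ends up in a fully masked attention row (for instance a padded row in a batch) is ours, not something the release note states.

```python
import math
import struct

# Most negative finite float32 value, often used as an additive attention mask.
FP32_MIN = -3.4028234663852886e38


def to_fp16(x: float) -> float:
    """Round x to half precision, overflowing to +/-inf as an IEEE 754 cast would."""
    try:
        # The "e" format is IEEE 754 binary16; pack then unpack rounds through it.
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct rejects out-of-range values; a tensor-library cast would
        # instead produce infinity, which is what this sketch emulates.
        return math.copysign(math.inf, x)


def softmax(row):
    """Numerically stabilized softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


# In FP32 a fully masked row stays finite: every entry equals the row maximum.
print(softmax([FP32_MIN, FP32_MIN]))          # [0.5, 0.5]

# After the downcast the mask value is -inf, and (-inf) - (-inf) is NaN.
masked_fp16 = to_fp16(FP32_MIN)
print(masked_fp16)                            # -inf
print(softmax([masked_fp16, masked_fp16]))    # [nan, nan]
```

A common way to avoid this class of issue is to build the mask from the minimum of the target dtype rather than from the FP32 minimum. Whether TEI's fix takes that route is not stated in the text above.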
Qwen3 support was added for Intel HPU, and fixed for CPU / Metal / CUDA.
...
model.onnx_data by @kozistr in https://github.com/huggingface/text-embeddings-inference/pull/343/similarity route*...
What's Changed
- Fix auto-truncate false setting by @vrdn-23 in https://github.com/huggingface/text-embeddings-inference/pull/836
- Set as null…
Hugging Face · Inference
A small patch release containing these fixes:
- 3161
- 3165
Full Changelog: https://github.com/huggingface/peft/compare/v0.19.0...v0.19.1
Hugging Face · Fine-tuning