Hugging Face/Text Embeddings Inference

Text Embeddings Inference

Sun

Mon

Tue

Wed

Thu

Fri

Sat

JunJulAugSepOctNovDecJanFebMarAprMayJun

Less

Releases3Avg0/wkVersionsv1.9.0v1.9.2

Highlights All Releases

Recent Highlights4 releases · last 90 days

Text Embeddings Inference shifted toward better container compatibility and model coverage. v1.9.0 made breaking changes to unify CPU and CUDA GeLU behavior by defaulting to the tanh approximation instead of exact GeLU, and flipped --auto-truncate to true by default. Subsequent releases fixed CUDA version mismatch errors in container deployments and added rope parameter support alongside backing for Perplexity's embedding models, while refinements addressed edge cases like nullable pad tokens and truncation setting persistence.

Generated Apr 7, 2026, 5:28 PM UTC

Mar 2026

Mar 20261 releases

Fixed Docker build toolchain resolution and improved error handling in device detection. The release standardized Rust toolchain specification in CUDA Dockerfiles and replaced bare exception catching with explicit Exception handling.

Last Checked

2h ago

Latest

v1.9.3

Source

@huggingface/text-embeddings-inference4.9k

Tracking since Oct 13, 2023

.json·.md·.atom