---
name: Inference
slug: inference
organization_slug: hugging-face
category: ai
source_count: 2
canonical: https://releases.sh/hugging-face/inference
---

# Inference

Optimized inference servers for text generation and text embeddings.

## Sources (2)

- [Text Embeddings Inference](https://releases.sh/hugging-face/text-embeddings-inference) — `github`
- [Text Generation Inference](https://releases.sh/hugging-face/text-generation-inference) — `github`

## Recent Releases

_The summaries below are truncated. Fetch a release's `canonical` URL for the full content, or its `url` for the original source._
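To follow either link programmatically, the attributes on each `<Release>` tag can be read with a small parser. The sketch below is illustrative only: the helper names `parse_releases` and `pick_url` are hypothetical, and it assumes every attribute is written as `key="value"` with no `>` inside a value, as in the entries on this page.

```python
import re

# Matches an opening <Release ...> tag and captures its attribute string.
# Assumption: attribute values never contain a ">" character.
RELEASE_TAG = re.compile(r"<Release\s+([^>]*)>")
# Matches key="value" pairs inside the captured attribute string.
ATTRIBUTE = re.compile(r'([\w-]+)="([^"]*)"')


def parse_releases(text):
    """Return one dict of attributes per <Release> opening tag found in text."""
    return [dict(ATTRIBUTE.findall(m.group(1))) for m in RELEASE_TAG.finditer(text)]


def pick_url(release, original=False):
    """Return the `url` (original source) if requested, else the `canonical` URL."""
    return release["url"] if original else release["canonical"]
```

Passing the page text to `parse_releases` yields one dictionary per release, and `pick_url` selects which of the two links to request with whatever HTTP client is at hand.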

<Release source="text-embeddings-inference" version="v1.9.3" date="March 23, 2026" published="2026-03-23T11:57:19.000Z" url="https://github.com/huggingface/text-embeddings-inference/releases/tag/v1.9.3" canonical="https://releases.sh/release/rel_ZjYO-ZfPKrOPwOZSmU-m7" truncated="true">
## What's Changed
* Use `rust-toolchain.toml` before `rustup` on `Dockerfile-{cuda,cuda-all}` by @alvarobartt in https://github.com/huggingface/text-...
</Release>

<Release source="text-embeddings-inference" version="v1.9.2" date="February 25, 2026" published="2026-02-25T11:17:59.000Z" url="https://github.com/huggingface/text-embeddings-inference/releases/tag/v1.9.2" canonical="https://releases.sh/release/rel_u5i8Y5-E1Ty1fTgVeHnUB" truncated="true">
## What's Changed

* Fix auto-truncate false setting by @vrdn-23 in https://github.com/huggingface/text-embeddings-inference/pull/836
* Set `pad_to...
</Release>

<Release source="text-embeddings-inference" version="v1.9.1" date="February 17, 2026" published="2026-02-17T20:59:31.000Z" url="https://github.com/huggingface/text-embeddings-inference/releases/tag/v1.9.1" canonical="https://releases.sh/release/rel_jBoNJNG0GwO7Qj50CRnpr" truncated="true">
## What's Changed

### 🚨 Fix

* Fix support for containers w/ CUDA 13.0+ by @alvarobartt in https://github.com/huggingface/text-embeddings-infere...
</Release>

<Release source="text-embeddings-inference" version="v1.9.0" date="February 17, 2026" published="2026-02-17T13:42:14.000Z" url="https://github.com/huggingface/text-embeddings-inference/releases/tag/v1.9.0" canonical="https://releases.sh/release/rel_Y8D6Bpmw3UbyRwtU5a0H4" truncated="true">
[Image: text-embeddings-inference-v1 9 0] ...
</Release>

<Release source="text-generation-inference" version="v3.3.7" date="December 19, 2025" published="2025-12-19T14:35:25.000Z" url="https://github.com/huggingface/text-generation-inference/releases/tag/v3.3.7" canonical="https://releases.sh/release/rel_gWHs7FC2nAiWHjn56O08J" truncated="true">
## What's Changed
* misc(gha): expose action cache url and runtime as secrets by @mfuntowicz in https://github.com/huggingface/text-generation-infere...
</Release>

<Release source="text-embeddings-inference" version="v1.8.3" date="October 30, 2025" published="2025-10-30T09:08:18.000Z" url="https://github.com/huggingface/text-embeddings-inference/releases/tag/v1.8.3" canonical="https://releases.sh/release/rel_Nw3le7FfL8wAm4wA6OGOY" truncated="true">
## What's Changed

### Bug Fixes

* Fix error code for empty requests by @vrdn-23 in https://github.com/huggingface/text-embeddings-inference/pull...
</Release>

<Release source="text-generation-inference" version="v3.3.6" date="September 17, 2025" published="2025-09-17T00:48:54.000Z" url="https://github.com/huggingface/text-generation-inference/releases/tag/v3.3.6" canonical="https://releases.sh/release/rel_5Fl1DiTs67D6onh213me0" truncated="true">
## What's Changed
* Add missing backslash by @philsupertramp in https://github.com/huggingface/text-generation-inference/pull/3311
* Revert "feat: b...
</Release>

<Release source="text-embeddings-inference" version="v1.8.2" date="September 9, 2025" published="2025-09-09T14:45:29.000Z" url="https://github.com/huggingface/text-embeddings-inference/releases/tag/v1.8.2" canonical="https://releases.sh/release/rel_DtdRMbQKjMithLTXCaRWm" truncated="true">
## 🔧 Fixed Intel MKL Support

Since Text Embeddings Inference (TEI) v1.7.0, Intel MKL support had been broken due to changes in the `candle` depend...
</Release>

<Release source="text-embeddings-inference" version="v1.8.1" date="September 4, 2025" published="2025-09-04T15:22:14.000Z" url="https://github.com/huggingface/text-embeddings-inference/releases/tag/v1.8.1" canonical="https://releases.sh/release/rel_BwxGl_zBfEgFELWANCF50" truncated="true">
[Image: text-embeddings-inference-v1 8 1-embedding-gemma(1)] ...
</Release>

<Release source="text-generation-inference" version="v3.3.5" date="September 2, 2025" published="2025-09-02T15:02:33.000Z" url="https://github.com/huggingface/text-generation-inference/releases/tag/v3.3.5" canonical="https://releases.sh/release/rel_hjBniksH_PR3kY-c8fL3_" truncated="true">
## What's Changed
* [gaudi] Refine rope memory, do not need to keep sin/cos cache per layer by @sywangyi in https://github.com/huggingface/text-gener...
</Release>

**Tags:** `inference`, `rust`, `serving`
