v1.4.0
Notable Changes
- CUDA support for the Qwen2 model architecture
What's Changed
- feat(candle): support Qwen2 on Cuda by @OlivierDehaene in https://github.com/huggingface/text-embeddings-inference/pull/316
- fix(candle): fix last token pooling
Full Changelog: https://github.com/huggingface/text-embeddings-inference/compare/v1.3.0...v1.4.0
