Fixed Mistral tokenizer resolution when mistral-common is installed and updated the lower bound for PEFT. This release is similar to v5.10.3, minus the fixes already in the main release.
Transformers
This patch release fixes several regressions introduced by previous changes, including issues with {image/video/audio}_token_ids in ProcessorMixin, InternVL models, and offsets in processing. It also addresses a regression in the mistral-common backend and updates the PEFT lower bound.
This release introduces the MiniMax-M3-VL vision-language model, the PP-OCRv6 OCR system, and the Parakeet-RNNT model for speech processing. Several bug fixes and improvements were also made, including changes to CI, stop string matching, and model documentation.
New models DiffusionGemma and DeepSeek-V3.2 have been added, featuring optimizations for inference speed and efficient long-context handling. The Kernels API was extended for module fusion and parameter transformation, with added support for fp8/fp4 Triton kernels. Model parallel beam search bugs in Qwen2-VL model families were fixed.
Fixed a conversion bug for CLIP models that affected downstream models like SAM3.
Added Gemma4 12B Unified, an encoder-free multimodal model that projects raw vision and audio inputs directly into language model space; Sapiens2, a vision transformer family for human-centric tasks; DeepSeek-OCR-2 for document understanding; and Mellum, a code-focused mixture-of-experts model. Fixed numerous model parallelism bugs across tensor and expert parallelism, beam search under parallel settings, and loss over-counting. Also fixed an encoder-decoder cache initialization regression and a BitsAndBytes quantization bug that dropped tensors.
Added support for Cohere2Moe (a Mixture-of-Experts model with sliding window and full attention), HRM-Text (a hierarchical reasoning model with two transformer stacks), and the Parakeet TDT speech model. SAM3, EdgeTAM, and SAM3-Lite-Text now expect full text embeddings instead of pooler outputs, so inputs passed to these models must be updated. Fixed generation issues including inputs_embeds handling for Gemma4 and an AttributeError in RAG's generate() caused by missing config fields. Also fixed memory leaks from LRU cache decorators in vision models and improved audio/vision encoder compilability.
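The memory leak mentioned above can arise in a well-known way: when `functools.lru_cache` decorates a method, the cache stores `self` as part of the key and keeps the instance alive. The sketch below illustrates that general mechanism only. It assumes this is the kind of leak the release refers to, and the class and function names (`LeakyEncoder`, `SafeEncoder`, `_grid`) are hypothetical, not taken from Transformers.

```python
import functools
import gc
import weakref


class LeakyEncoder:
    # Hypothetical class: the cache key includes `self`, so the cache
    # holds a strong reference to every instance that called `grid`.
    @functools.lru_cache(maxsize=None)
    def grid(self, size):
        return tuple(range(size))


@functools.lru_cache(maxsize=None)
def _grid(size):
    # Module-level cached helper: the key is only `size`, no instance.
    return tuple(range(size))


class SafeEncoder:
    # Hypothetical class: delegates to the instance-free cached helper.
    def grid(self, size):
        return _grid(size)


def survives_deletion(cls):
    """Return True if an instance is still alive after its last name is deleted."""
    obj = cls()
    obj.grid(4)
    ref = weakref.ref(obj)
    del obj
    gc.collect()
    return ref() is not None


leaky_alive = survives_deletion(LeakyEncoder)
safe_alive = survives_deletion(SafeEncoder)
```

Moving the cached computation to a function whose arguments do not include the instance is one possible remedy; whether the actual fix took this form is not stated in the release note.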
Fixed DeepSeek-V4 integration issues, including CSA mask collapse and a WeightConverter regex that incorrectly matched shared_experts as experts. Also added fatal_error to ContinuousBatchingManager for serving operations.
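The regex issue described above is a common substring pitfall: `experts` is itself a suffix of `shared_experts`, so an unanchored pattern matches both. The following is a minimal sketch of that pitfall using Python's `re` module. The parameter names and both patterns are hypothetical illustrations, not the actual WeightConverter patterns.

```python
import re

# Hypothetical checkpoint keys, for illustration only.
keys = [
    "model.layers.0.mlp.experts.3.up_proj.weight",
    "model.layers.0.mlp.shared_experts.up_proj.weight",
]

# Loose pattern: "experts." also occurs inside "shared_experts.".
loose = re.compile(r"experts\.")

# Stricter pattern: "experts" must start a dotted path component,
# so it has to be preceded by a dot or the start of the string.
strict = re.compile(r"(?:^|\.)experts\.")

loose_hits = [k for k in keys if loose.search(k)]
strict_hits = [k for k in keys if strict.search(k)]

print(loose_hits)
print(strict_hits)
```

Anchoring on the component boundary is one way to keep the two parameter groups apart; a negative lookbehind such as `(?<!shared_)experts\.` would be another.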
Release v5.8.0
New Model additions
DeepSeek-V4
Release v5.7.0
New Model additions
Laguna
Patch release v5.6.2
Qwen 3.5 and 3.6 MoE (text-only) were broken when used with FP8. This patch should make them work again 🫡
- F...
Patch release v5.6.1
Flash attention path was broken! Sorry everyone for this one 🤗
- Fix AttributeError on s_aux=None in flash_attention_fo...
Release v5.6.0
New Model additions
OpenAI Privacy Filter
OpenAI Privacy Filter is a bidirectional token-classification model for p...
Patch release v5.5.4
This release mostly contains fixes that are good to have ASAP, mainly for tokenizers: ** Fix Kimi-K2.5 tokenizer regression and _pat...
Small patch release to fix device_map support for Gemma4! It contains the following commit:
- [gemma4] Fix device map auto (#45347) by @Cyrilvall...
Small patch dedicated to optimizing gemma4, fixing inference with use_cache=False due to k/v states sharing between layers, as well as conversion ma...
Patch release v5.5.1
This patch is very small and focuses on vLLM and Gemma4!
** Fix export for gemma4 and add Integration tests (#45285) by ...
Release v5.5.0
New Model additions
VidEoMT
v5.3.0: EuroBERT, VibeVoice ASR, TimesFM2.5, PP-DocLayoutV2, OlmoHybrid, ModernVBert, Higgs Audio V2
New Model additions
EuroBERT

