Hugging Face/Transformers

Transformers

Releases: 17 | Avg: 6/mo | Versions: v5.4.0 to v5.12.0
v5.10.3

This patch release fixes several regressions introduced by previous changes, including issues with {image/video/audio}_token_ids in ProcessorMixin, InternVL models, and offsets in processing. It also addresses a regression in the Mistral common backend and updates the peft lower bound.

v5.12.0

This release introduces the MiniMax-M3-VL vision-language model, the PP-OCRv6 OCR system, and the Parakeet-RNNT model for speech processing. Several bug fixes and improvements were also made, including changes to CI, stop string matching, and model documentation.

v5.11.0

New models DiffusionGemma and DeepSeek-V3.2 have been added, featuring optimizations for inference speed and efficient long-context handling. The Kernels API was extended for module fusion and parameter transformation, with added support for fp8/fp4 Triton kernels. Model parallel beam search bugs in Qwen2-VL model families were fixed.

v5.10.1

Added Gemma4 12B Unified, an encoder-free multimodal model that projects raw vision and audio inputs directly into language model space; Sapiens2, a vision transformer family for human-centric tasks; DeepSeek-OCR-2 for document understanding; and Mellum, a code-focused mixture-of-experts model. Fixed numerous model parallelism bugs across tensor and expert parallelism, beam search under parallel settings, and loss over-counting. Also fixed an encoder-decoder cache initialization regression and a BitsAndBytes quantization bug that dropped tensors.

v5.9.0

Added support for Cohere2Moe (a Mixture-of-Experts model with sliding window and full attention), HRM-Text (a hierarchical reasoning model with two transformer stacks), and the Parakeet TDT speech model. SAM3, EdgeTAM, and SAM3-Lite-Text now expect full text embeddings instead of pooler outputs, so callers must update their inputs. Fixed generation issues, including inputs_embeds handling for Gemma4 and an AttributeError in RAG's generate() caused by missing config fields. Also fixed memory leaks from lru decorators in vision models and improved audio/vision encoder compilability.

Read more →
v5.8.1

Fixed Deepseek V4 integration issues including CSA mask collapse and WeightConverter regex incorrectly matching shared_experts as experts. Also added fatal_error to ContinuousBatchingManager for serving operations.
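
The kind of regex over-matching described above can be illustrated with a minimal sketch. The key names and both patterns below are hypothetical, chosen only for illustration; they are not the actual WeightConverter patterns or Deepseek V4 checkpoint keys.

```python
import re

# Hypothetical checkpoint keys (assumption, not real Deepseek V4 key names).
keys = [
    "model.layers.0.mlp.experts.3.up_proj.weight",
    "model.layers.0.mlp.shared_experts.up_proj.weight",
]

# Loose pattern: "experts." also occurs inside "shared_experts.".
loose = re.compile(r"experts\.")

# Stricter pattern: "experts" must not be preceded by a letter or underscore,
# so it only matches when it starts a dotted path segment.
strict = re.compile(r"(?<![A-Za-z_])experts\.")

loose_hits = [k for k in keys if loose.search(k)]
strict_hits = [k for k in keys if strict.search(k)]

print(loose_hits)   # both keys match the loose pattern
print(strict_hits)  # only the routed-experts key matches
```

Anchoring the pattern to a path-segment boundary is one way to avoid this class of bug; the actual fix in the release may differ.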

v5.6.2

Patch release v5.6.2

Qwen 3.5 and 3.6 MoE (text-only) were broken when used with FP8. They should now work again with this release 🫡

  • F...
v5.6.1

Patch release v5.6.1

The flash attention path was broken! Sorry everyone for this one 🤗

  • Fix AttributeError on s_aux=None in flash_attention_fo...
v5.6.0

Release v5.6.0

New Model additions

OpenAI Privacy Filter

OpenAI Privacy Filter is a bidirectional token-classification model for p...

v5.5.4

Patch release v5.5.4

These are mostly fixes that are good to have ASAP, mainly for tokenizers:

  • Fix Kimi-K2.5 tokenizer regression and _pat...

v5.5.2

Small patch dedicated to optimizing gemma4, fixing inference with use_cache=False due to k/v states sharing between layers, as well as conversion ma...

v5.5.1

Patch release v5.5.1

This patch is very small and focuses on vLLM and Gemma4!

  • Fix export for gemma4 and add Integration tests (#45285) by ...

Tracking since Apr 23, 2024