Patch release v5.5.4
This release contains a few fixes worth shipping quickly, mostly for tokenizers:
- Fix Kimi-K2.5 tokenizer regression and _patch_mistral_regex Attribute… (#45305) by ArthurZucker

For training:
- Fix #45305 + add regression test GAS (#45349) by florian6973, SunMarc
- Fix IndexError with DeepSpeed ZeRO-3 when kernels rotary is active (#…) by ArthurZucker

And for Qwen2.5-VL:
- Fix Qwen2.5-VL temporal RoPE scaling applied to still images (#45330) by Kash6, zucchini-nlp
Fetched April 13, 2026