v1.11.0: TE MXFP8, FP16/BF16 with MPS, Python 3.10
We've added support for MXFP8 in our TransformerEngine integration. To use it, set `use_mxfp8_block_scaling` in `fp8_config`. See the NVIDIA docs [here](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html#MXFP8-and-block-scaling).
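A minimal sketch of what this could look like in an `accelerate` config file (the exact key names under `fp8_config` may differ from your version; check the accelerate FP8 docs):

```yaml
# Hedged sketch of an accelerate config enabling MXFP8 block scaling
# through the TransformerEngine backend.
mixed_precision: fp8
fp8_config:
  backend: TE
  use_mxfp8_block_scaling: true
```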
BF16 and FP16 support for MPS devices is finally here. You can now pass `mixed_precision="fp16"` or `"bf16"` when training on a Mac (FP16 requires torch 2.8+, BF16 requires torch 2.6+).
Support for `ignored_params` and `no_sync()` has been added for FSDPv2.
Mixed precision can now be passed as a dtype string via the `accelerate` CLI flag or via `fsdp_config` in the `accelerate` config file.
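A hedged sketch of the config-file form (key names inside `fsdp_config` vary between versions; consult the accelerate FSDP docs for your release). The same dtype string can be passed on the CLI, e.g. `accelerate launch --mixed_precision bf16 train.py`:

```yaml
# Sketch: dtype given as a plain string rather than a structured policy.
distributed_type: FSDP
mixed_precision: bf16
fsdp_config:
  fsdp_version: 2
```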
We've also made some minor updates to ND-parallelism.
We've dropped support for Python 3.9 as it reached EOL in October.
* … cpu and offloaded to meta by @Qubitium in https://github.com/huggingface/accelerate/pull/3796
* … `with` in `Accelerator.autocast()` instead of `__enter__()` and `__exit__()` for more elegant style by @EquationWalker in https://github.com/huggingface/accelerate/pull/3767
* … `SWANLAB_MODE` by @SunMarc in https://github.com/huggingface/accelerate/pull/3808

**Full Changelog**: https://github.com/huggingface/accelerate/compare/v1.10.1...v1.11.0