v1.6.0: FSDPv2, DeepSpeed TP and XCCL backend support
This release introduces support for FSDPv2, thanks to @S1ro1.
If you are configuring Accelerate from Python code, set `fsdp_version=2` in `FullyShardedDataParallelPlugin`:
```python
from accelerate import FullyShardedDataParallelPlugin, Accelerator

fsdp_plugin = FullyShardedDataParallelPlugin(
    fsdp_version=2,
    # other options...
)
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
```
If you want to convert a YAML config that contains an FSDPv1 config to an FSDPv2 one, use our conversion tool:

```shell
accelerate to-fsdp2 --config_file config.yaml --output_file new_config.yaml
```
To learn more about the difference between FSDPv1 and FSDPv2, read the following documentation.
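Conceptually, the conversion rewrites the `fsdp_config` section of the YAML config so that the FSDPv2 code path is selected. A minimal sketch of the idea, in plain Python on an already-parsed config dict (the key handling here is illustrative only, not accelerate's actual mapping):

```python
def convert_fsdp_config(config: dict) -> dict:
    """Return a copy of an accelerate config dict pinned to FSDP version 2.

    Illustrative sketch: the real `accelerate to-fsdp2` tool also renames and
    drops v1-only keys; here we only show the version bump.
    """
    new_config = dict(config)
    fsdp = dict(new_config.get("fsdp_config", {}))
    fsdp["fsdp_version"] = 2  # key checked by the FSDPv2 code path (assumption)
    new_config["fsdp_config"] = fsdp
    return new_config


# Example: a v1-style config gains the version marker.
v1 = {"fsdp_config": {"fsdp_sharding_strategy": "FULL_SHARD"}}
v2 = convert_fsdp_config(v1)
```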
We have added initial support for DeepSpeed + TP. Not many changes were required, as the DeepSpeed APIs were already compatible. We only needed to make sure that the dataloader was compatible with TP and that we were able to save the TP weights. Thanks @inkcherry for the work! https://github.com/huggingface/accelerate/pull/3390.
To use TP with DeepSpeed, add a `tensor_parallel` key to your DeepSpeed config file:
```json
...
"tensor_parallel": {
    "autotp_size": ${autotp_size}
},
...
```
More details in this deepspeed PR.
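If you build the DeepSpeed config in Python rather than from a JSON file, the same section is just a nested dict. A minimal sketch (all fields other than `tensor_parallel` are illustrative placeholders):

```python
def make_ds_config(autotp_size: int, micro_batch_size: int = 1) -> dict:
    """Build a minimal DeepSpeed config dict with the tensor_parallel section.

    Illustrative sketch: only the `tensor_parallel` key comes from the release
    notes above; the other fields are placeholder training settings.
    """
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "tensor_parallel": {
            "autotp_size": autotp_size,  # number of ranks per TP group
        },
    }


# Example: shard each layer across 4 ranks.
ds_config = make_ds_config(autotp_size=4)
```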
We've added support for XCCL, an Intel distributed backend that can be used with XPU devices. More details in this torch PR. Thanks @dvrogozh for the integration!
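In practice this means `"xccl"` becomes a valid backend string for `torch.distributed`. A small heuristic sketch of choosing it only when an Intel XPU is present (the fallback logic is an assumption, not part of the release):

```python
def pick_backend() -> str:
    """Return "xccl" when an Intel XPU is available, otherwise fall back to "gloo".

    Heuristic sketch: "xccl" is the backend name added for XPU devices; the
    gloo fallback is an illustrative choice for CPU-only environments.
    """
    try:
        import torch
        # torch.xpu exists only in recent torch builds, hence the hasattr guard.
        if hasattr(torch, "xpu") and torch.xpu.is_available():
            return "xccl"
    except ImportError:
        pass
    return "gloo"


backend = pick_backend()
# Would then be passed to torch.distributed.init_process_group(backend=backend).
```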
Added `log_artifact`, `log_artifacts` and `log_figure` capabilities to the `MLflowTracker` by @luiz0992 in https://github.com/huggingface/accelerate/pull/3419

Full Changelog: https://github.com/huggingface/accelerate/compare/v1.5.2...v1.6.0