v1.10.0: N-D Parallelism
Training large models across multiple GPUs can be complex, especially when combining different parallelism strategies (e.g., TP, CP, DP). To simplify this process, we've collaborated with Axolotl to introduce an easy-to-use integration that lets you apply any combination of parallelism strategies directly in your training script. Just pass a `ParallelismConfig` specifying the size of each parallelism dimension: it's that simple.

Learn more about how it works in our latest blog post.
```python
from accelerate import Accelerator, ParallelismConfig
from transformers import AutoModelForCausalLM

# Specify how many ranks each parallelism dimension uses.
parallelism_config = ParallelismConfig(
    dp_shard_size=2,      # sharded data parallelism (FSDP)
    dp_replicate_size=2,  # replicated data parallelism
    cp_size=2,            # context parallelism
    tp_size=2,            # tensor parallelism
)

accelerator = Accelerator(
    parallelism_config=parallelism_config,
    ...
)

# Build the model directly on the device mesh created by Accelerate.
model = AutoModelForCausalLM.from_pretrained(
    "your-model-name", device_mesh=accelerator.torch_device_mesh
)
model = accelerator.prepare(model)
```
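Note that the configured sizes multiply together to give the total number of processes the job must be launched with. A quick sanity check in plain Python (the dimension names below are just labels, not accelerate API):

```python
# The world size must equal the product of all parallelism dimension sizes.
# With every size set to 2, as in the config above, 16 processes are needed.
sizes = {"dp_shard": 2, "dp_replicate": 2, "cp": 2, "tp": 2}

world_size = 1
for size in sizes.values():
    world_size *= size

print(world_size)  # 16
```

So the example configuration above requires launching with 16 GPUs.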
ParallelismConfig from PartialState by @SunMarc in https://github.com/huggingface/accelerate/pull/3720

We've also fixed the ignored modules attribute. With this, it is now possible to train PEFT models whose MoE layers contain q_proj and v_proj parameters, which is especially important for fine-tuning the gpt-oss model.
Full Changelog: https://github.com/huggingface/accelerate/compare/v1.9.0...v1.10.0