N-D Parallelism
Training large models across multiple GPUs can be complex, especially when combining different parallelism strategies (e.g., TP, CP, DP). To simplify this process, we've collaborated with Axolotl to introduce an easy-to-use integration that lets you apply any combination of parallelism strategies directly in your training script. Just pass a ParallelismConfig specifying the size of each parallelism type. It's that simple.
Learn more about how it works in our latest blog post.
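from accelerate import Accelerator
from accelerate.parallelism_config import ParallelismConfig
from transformers import AutoModelForCausalLM

# Size of each parallelism dimension
parallelism_config = ParallelismConfig(
    dp_shard_size=2,
    dp_replicate_size=2,
    cp_size=2,
    tp_size=2,
)
accelerator = Accelerator(
    parallelism_config=parallelism_config,
    # ... other Accelerator arguments
)
# Load the model onto the device mesh built by the accelerator
model = AutoModelForCausalLM.from_pretrained("your-model-name", device_mesh=accelerator.torch_device_mesh)
model = accelerator.prepare(model)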
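A device mesh generally needs one process per mesh coordinate, so the sizes in the ParallelismConfig example above multiply out to the number of GPUs the job expects. The sketch below only does that arithmetic in plain Python; `required_world_size` is a hypothetical helper for illustration and is not part of Accelerate.

```python
from math import prod

# Sizes copied from the ParallelismConfig example above.
sizes = {
    "dp_replicate_size": 2,
    "dp_shard_size": 2,
    "cp_size": 2,
    "tp_size": 2,
}


def required_world_size(sizes):
    """Hypothetical helper: multiply all parallelism sizes together."""
    return prod(sizes.values())


# 2 * 2 * 2 * 2 processes are needed for this configuration.
print(required_world_size(sizes))
```

With these values the configuration would be expected to run on 16 processes; dropping any dimension to 1 halves that number.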
- Parallelism config + TP + HSDP + BYODM (Bring Your Own Device Mesh) by @SalmanMohammadi in https://github.com/huggingface/accelerate/pull/3682
- Feat: context parallel v2.0 by @S1ro1 in https://github.com/huggingface/accelerate/pull/3700
- set default submesh_tp_size to prevent unset local variable error by @winglian in https://github.com/huggingface/accelerate/pull/3687
- Add Parallelism getter property to Accelerator class by @WoosungMyung in https://github.com/huggingface/accelerate/pull/3703
- Fix: prepare works even if nothing except tp specified (rare) by @S1ro1 in https://github.com/huggingface/accelerate/pull/3707
- Set parallelism_config in constructor due to Trainer reset of State by @winglian in https://github.com/huggingface/accelerate/pull/3713
- Fix: tp size wouldn't read from env by @S1ro1 in https://github.com/huggingface/accelerate/pull/3716
- Remove ParallelismConfig from PartialState by @SunMarc in https://github.com/huggingface/accelerate/pull/3720
FSDP improvements
We've fixed the ignored modules attribute. With this, it is now possible to train a PEFT model with MoE layers that contain q_proj and v_proj parameters. This is especially important for fine-tuning the gpt-oss model.
- ENH: Allow FSDP ignored modules to be regex by @BenjaminBossan in https://github.com/huggingface/accelerate/pull/3698
- TST Add test for FSDP ignored_modules as str by @BenjaminBossan in https://github.com/huggingface/accelerate/pull/3719
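To give a feel for what a regex-based ignored-modules setting selects, here is a minimal sketch in plain Python. The module names and the `select_ignored` helper are hypothetical, and the use of `re.fullmatch` is an assumption for illustration; the exact matching rule Accelerate applies may differ.

```python
import re

# Hypothetical module names, loosely modelled on a transformer with MoE blocks.
module_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.mlp.experts",
    "model.layers.0.mlp.router",
    "model.layers.1.mlp.experts",
]


def select_ignored(names, pattern):
    """Hypothetical helper: return the names that fully match the pattern."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Pick every expert block, whatever layer it lives in.
ignored = select_ignored(module_names, r".*\.mlp\.experts")
print(ignored)
```

A single pattern covers the expert block of every layer, which is the convenience a regex offers over listing each module by hand.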
Minor improvements
- feature: CpuOffload pre_forward don't attempt to move if already on device by @JoeGaffney in https://github.com/huggingface/accelerate/pull/3695
- Fix: Ensure environment variable values are case-insensitive in Accelerate by @jp1924 in https://github.com/huggingface/accelerate/pull/3712
- remove use_ipex by @SunMarc in https://github.com/huggingface/accelerate/pull/3721
New Contributors
- @SalmanMohammadi made their first contribution in https://github.com/huggingface/accelerate/pull/3682
- @WoosungMyung made their first contribution in https://github.com/huggingface/accelerate/pull/3703
- @jp1924 made their first contribution in https://github.com/huggingface/accelerate/pull/3712
- @JoeGaffney made their first contribution in https://github.com/huggingface/accelerate/pull/3695
Full Changelog: https://github.com/huggingface/accelerate/compare/v1.9.0...v1.10.0
Fetched April 7, 2026
