v0.27.0: PyTorch 2.2.0 Support, PyTorch-Native Pipeline Parallelism, DeepSpeed XPU Support, and Bug Fixes
With the latest release of PyTorch 2.2.0, we've ensured that Accelerate remains fully compatible with it and that there are no breaking changes.
With this release we are excited to announce support for pipeline-parallel inference by integrating PyTorch's PiPPy framework (so there is no need to use Megatron or DeepSpeed)! It supports automatically splitting model weights across devices, with an API similar to `device_map="auto"`. This is still under heavy development; however, the inference side is stable enough that we are ready for a release. Read more about it in our docs and check out the example zoo.
Requires PiPPy version 0.2.0 or later (`pip install torchpippy -U`).
Example usage (combined with `accelerate launch` or `torchrun`):
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from accelerate import PartialState, prepare_pippy

model = AutoModelForSequenceClassification.from_pretrained("gpt2")

# Build an example input first; `prepare_pippy` traces the model with it
tokenizer = AutoTokenizer.from_pretrained("gpt2")
input = tokenizer("Hello, world!", return_tensors="pt").input_ids
input = input.to("cuda:0")

model = prepare_pippy(model, split_points="auto", example_args=(input,))

with torch.no_grad():
    output = model(input)

# The outputs are only on the final process by default.
# You can pass `gather_outputs=True` to `prepare_pippy` to
# make them available on all processes.
if PartialState().is_last_process:
    output = torch.stack(tuple(output[0]))
    print(output.shape)
```
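The final `torch.stack(tuple(output[0]))` call in the example suggests the pipeline returns its result as a sequence of per-microbatch tensors that must be merged back into one tensor. A minimal single-process sketch of just that merging step, using two hypothetical microbatch logit tensors of shape `(1, 2)` as stand-ins:

```python
import torch

# Two hypothetical per-microbatch logit tensors, standing in for `output[0]`
microbatch_logits = (torch.zeros(1, 2), torch.ones(1, 2))

# Stack them along a new leading dimension, as the example above does
merged = torch.stack(tuple(microbatch_logits))
print(merged.shape)  # torch.Size([2, 1, 2])
```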
This release also provides support for utilizing DeepSpeed on XPU devices, thanks to @faaany.
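Since DeepSpeed in Accelerate is typically set up through `accelerate config`, using it on an XPU machine should follow the usual configuration flow. A hedged sketch of the DeepSpeed-related portion of a generated `default_config.yaml` (the field values shown are illustrative, not prescriptive):

```yaml
# Sketch of an accelerate default_config.yaml fragment for DeepSpeed;
# values are illustrative only.
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 2
  gradient_accumulation_steps: 1
mixed_precision: bf16
num_processes: 2
```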
What's Changed:

* `dispatch_model`, and in forward with offloading by @fxmarty in https://github.com/huggingface/accelerate/pull/2330
* `accelerate config` by @faaany in https://github.com/huggingface/accelerate/pull/2346
* `block_size` picking in megatron_lm_gpt_pretraining example by @nilq in https://github.com/huggingface/accelerate/pull/2342
* `FP8RecipeKwargs` by @sudhakarsingh27 in https://github.com/huggingface/accelerate/pull/2355
* `add_hook_to_module` and `remove_hook_from_module` compatibility with `fx.GraphModule` by @fxmarty in https://github.com/huggingface/accelerate/pull/2369
* `requires_grad` to kwargs when registering empty parameters by @BlackSamorez in https://github.com/huggingface/accelerate/pull/2376
* `adapter_only` option to `save_fsdp_model` and `load_fsdp_model` to only save/load PEFT weights by @AjayP13 in https://github.com/huggingface/accelerate/pull/2321
* `split_batches` by @izhx in https://github.com/huggingface/accelerate/pull/2344
* `nproc_per_node` in the multi gpu test by @faaany in https://github.com/huggingface/accelerate/pull/2422
* `Accelerator` to prepare models in eval mode for XPU & CPU by @faaany in https://github.com/huggingface/accelerate/pull/2426

Full Changelog: https://github.com/huggingface/accelerate/compare/v0.26.1...v0.27.0