releases.sh preview

v0.8.0: Big model inference

$ npx -y @buildinternet/releases show rel_Az2NjvoIxO0ymOHuqpq37

Big model inference

To handle very large models, new functionality has been added in Accelerate:

  • a context manager to initialize empty models
  • a function to load a sharded checkpoint directly on the right devices
  • a set of custom hooks that allow execution of a model split on different devices, as well as CPU or disk offload
  • a magic method that auto-determines a device map for a given model, maximizing GPU space and available RAM before falling back to disk offload as a last resort
  • a function that wraps the three previous features in one simple call (load_checkpoint_and_dispatch)
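The auto device-map idea in the fourth bullet can be sketched as a greedy allocator: fill the GPU first, spill to CPU RAM, and fall back to disk. This is a toy illustration of the placement strategy, not Accelerate's actual implementation; the layer names and capacities below are made up.

```python
def infer_device_map(layer_sizes, gpu_capacity, cpu_capacity):
    """Greedily place layers on devices: GPU first, then CPU RAM, then disk.

    Toy sketch of an auto device map; sizes and capacities share one unit.
    """
    device_map = {}
    gpu_used = cpu_used = 0
    for name, size in layer_sizes.items():
        if gpu_used + size <= gpu_capacity:
            device_map[name] = "cuda:0"
            gpu_used += size
        elif cpu_used + size <= cpu_capacity:
            device_map[name] = "cpu"
            cpu_used += size
        else:
            device_map[name] = "disk"  # last resort: weights stay on disk
    return device_map


# Four equally sized layers; room for two on GPU and one in CPU RAM:
layers = {"embed": 4, "block.0": 4, "block.1": 4, "lm_head": 4}
print(infer_device_map(layers, gpu_capacity=8, cpu_capacity=4))
# → {'embed': 'cuda:0', 'block.0': 'cuda:0', 'block.1': 'cpu', 'lm_head': 'disk'}
```

In the real library, the resulting map is then consumed by the dispatch step, which installs the hooks that move data between devices at execution time.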

See more in the documentation

  • Big model inference by @sgugger in #345

What's new

  • Create peak_memory_uasge_tracker.py by @pacman100 in #336
  • Fixed a typo to enable running accelerate correctly by @Idodox in #339
  • Introduce multiprocess logger by @muellerzr in #337
  • Refactor utils into its own module by @muellerzr in #340
  • Improve num_processes question in CLI by @muellerzr in #343
  • Handle Manual Wrapping in FSDP. Minor fix of fsdp example. by @pacman100 in #342
  • Better prompt for number of training devices by @muellerzr in #344
  • Fix prompt for num_processes by @pacman100 in #347
  • Fix sample calculation in examples by @muellerzr in #352
  • Fixing metric eval in distributed setup by @pacman100 in #355
  • DeepSpeed and FSDP plugin support through script by @pacman100 in #356

Fetched April 7, 2026