Depth-Guided Stable Diffusion and 2.1 checkpoints
The new depth-guided stable diffusion model is fully supported in this release. The model is conditioned on monocular depth estimates inferred via MiDaS and can be used for structure-preserving img2img and shape-conditional synthesis.

Installing the transformers library from source is required for the MiDaS model:
pip install --upgrade git+https://github.com/huggingface/transformers/
import torch
import requests
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline
pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = Image.open(requests.get(url, stream=True).raw)
prompt = "two tigers"
n_prompt = "bad, deformed, ugly, bad anatomy"
image = pipe(prompt=prompt, image=init_image, negative_prompt=n_prompt, strength=0.7).images[0]
The updated Stable Diffusion 2.1 checkpoints are also released and fully supported:
:safety_vest: Safe Tensors
We now support SafeTensors: a new simple format for storing tensors safely (as opposed to pickle) that is still fast (zero-copy).
- [Proposal] Support loading from safetensors if file is present. by @Narsil in #1357
- [Proposal] Support saving to safetensors by @MatthieuBizien in #1494
| Format | Safe | Zero-copy | Lazy loading | No file size limit | Layout control | Flexibility | Bfloat16 |
|---|---|---|---|---|---|---|---|
| pickle (PyTorch) | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
| H5 (Tensorflow) | ✓ | ✗ | ✓ | ✓ | ~ | ~ | ✗ |
| SavedModel (Tensorflow) | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ |
| MsgPack (flax) | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ |
| SafeTensors | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ |
More details about the comparison here: https://github.com/huggingface/safetensors#yet-another-format-
pip install safetensors
from diffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.save_pretrained("./safe-stable-diffusion-2-1", safe_serialization=True)
# you can also push this checkpoint to the HF Hub and load from there
safe_pipe = StableDiffusionPipeline.from_pretrained("./safe-stable-diffusion-2-1")
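To make the "simple" and "safe" claims more concrete, here is a minimal pure-Python sketch of the file layout described in the safetensors README: an 8-byte little-endian header length, a JSON header mapping each tensor name to its dtype, shape, and byte offsets, then the raw tensor bytes. The helper names `write_toy_safetensors` and `read_tensor` are hypothetical, only float32 is handled, and this is an illustration rather than the library's implementation; real code should use the `safetensors` package.

```python
import json
import struct
import tempfile
from pathlib import Path


def write_toy_safetensors(path, tensors):
    """Write {name: (shape, flat_values)} as float32 in a safetensors-like layout."""
    header, payload = {}, b""
    for name, (shape, values) in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": list(shape),
            # Offsets are relative to the start of the byte buffer, not the file.
            "data_offsets": [len(payload), len(payload) + len(raw)],
        }
        payload += raw
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)                          # JSON metadata
        f.write(payload)                               # raw tensor bytes


def read_tensor(path, name):
    """Read a single tensor by name without touching the other tensors' bytes."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        begin, end = header[name]["data_offsets"]
        f.seek(8 + header_len + begin)
        raw = f.read(end - begin)
    count = (end - begin) // 4  # 4 bytes per float32 value
    return header[name]["shape"], list(struct.unpack(f"<{count}f", raw))


path = Path(tempfile.mkdtemp()) / "toy.safetensors"
write_toy_safetensors(
    path,
    {"a": ((2, 2), [1.0, 2.0, 3.0, 4.0]), "b": ((3,), [0.5, 0.25, 0.125])},
)
shape, values = read_tensor(path, "b")
```

Loading only parses JSON and copies bytes, so no arbitrary code runs (unlike unpickling), and the offsets let a reader fetch one tensor at a time.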
New Pipelines
:paintbrush: Paint-by-example
An implementation of Paint by Example: Exemplar-based Image Editing with Diffusion Models by Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen
- Add paint by example by @patrickvonplaten in #1533

import PIL
import requests
import torch
from io import BytesIO
from diffusers import DiffusionPipeline
def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")
img_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/image/example_1.png"
mask_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/mask/example_1.png"
example_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/reference/example_1.jpg"
init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))
example_image = download_image(example_url).resize((512, 512))
pipe = DiffusionPipeline.from_pretrained("Fantasy-Studio/Paint-by-Example", torch_dtype=torch.float16)
pipe = pipe.to("cuda")
image = pipe(image=init_image, mask_image=mask_image, example_image=example_image).images[0]
Audio Diffusion and Latent Audio Diffusion
Audio Diffusion leverages the recent advances in image generation using diffusion models by converting audio samples to and from mel spectrogram images.
- add AudioDiffusionPipeline and LatentAudioDiffusionPipeline #1334 by @teticio in #1426
from IPython.display import Audio, display
from diffusers import DiffusionPipeline
pipe = DiffusionPipeline.from_pretrained("teticio/audio-diffusion-ddim-256").to("cuda")
output = pipe()
display(output.images[0])
display(Audio(output.audios[0], rate=pipe.mel.get_sample_rate()))
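As a rough illustration of the "to and from images" step, the sketch below quantizes a log-scaled (decibel) spectrogram into 8-bit grayscale pixels and maps the pixels back. It skips the STFT and mel filterbank entirely. The 80 dB dynamic range and the function names `db_to_pixels` and `pixels_to_db` are assumptions made for this example, and the pipeline's own conversion may differ.

```python
TOP_DB = 80.0  # assumed dynamic range in decibels (hypothetical choice)


def db_to_pixels(log_spectrogram):
    """Map dB values in [-TOP_DB, 0] to 8-bit grayscale values, clipping outliers."""
    return [
        [min(255, max(0, round((db + TOP_DB) * 255.0 / TOP_DB))) for db in row]
        for row in log_spectrogram
    ]


def pixels_to_db(pixels):
    """Invert the mapping: pixel 0 becomes -TOP_DB and pixel 255 becomes 0 dB."""
    return [[p * TOP_DB / 255.0 - TOP_DB for p in row] for row in pixels]


# Toy 2x3 "spectrogram" in dB; the last value lies below the assumed range.
spectrogram_db = [[-80.0, -20.0, 0.0], [-12.5, -60.0, -100.0]]
pixels = db_to_pixels(spectrogram_db)
recovered = pixels_to_db(pixels)
```

The round trip is lossy: values are recovered only to within one quantization step (`TOP_DB / 255` dB), and anything quieter than `-TOP_DB` is clipped.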
[Experimental] K-Diffusion pipeline for Stable Diffusion
This pipeline is added to support the latest schedulers from @crowsonkb's k-diffusion.
The purpose of this pipeline is to compare scheduler implementations and updates, so new features from other pipelines are unlikely to be supported!
- [K Diffusion] Add k diffusion sampler natively by @patrickvonplaten in #1603
pip install k-diffusion
from diffusers import StableDiffusionKDiffusionPipeline
import torch
pipe = StableDiffusionKDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base")
pipe = pipe.to("cuda")
pipe.set_scheduler("sample_heun")
image = pipe("astronaut riding horse", num_inference_steps=25).images[0]
New Schedulers
Heun scheduler inspired by Karras et al.
Algorithm 1 of Karras et al. Scheduler ported from @crowsonkb's k-diffusion.
- Add 2nd order heun scheduler by @patrickvonplaten in #1336
from diffusers import HeunDiscreteScheduler, StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = HeunDiscreteScheduler.from_config(pipe.scheduler.config)
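For intuition about what "2nd order" buys, here is a generic sketch of Heun's method next to plain Euler on a toy ODE. It is not the scheduler's code: `heun_solve`, `euler_solve`, and the decay ODE are hypothetical stand-ins, whereas the real scheduler steps over noise levels and calls the diffusion model where this sketch calls `f`.

```python
import math


def euler_solve(f, x0, t0, t1, n_steps):
    """First-order Euler: one slope evaluation per step."""
    h = (t1 - t0) / n_steps
    x, t = x0, t0
    for _ in range(n_steps):
        x = x + h * f(x, t)
        t += h
    return x


def heun_solve(f, x0, t0, t1, n_steps):
    """Second-order Heun: Euler predictor, then average the two slopes."""
    h = (t1 - t0) / n_steps
    x, t = x0, t0
    for _ in range(n_steps):
        d1 = f(x, t)                 # slope at the current point
        x_pred = x + h * d1          # Euler prediction
        d2 = f(x_pred, t + h)        # slope at the predicted point
        x = x + h * 0.5 * (d1 + d2)  # corrected step
        t += h
    return x


def decay(x, t):
    """Toy ODE dx/dt = -x, whose exact solution is x(t) = exp(-t)."""
    return -x


exact = math.exp(-1.0)
euler_error = abs(euler_solve(decay, 1.0, 0.0, 1.0, 20) - exact)
heun_error = abs(heun_solve(decay, 1.0, 0.0, 1.0, 20) - exact)
```

With the same 20 steps, the Heun result is much closer to the exact value than the Euler one, at the cost of two slope evaluations per step instead of one.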
Single step DPM-Solver
This scheduler follows the original DPM-Solver paper, its improved follow-up version, and the authors' original implementation.
- Add Singlestep DPM-Solver (singlestep high-order schedulers) by @LuChengTHU in #1442
from diffusers import DPMSolverSinglestepScheduler, StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = DPMSolverSinglestepScheduler.from_config(pipe.scheduler.config)
:memo: Changelog
- [Proposal] Support loading from safetensors if file is present. by @Narsil in #1357
- Hotfix for AttributeErrors in OnnxStableDiffusionInpaintPipelineLegacy by @anton-l in #1448
- Speed up test and remove kwargs from call by @patrickvonplaten in #1446
- v-prediction training support by @patil-suraj in #1455
- Fix Flax `from_pt` by @pcuenca in #1436
- Ensure Flax pipeline always returns numpy array by @pcuenca in #1435
- Add 2nd order heun scheduler by @patrickvonplaten in #1336
- fix slow tests by @patrickvonplaten in #1467
- Flax support for Stable Diffusion 2 by @pcuenca in #1423
- Updates Image to Image Inpainting community pipeline README by @vvvm23 in #1370
- StableDiffusion: Decode latents separately to run larger batches by @kig in #1150
- Fix bug in half precision for DPMSolverMultistepScheduler by @rtaori in #1349
- [Train unconditional] Unwrap model before EMA by @anton-l in #1469
- Add `ort_nightly_directml` to the onnxruntime candidates by @anton-l in #1458
- Allow saving trained betas by @patrickvonplaten in #1468
- Fix dtype model loading by @patrickvonplaten in #1449
- [Dreambooth] Make compatible with alt diffusion by @patrickvonplaten in #1470
- Add better docs xformers by @patrickvonplaten in #1487
- Remove reminder comment by @pcuenca in #1489
- Bump to 0.10.0.dev0 + deprecations by @anton-l in #1490
- Add doc for Stable Diffusion on Habana Gaudi by @regisss in #1496
- Replace deprecated hub utils in `train_unconditional_ort` by @anton-l in #1504
- [Deprecate] Correct stacklevel by @patrickvonplaten in #1483
- simplyfy AttentionBlock by @patil-suraj in #1492
- Standardize on using `image` argument in all pipelines by @fboulnois in #1361
- support v prediction in other schedulers by @patil-suraj in #1505
- Fix Flax flip_sin_to_cos by @akashgokul in #1369
- Add an explicit `--image_size` to the conversion script by @anton-l in #1509
- fix heun scheduler by @patil-suraj in #1512
- [docs] [dreambooth training] accelerate.utils.write_basic_config by @williamberman in #1513
- [docs] [dreambooth training] num_class_images clarification by @williamberman in #1508
- [From pretrained] Allow returning local path by @patrickvonplaten in #1450
- Update conversion script to correctly handle SD 2 by @patrickvonplaten in #1511
- [refactor] Making the xformers mem-efficient attention activation recursive by @blefaudeux in #1493
- Do not use torch.long in mps by @pcuenca in #1488
- Fix Imagic example by @dhruvrnaik in #1520
- Fix training docs to install datasets by @pedrogengo in #1476
- Finalize 2nd order schedulers by @patrickvonplaten in #1503
- Fixed mask+masked_image in sd inpaint pipeline by @antoche in #1516
- Create train_dreambooth_inpaint.py by @thedarkzeno in #1091
- Update FlaxLMSDiscreteScheduler by @dzlab in #1474
- [Proposal] Support saving to safetensors by @MatthieuBizien in #1494
- Add xformers attention to VAE by @kig in #1507
- [CI] Add slow MPS tests by @anton-l in #1104
- [Stable Diffusion Inpaint] Allow tensor as input image & mask by @patrickvonplaten in #1527
- Compute embedding distances with torch.cdist by @blefaudeux in #1459
- [Upscaling] Fix batch size by @patrickvonplaten in #1525
- Update bug-report.yml by @patrickvonplaten in #1548
- [Community Pipeline] Checkpoint Merger based on Automatic1111 by @Abhinay1997 in #1472
- [textual_inversion] Add an option for only saving the embeddings by @allo- in #781
- [examples] use from_pretrained to load scheduler by @patil-suraj in #1549
- fix mask discrepancies in train_dreambooth_inpaint by @thedarkzeno in #1529
- [refactor] make set_attention_slice recursive by @patil-suraj in #1532
- Research folder by @patrickvonplaten in #1553
- add AudioDiffusionPipeline and LatentAudioDiffusionPipeline #1334 by @teticio in #1426
- [Community download] Fix cache dir by @patrickvonplaten in #1555
- [Docs] Correct docs by @patrickvonplaten in #1554
- Fix typo by @pcuenca in #1558
- [docs] [dreambooth training] default accelerate config by @williamberman in #1564
- Mega community pipeline by @patrickvonplaten in #1561
- [examples] add check_min_version by @patil-suraj in #1550
- [dreambooth] make collate_fn global by @patil-suraj in #1547
- Standardize fast pipeline tests with PipelineTestMixin by @anton-l in #1526
- Add paint by example by @patrickvonplaten in #1533
- [Community Pipeline] fix lpw_stable_diffusion by @SkyTNT in #1570
- [Paint by Example] Better default for image width by @patrickvonplaten in #1587
- Add from_pretrained telemetry by @anton-l in #1461
- Correct order height & width in pipeline_paint_by_example.py by @Fantasy-Studio in #1589
- Fix common tests for FP16 by @anton-l in #1588
- [UNet2DConditionModel] add an option to upcast attention to fp32 by @patil-suraj in #1590
- Flax: avoid recompilation when params change by @pcuenca in #1096
- Add Singlestep DPM-Solver (singlestep high-order schedulers) by @LuChengTHU in #1442
- fix upcast in slice attention by @patil-suraj in #1591
- Update scheduling_repaint.py by @Randolph-zeng in #1582
- Update RL docs for better sharing / adding models by @natolambert in #1563
- Make cross-attention check more robust by @pcuenca in #1560
- [ONNX] Fix flaky tests by @anton-l in #1593
- Trivial fix for undefined symbol in train_dreambooth.py by @bcsherma in #1598
- [K Diffusion] Add k diffusion sampler natively by @patrickvonplaten in #1603
- [Versatile Diffusion] add upcast_attention by @patil-suraj in #1605
- Fix PyCharm/VSCode static type checking for dummy objects by @anton-l in #1596