v0.10.0

v0.10.0: Depth Guidance and Safer Checkpoints

🐳 Depth-Guided Stable Diffusion and 2.1 checkpoints

The new depth-guided stable diffusion model is fully supported in this release. The model is conditioned on monocular depth estimates inferred via MiDaS and can be used for structure-preserving img2img and shape-conditional synthesis.

Installing the transformers library from source is required for the MiDaS model:

pip install --upgrade git+https://github.com/huggingface/transformers/

import torch
import requests
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
   "stabilityai/stable-diffusion-2-depth",
   torch_dtype=torch.float16,
).to("cuda")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = Image.open(requests.get(url, stream=True).raw)

prompt = "two tigers"
n_prompt = "bad, deformed, ugly, bad anatomy"
image = pipe(prompt=prompt, image=init_image, negative_prompt=n_prompt, strength=0.7).images[0]
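
To make the depth conditioning more concrete, here is a minimal pure-Python sketch of one way a depth estimate can be turned into an extra conditioning channel: the map is rescaled to [-1, 1] and appended to the latent channels. The helper names `normalize_depth` and `add_depth_channel`, the toy 2x2 shapes, and the handling of constant maps are assumptions for illustration, not the pipeline's actual code.

```python
def normalize_depth(depth_map):
    """Rescale a 2D depth map (a list of rows) to the range [-1, 1]."""
    flat = [d for row in depth_map for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == d_min:
        # A constant map carries no structure; map it to zeros (illustrative choice).
        return [[0.0 for _ in row] for row in depth_map]
    scale = 2.0 / (d_max - d_min)
    return [[(d - d_min) * scale - 1.0 for d in row] for row in depth_map]


def add_depth_channel(latent_channels, depth_map):
    """Append the normalized depth map as one extra conditioning channel."""
    return latent_channels + [normalize_depth(depth_map)]


# Four toy latent channels of size 2x2 (hypothetical values).
latents = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(4)]
# Toy depth values in arbitrary units.
depth = [[1.0, 2.0], [3.0, 5.0]]

conditioned = add_depth_channel(latents, depth)
print(len(conditioned), conditioned[-1])
```

In this sketch the denoising network would see five channels instead of four, which is how the depth structure of the input image can constrain the generated output.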

The updated Stable Diffusion 2.1 checkpoints are also released and fully supported.

🦺 Safe Tensors

We now support SafeTensors: a new simple format for storing tensors safely (as opposed to pickle) that is still fast (zero-copy).

  • [Proposal] Support loading from safetensors if file is present. by @Narsil in #1357
  • [Proposal] Support saving to safetensors by @MatthieuBizien in #1494
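
| Format | Safe | Zero-copy | Lazy loading | No file size limit | Layout control | Flexibility | Bfloat16 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pickle (PyTorch) | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
| H5 (Tensorflow) | ✓ | ✗ | ✓ | ✓ | ~ | ~ | ✗ |
| SavedModel (Tensorflow) | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ |
| MsgPack (flax) | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ |
| SafeTensors | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ |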

More details about the comparison are available here: https://github.com/huggingface/safetensors#yet-another-format-

pip install safetensors

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.save_pretrained("./safe-stable-diffusion-2-1", safe_serialization=True)

# you can also push this checkpoint to the HF Hub and load from there
safe_pipe = StableDiffusionPipeline.from_pretrained("./safe-stable-diffusion-2-1")
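
To illustrate why a format like this can be both safe and zero-copy, the following is a simplified sketch of a safetensors-style layout: an 8-byte little-endian header length, a JSON header describing each tensor's dtype, shape, and byte offsets, and then the raw tensor bytes. Loading only parses JSON and slices bytes, so no code is executed as it would be with pickle. The helper names `write_tensors`, `read_header`, and `read_tensor` are hypothetical, and details such as metadata and alignment are omitted; use the safetensors library for real checkpoints.

```python
import json
import struct


def write_tensors(tensors):
    """Serialize {name: (dtype, shape, raw_bytes)} into a safetensors-like blob."""
    header, payload, offset = {}, b"", 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        payload += raw
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    # Layout: 8-byte little-endian header length, JSON header, raw data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_header(blob):
    """Parse the JSON header and return it with the offset where data starts."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + header_len])
    return header, 8 + header_len


def read_tensor(blob, name):
    """Return (dtype, shape, view) for one tensor without copying its bytes."""
    header, data_start = read_header(blob)
    begin, end = header[name]["data_offsets"]
    # Slicing a memoryview references the original buffer instead of copying it.
    view = memoryview(blob)[data_start + begin:data_start + end]
    return header[name]["dtype"], header[name]["shape"], view


weights = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
bias = struct.pack("<2f", 0.5, -0.5)
blob = write_tensors({
    "weight": ("F32", [2, 2], weights),
    "bias": ("F32", [2], bias),
})

dtype, shape, raw = read_tensor(blob, "bias")
print(dtype, shape, struct.unpack("<2f", raw))
```

Because each tensor is addressed by its byte offsets, a reader can fetch a single tensor without touching the others, which is the idea behind the lazy loading column in the comparison.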

New Pipelines

🖌️ Paint-by-example

An implementation of Paint by Example: Exemplar-based Image Editing with Diffusion Models by Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen

  • Add paint by example by @patrickvonplaten in #1533

import PIL
import requests
import torch
from io import BytesIO
from diffusers import DiffusionPipeline

def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")

img_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/image/example_1.png"
mask_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/mask/example_1.png"
example_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/reference/example_1.jpg"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))
example_image = download_image(example_url).resize((512, 512))

pipe = DiffusionPipeline.from_pretrained("Fantasy-Studio/Paint-by-Example", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

image = pipe(image=init_image, mask_image=mask_image, example_image=example_image).images[0]

Audio Diffusion and Latent Audio Diffusion

Audio Diffusion leverages the recent advances in image generation using diffusion models by converting audio samples to and from mel spectrogram images.

  • add AudioDiffusionPipeline and LatentAudioDiffusionPipeline #1334 by @teticio in #1426

from IPython.display import Audio, display
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("teticio/audio-diffusion-ddim-256").to("cuda")

output = pipe()
display(output.images[0])
display(Audio(output.audios[0], rate=pipe.mel.get_sample_rate()))
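
The conversion between audio and spectrogram images can be sketched as follows. This pure-Python example shows one plausible mapping: frequencies are placed on the mel scale, and each spectrogram power value is converted to decibels, clipped to a fixed dynamic range, and quantized to an 8-bit grayscale pixel that can later be inverted. The 80 dB range, the HTK-style mel formula, and the helper names are assumptions for illustration and are not taken from the pipeline.

```python
import math

TOP_DB = 80.0  # assumed dynamic range in decibels


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to the mel scale (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def power_to_pixel(power, ref_power, top_db=TOP_DB):
    """Map a spectrogram power value to an 8-bit grayscale pixel (0..255)."""
    db = 10.0 * math.log10(max(power, 1e-10) / ref_power)
    db = min(0.0, max(-top_db, db))  # clip to [-top_db, 0]
    return int(round((db + top_db) * 255.0 / top_db))


def pixel_to_power(pixel, ref_power, top_db=TOP_DB):
    """Invert power_to_pixel, up to quantization error."""
    db = pixel * top_db / 255.0 - top_db
    return ref_power * 10.0 ** (db / 10.0)


pixel = power_to_pixel(0.01, ref_power=1.0)
recovered = pixel_to_power(pixel, ref_power=1.0)
print(pixel, recovered, hz_to_mel(1000.0))
```

The round trip is lossy only by the quantization step (at most half of 80/255 dB in this sketch), which is why a generated image can be turned back into audible sound once phase is reconstructed.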

[Experimental] K-Diffusion pipeline for Stable Diffusion

This pipeline adds support for the latest schedulers from @crowsonkb's k-diffusion. Its purpose is to compare scheduler implementations and updates, so new features from other pipelines are unlikely to be supported!

  • [K Diffusion] Add k diffusion sampler natively by @patrickvonplaten in #1603

pip install k-diffusion

from diffusers import StableDiffusionKDiffusionPipeline
import torch

pipe = StableDiffusionKDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base")
pipe = pipe.to("cuda")

pipe.set_scheduler("sample_heun")
image = pipe("astronaut riding horse", num_inference_steps=25).images[0]

New Schedulers

Heun scheduler inspired by Karras et al.

Algorithm 1 of Karras et al. The scheduler is ported from @crowsonkb's k-diffusion.

  • Add 2nd order heun scheduler by @patrickvonplaten in #1336

from diffusers import HeunDiscreteScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = HeunDiscreteScheduler.from_config(pipe.scheduler.config)
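
For intuition about what a second-order Heun sampler does, here is a minimal sketch on a toy one-dimensional problem. Each step takes an Euler prediction and then corrects it by averaging the slopes at both ends of the step, falling back to plain Euler on the final step to sigma = 0. The toy Gaussian data distribution, its ideal denoiser, `SIGMA_DATA`, and all function names are assumptions chosen so the exact answer is known; this is not the scheduler's implementation.

```python
import math

SIGMA_DATA = 0.5  # std of the toy Gaussian data distribution (assumption)


def karras_sigmas(n, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, followed by a final 0."""
    lo, hi = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    return [(hi + i / (n - 1) * (lo - hi)) ** rho for i in range(n)] + [0.0]


def denoise(x, sigma):
    """Ideal denoiser for 1-D data drawn from N(0, SIGMA_DATA**2)."""
    return x * SIGMA_DATA**2 / (SIGMA_DATA**2 + sigma**2)


def sample(x, sigmas, second_order=True):
    """Integrate the probability-flow ODE from sigmas[0] down to 0."""
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        d_cur = (x - denoise(x, s_cur)) / s_cur  # slope dx/dsigma at s_cur
        x_next = x + (s_next - s_cur) * d_cur  # Euler prediction
        if second_order and s_next > 0:
            # Heun correction: average the slopes at both ends of the step.
            d_next = (x_next - denoise(x_next, s_next)) / s_next
            x_next = x + (s_next - s_cur) * 0.5 * (d_cur + d_next)
        x = x_next
    return x


sigmas = karras_sigmas(50)
x_start = 80.0
# Closed-form solution of this toy ODE at sigma = 0.
exact = x_start * SIGMA_DATA / math.sqrt(SIGMA_DATA**2 + sigmas[0] ** 2)
euler = sample(x_start, sigmas, second_order=False)
heun = sample(x_start, sigmas, second_order=True)
print(abs(euler - exact), abs(heun - exact))
```

The correction costs a second denoiser evaluation per step, so a Heun sampler trades roughly twice the model calls per step for a smaller discretization error at the same number of steps.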

Single step DPM-Solver

This scheduler comes from the original DPM-Solver paper and its improved version; the authors also provide an original implementation.

  • Add Singlestep DPM-Solver (singlestep high-order schedulers) by @LuChengTHU in #1442

from diffusers import DPMSolverSinglestepScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = DPMSolverSinglestepScheduler.from_config(pipe.scheduler.config)

📝 Changelog

  • [Proposal] Support loading from safetensors if file is present. by @Narsil in #1357
  • Hotfix for AttributeErrors in OnnxStableDiffusionInpaintPipelineLegacy by @anton-l in #1448
  • Speed up test and remove kwargs from call by @patrickvonplaten in #1446
  • v-prediction training support by @patil-suraj in #1455
  • Fix Flax from_pt by @pcuenca in #1436
  • Ensure Flax pipeline always returns numpy array by @pcuenca in #1435
  • Add 2nd order heun scheduler by @patrickvonplaten in #1336
  • fix slow tests by @patrickvonplaten in #1467
  • Flax support for Stable Diffusion 2 by @pcuenca in #1423
  • Updates Image to Image Inpainting community pipeline README by @vvvm23 in #1370
  • StableDiffusion: Decode latents separately to run larger batches by @kig in #1150
  • Fix bug in half precision for DPMSolverMultistepScheduler by @rtaori in #1349
  • [Train unconditional] Unwrap model before EMA by @anton-l in #1469
  • Add ort_nightly_directml to the onnxruntime candidates by @anton-l in #1458
  • Allow saving trained betas by @patrickvonplaten in #1468
  • Fix dtype model loading by @patrickvonplaten in #1449
  • [Dreambooth] Make compatible with alt diffusion by @patrickvonplaten in #1470
  • Add better docs xformers by @patrickvonplaten in #1487
  • Remove reminder comment by @pcuenca in #1489
  • Bump to 0.10.0.dev0 + deprecations by @anton-l in #1490
  • Add doc for Stable Diffusion on Habana Gaudi by @regisss in #1496
  • Replace deprecated hub utils in train_unconditional_ort by @anton-l in #1504
  • [Deprecate] Correct stacklevel by @patrickvonplaten in #1483
  • simplyfy AttentionBlock by @patil-suraj in #1492
  • Standardize on using image argument in all pipelines by @fboulnois in #1361
  • support v prediction in other schedulers by @patil-suraj in #1505
  • Fix Flax flip_sin_to_cos by @akashgokul in #1369
  • Add an explicit --image_size to the conversion script by @anton-l in #1509
  • fix heun scheduler by @patil-suraj in #1512
  • [docs] [dreambooth training] accelerate.utils.write_basic_config by @williamberman in #1513
  • [docs] [dreambooth training] num_class_images clarification by @williamberman in #1508
  • [From pretrained] Allow returning local path by @patrickvonplaten in #1450
  • Update conversion script to correctly handle SD 2 by @patrickvonplaten in #1511
  • [refactor] Making the xformers mem-efficient attention activation recursive by @blefaudeux in #1493
  • Do not use torch.long in mps by @pcuenca in #1488
  • Fix Imagic example by @dhruvrnaik in #1520
  • Fix training docs to install datasets by @pedrogengo in #1476
  • Finalize 2nd order schedulers by @patrickvonplaten in #1503
  • Fixed mask+masked_image in sd inpaint pipeline by @antoche in #1516
  • Create train_dreambooth_inpaint.py by @thedarkzeno in #1091
  • Update FlaxLMSDiscreteScheduler by @dzlab in #1474
  • [Proposal] Support saving to safetensors by @MatthieuBizien in #1494
  • Add xformers attention to VAE by @kig in #1507
  • [CI] Add slow MPS tests by @anton-l in #1104
  • [Stable Diffusion Inpaint] Allow tensor as input image & mask by @patrickvonplaten in #1527
  • Compute embedding distances with torch.cdist by @blefaudeux in #1459
  • [Upscaling] Fix batch size by @patrickvonplaten in #1525
  • Update bug-report.yml by @patrickvonplaten in #1548
  • [Community Pipeline] Checkpoint Merger based on Automatic1111 by @Abhinay1997 in #1472
  • [textual_inversion] Add an option for only saving the embeddings by @allo- in #781
  • [examples] use from_pretrained to load scheduler by @patil-suraj in #1549
  • fix mask discrepancies in train_dreambooth_inpaint by @thedarkzeno in #1529
  • [refactor] make set_attention_slice recursive by @patil-suraj in #1532
  • Research folder by @patrickvonplaten in #1553
  • add AudioDiffusionPipeline and LatentAudioDiffusionPipeline #1334 by @teticio in #1426
  • [Community download] Fix cache dir by @patrickvonplaten in #1555
  • [Docs] Correct docs by @patrickvonplaten in #1554
  • Fix typo by @pcuenca in #1558
  • [docs] [dreambooth training] default accelerate config by @williamberman in #1564
  • Mega community pipeline by @patrickvonplaten in #1561
  • [examples] add check_min_version by @patil-suraj in #1550
  • [dreambooth] make collate_fn global by @patil-suraj in #1547
  • Standardize fast pipeline tests with PipelineTestMixin by @anton-l in #1526
  • Add paint by example by @patrickvonplaten in #1533
  • [Community Pipeline] fix lpw_stable_diffusion by @SkyTNT in #1570
  • [Paint by Example] Better default for image width by @patrickvonplaten in #1587
  • Add from_pretrained telemetry by @anton-l in #1461
  • Correct order height & width in pipeline_paint_by_example.py by @Fantasy-Studio in #1589
  • Fix common tests for FP16 by @anton-l in #1588
  • [UNet2DConditionModel] add an option to upcast attention to fp32 by @patil-suraj in #1590
  • Flax: avoid recompilation when params change by @pcuenca in #1096
  • Add Singlestep DPM-Solver (singlestep high-order schedulers) by @LuChengTHU in #1442
  • fix upcast in slice attention by @patil-suraj in #1591
  • Update scheduling_repaint.py by @Randolph-zeng in #1582
  • Update RL docs for better sharing / adding models by @natolambert in #1563
  • Make cross-attention check more robust by @pcuenca in #1560
  • [ONNX] Fix flaky tests by @anton-l in #1593
  • Trivial fix for undefined symbol in train_dreambooth.py by @bcsherma in #1598
  • [K Diffusion] Add k diffusion sampler natively by @patrickvonplaten in #1603
  • [Versatile Diffusion] add upcast_attention by @patil-suraj in #1605
  • Fix PyCharm/VSCode static type checking for dummy objects by @anton-l in #1596

Fetched April 7, 2026