v0.10.0 — Diffusers — releases.sh

🐳 Depth-Guided Stable Diffusion and 2.1 checkpoints

The new depth-guided stable diffusion model is fully supported in this release. The model is conditioned on monocular depth estimates inferred via MiDaS and can be used for structure-preserving img2img and shape-conditional synthesis.

Installing the transformers library from source is required for the MiDaS model:

pip install --upgrade git+https://github.com/huggingface/transformers/

import torch
import requests
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
   "stabilityai/stable-diffusion-2-depth",
   torch_dtype=torch.float16,
).to("cuda")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = Image.open(requests.get(url, stream=True).raw)

prompt = "two tigers"
n_prompt = "bad, deformed, ugly, bad anatomy"
image = pipe(prompt=prompt, image=init_image, negative_prompt=n_prompt, strength=0.7).images[0]

The updated Stable Diffusion 2.1 checkpoints are also released and fully supported.

🦺 Safe Tensors

We now support SafeTensors: a new simple format for storing tensors safely (as opposed to pickle) that is still fast (zero-copy).

[Proposal] Support loading from safetensors if file is present. by @Narsil in #1357
[Proposal] Support saving to safetensors by @MatthieuBizien in #1494

Format	Safe	Zero-copy	Lazy loading	No file size limit	Layout control	Flexibility	Bfloat16
pickle (PyTorch)	✗	✗	✗	✓	✗	✓	✓
H5 (Tensorflow)	✓	✗	✓	✓	~	~	✗
SavedModel (Tensorflow)	✓	✗	✗	✓	✓	✗	✓
MsgPack (flax)	✓	✓	✗	✓	✗	✗	✓
SafeTensors	✓	✓	✓	✓	✓	✗	✓

More details about the comparison here: https://github.com/huggingface/safetensors#yet-another-format-

pip install safetensors

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.save_pretrained("./safe-stable-diffusion-2-1", safe_serialization=True)

# you can also push this checkpoint to the HF Hub and load from there
safe_pipe = StableDiffusionPipeline.from_pretrained("./safe-stable-diffusion-2-1")
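
To make the "safe" and "lazy loading" columns above more concrete, here is a minimal standard-library sketch of a safetensors-style file layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. It is a simplified illustration, not the library's implementation; the helper names (`pack_tensors`, `read_header`, `read_tensor_bytes`) are hypothetical, and details such as header padding and metadata are left out.

```python
import json
import struct


def pack_tensors(tensors):
    """Serialize {name: (dtype, shape, raw_bytes)} into one byte string."""
    header = {}
    chunks = []
    offset = 0
    for name, (dtype, shape, raw) in tensors.items():
        # Offsets are relative to the start of the data section.
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        chunks.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    # Layout: 8-byte little-endian header size, JSON header, raw data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + b"".join(chunks)


def read_header(blob):
    """Parse only the JSON header; no code is executed and no tensor is read."""
    (header_size,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8 : 8 + header_size])
    return header, 8 + header_size


def read_tensor_bytes(blob, name):
    """Slice out a single tensor without touching the others."""
    header, data_start = read_header(blob)
    begin, end = header[name]["data_offsets"]
    return blob[data_start + begin : data_start + end]


weights = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
bias = struct.pack("<2f", 0.5, -0.5)
blob = pack_tensors({
    "weight": ("F32", [2, 2], weights),
    "bias": ("F32", [2], bias),
})

header, _ = read_header(blob)
bias_values = struct.unpack("<2f", read_tensor_bytes(blob, "bias"))
print(header["weight"])
print(bias_values)
```

Because loading only parses JSON and slices bytes, reading a file never runs arbitrary code, which is the main contrast with pickle. The per-tensor offsets are what make it possible to read one tensor without loading the rest.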

New Pipelines

🖌️ Paint-by-example

An implementation of Paint by Example: Exemplar-based Image Editing with Diffusion Models by Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen

Add paint by example by @patrickvonplaten in #1533

import PIL.Image
import requests
import torch
from io import BytesIO
from diffusers import DiffusionPipeline

def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")

img_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/image/example_1.png"
mask_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/mask/example_1.png"
example_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/reference/example_1.jpg"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))
example_image = download_image(example_url).resize((512, 512))

pipe = DiffusionPipeline.from_pretrained("Fantasy-Studio/Paint-by-Example", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

image = pipe(image=init_image, mask_image=mask_image, example_image=example_image).images[0]

Audio Diffusion and Latent Audio Diffusion

Audio Diffusion leverages the recent advances in image generation using diffusion models by converting audio samples to and from mel spectrogram images.

add AudioDiffusionPipeline and LatentAudioDiffusionPipeline #1334 by @teticio in #1426

from IPython.display import Audio, display
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("teticio/audio-diffusion-ddim-256").to("cuda")

output = pipe()
display(output.images[0])
display(Audio(output.audios[0], rate=pipe.mel.get_sample_rate()))
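
As a rough illustration of what "converting audio samples to and from mel spectrogram images" involves, the sketch below shows two of the mappings: frequency to the mel scale and back, and a decibel level to an 8-bit grayscale pixel and back. The HTK-style mel formula is a common convention; whether the pipeline uses exactly this formula, and the 80 dB dynamic range (`top_db`), are assumptions made for illustration. All function names are hypothetical.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def db_to_pixel(db, top_db=80.0):
    """Map a level in [-top_db, 0] dB to an 8-bit grayscale value."""
    clipped = max(-top_db, min(0.0, db))
    return round((clipped + top_db) / top_db * 255)


def pixel_to_db(pixel, top_db=80.0):
    """Approximate inverse of db_to_pixel (lossy because of rounding)."""
    return pixel / 255 * top_db - top_db


mel_440 = hz_to_mel(440.0)
roundtrip_hz = mel_to_hz(mel_440)
pixel = db_to_pixel(-20.0)
recovered_db = pixel_to_db(pixel)
print(mel_440, roundtrip_hz, pixel, recovered_db)
```

The frequency mapping is exactly invertible, while the pixel quantization loses a small amount of level information, at most half a gray level.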

[Experimental] K-Diffusion pipeline for Stable Diffusion

This pipeline is added to support the latest schedulers from @crowsonkb's k-diffusion. Its purpose is to compare scheduler implementations and updates, so new features from other pipelines are unlikely to be supported!

[K Diffusion] Add k diffusion sampler natively by @patrickvonplaten in #1603

pip install k-diffusion

from diffusers import StableDiffusionKDiffusionPipeline
import torch

pipe = StableDiffusionKDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base")
pipe = pipe.to("cuda")

pipe.set_scheduler("sample_heun")
image = pipe("astronaut riding horse", num_inference_steps=25).images[0]

New Schedulers

Heun scheduler inspired by Karras et al.

Algorithm 1 of Karras et al. Scheduler ported from @crowsonkb’s k-diffusion

Add 2nd order heun scheduler by @patrickvonplaten in #1336

from diffusers import HeunDiscreteScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = HeunDiscreteScheduler.from_config(pipe.scheduler.config)
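
For readers unfamiliar with second-order solvers, the following sketch shows generic Heun integration on a toy ODE and compares it with a first-order Euler step. It illustrates only the underlying numerical method; it is not the `HeunDiscreteScheduler` implementation, and the toy problem and helper names are assumptions chosen for the example.

```python
import math


def euler_step(f, x, t, dt):
    """First-order step: follow the slope at the current point."""
    return x + dt * f(x, t)


def heun_step(f, x, t, dt):
    """Second-order step: average the slope at the start and at an Euler-predicted end point."""
    d1 = f(x, t)
    x_pred = x + dt * d1          # Euler predictor
    d2 = f(x_pred, t + dt)        # slope at the predicted point
    return x + dt * 0.5 * (d1 + d2)


def integrate(step, f, x0, t0, t1, n_steps):
    """Apply `step` n_steps times on a uniform grid from t0 to t1."""
    x, t = x0, t0
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        x = step(f, x, t, dt)
        t += dt
    return x


def decay(x, t):
    """Toy ODE dx/dt = -x, with exact solution x(t) = exp(-t)."""
    return -x


exact = math.exp(-1.0)
euler_error = abs(integrate(euler_step, decay, 1.0, 0.0, 1.0, 10) - exact)
heun_error = abs(integrate(heun_step, decay, 1.0, 0.0, 1.0, 10) - exact)
print(f"Euler error: {euler_error:.5f}, Heun error: {heun_error:.5f}")
```

Each Heun step evaluates the slope twice. In a diffusion sampler, that corresponds to two model evaluations per step, in exchange for a more accurate update.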

Single step DPM-Solver

Original paper can be found here and the improved version. The original implementation can be found here.

Add Singlestep DPM-Solver (singlestep high-order schedulers) by @LuChengTHU in #1442

from diffusers import DPMSolverSinglestepScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = DPMSolverSinglestepScheduler.from_config(pipe.scheduler.config)

📝 Changelog

[Proposal] Support loading from safetensors if file is present. by @Narsil in #1357
Hotfix for AttributeErrors in OnnxStableDiffusionInpaintPipelineLegacy by @anton-l in #1448
Speed up test and remove kwargs from call by @patrickvonplaten in #1446
v-prediction training support by @patil-suraj in #1455
Fix Flax from_pt by @pcuenca in #1436
Ensure Flax pipeline always returns numpy array by @pcuenca in #1435
Add 2nd order heun scheduler by @patrickvonplaten in #1336
fix slow tests by @patrickvonplaten in #1467
Flax support for Stable Diffusion 2 by @pcuenca in #1423
Updates Image to Image Inpainting community pipeline README by @vvvm23 in #1370
StableDiffusion: Decode latents separately to run larger batches by @kig in #1150
Fix bug in half precision for DPMSolverMultistepScheduler by @rtaori in #1349
[Train unconditional] Unwrap model before EMA by @anton-l in #1469
Add ort_nightly_directml to the onnxruntime candidates by @anton-l in #1458
Allow saving trained betas by @patrickvonplaten in #1468
Fix dtype model loading by @patrickvonplaten in #1449
[Dreambooth] Make compatible with alt diffusion by @patrickvonplaten in #1470
Add better docs xformers by @patrickvonplaten in #1487
Remove reminder comment by @pcuenca in #1489
Bump to 0.10.0.dev0 + deprecations by @anton-l in #1490
Add doc for Stable Diffusion on Habana Gaudi by @regisss in #1496
Replace deprecated hub utils in train_unconditional_ort by @anton-l in #1504
[Deprecate] Correct stacklevel by @patrickvonplaten in #1483
simplyfy AttentionBlock by @patil-suraj in #1492
Standardize on using image argument in all pipelines by @fboulnois in #1361
support v prediction in other schedulers by @patil-suraj in #1505
Fix Flax flip_sin_to_cos by @akashgokul in #1369
Add an explicit --image_size to the conversion script by @anton-l in #1509
fix heun scheduler by @patil-suraj in #1512
[docs] [dreambooth training] accelerate.utils.write_basic_config by @williamberman in #1513
[docs] [dreambooth training] num_class_images clarification by @williamberman in #1508
[From pretrained] Allow returning local path by @patrickvonplaten in #1450
Update conversion script to correctly handle SD 2 by @patrickvonplaten in #1511
[refactor] Making the xformers mem-efficient attention activation recursive by @blefaudeux in #1493
Do not use torch.long in mps by @pcuenca in #1488
Fix Imagic example by @dhruvrnaik in #1520
Fix training docs to install datasets by @pedrogengo in #1476
Finalize 2nd order schedulers by @patrickvonplaten in #1503
Fixed mask+masked_image in sd inpaint pipeline by @antoche in #1516
Create train_dreambooth_inpaint.py by @thedarkzeno in #1091
Update FlaxLMSDiscreteScheduler by @dzlab in #1474
[Proposal] Support saving to safetensors by @MatthieuBizien in #1494
Add xformers attention to VAE by @kig in #1507
[CI] Add slow MPS tests by @anton-l in #1104
[Stable Diffusion Inpaint] Allow tensor as input image & mask by @patrickvonplaten in #1527
Compute embedding distances with torch.cdist by @blefaudeux in #1459
[Upscaling] Fix batch size by @patrickvonplaten in #1525
Update bug-report.yml by @patrickvonplaten in #1548
[Community Pipeline] Checkpoint Merger based on Automatic1111 by @Abhinay1997 in #1472
[textual_inversion] Add an option for only saving the embeddings by @allo- in #781
[examples] use from_pretrained to load scheduler by @patil-suraj in #1549
fix mask discrepancies in train_dreambooth_inpaint by @thedarkzeno in #1529
[refactor] make set_attention_slice recursive by @patil-suraj in #1532
Research folder by @patrickvonplaten in #1553
add AudioDiffusionPipeline and LatentAudioDiffusionPipeline #1334 by @teticio in #1426
[Community download] Fix cache dir by @patrickvonplaten in #1555
[Docs] Correct docs by @patrickvonplaten in #1554
Fix typo by @pcuenca in #1558
[docs] [dreambooth training] default accelerate config by @williamberman in #1564
Mega community pipeline by @patrickvonplaten in #1561
[examples] add check_min_version by @patil-suraj in #1550
[dreambooth] make collate_fn global by @patil-suraj in #1547
Standardize fast pipeline tests with PipelineTestMixin by @anton-l in #1526
Add paint by example by @patrickvonplaten in #1533
[Community Pipeline] fix lpw_stable_diffusion by @SkyTNT in #1570
[Paint by Example] Better default for image width by @patrickvonplaten in #1587
Add from_pretrained telemetry by @anton-l in #1461
Correct order height & width in pipeline_paint_by_example.py by @Fantasy-Studio in #1589
Fix common tests for FP16 by @anton-l in #1588
[UNet2DConditionModel] add an option to upcast attention to fp32 by @patil-suraj in #1590
Flax: avoid recompilation when params change by @pcuenca in #1096
Add Singlestep DPM-Solver (singlestep high-order schedulers) by @LuChengTHU in #1442
fix upcast in slice attention by @patil-suraj in #1591
Update scheduling_repaint.py by @Randolph-zeng in #1582
Update RL docs for better sharing / adding models by @natolambert in #1563
Make cross-attention check more robust by @pcuenca in #1560
[ONNX] Fix flaky tests by @anton-l in #1593
Trivial fix for undefined symbol in train_dreambooth.py by @bcsherma in #1598
[K Diffusion] Add k diffusion sampler natively by @patrickvonplaten in #1603
[Versatile Diffusion] add upcast_attention by @patil-suraj in #1605
Fix PyCharm/VSCode static type checking for dummy objects by @anton-l in #1596