

v0.11.0: Karlo UnCLIP, safetensors, pipeline versions


:magic_wand: Karlo UnCLIP by Kakao Brain

Karlo is a text-conditional image generation model based on OpenAI's unCLIP architecture, with an improved super-resolution module that upscales images from 64px to 256px while recovering high-frequency details in only a small number of denoising steps.

This alpha version of Karlo was trained on 115M image-text pairs, including a high-quality subset of COYO-100M, CC3M, and CC12M. For more information about the architecture, see the Karlo repository: https://github.com/kakaobrain/karlo

pip install diffusers transformers safetensors accelerate
import torch
from diffusers import UnCLIPPipeline

pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "a high-resolution photograph of a big red frog on a green leaf."
image = pipe(prompt).images[0]


:octocat: Community pipeline versioning

The community pipelines hosted in diffusers/examples/community will now follow the installed version of the library.

For example, if you have diffusers==0.9.0 installed, the pipelines from the v0.9.0 branch will be used: https://github.com/huggingface/diffusers/tree/v0.9.0/examples/community

If you've installed diffusers from source, e.g. with pip install git+https://github.com/huggingface/diffusers, then the latest versions of the pipelines will be fetched from the main branch.

To pin the custom pipeline to a specific version, pass the custom_revision argument like so:

pipeline = DiffusionPipeline.from_pretrained(
    "google/ddpm-cifar10-32", custom_pipeline="one_step_unet", custom_revision="0.10.2"
)

:safety_vest: safetensors

Many of the most important checkpoints now have safetensors (https://github.com/huggingface/safetensors) weights available. After installing safetensors with:

pip install safetensors

you will see a nice speed-up when loading your model :rocket:

Some of the most important checkpoints now have safetensors weights added:

Batched generation bug fixes :bug:

  • Make sure all pipelines can run with batched input by @patrickvonplaten in #1669

We fixed a lot of bugs for batched generation. All pipelines should now correctly process batches of prompts and images :hugs: We also made it much easier to tweak images with reproducible seeds: https://huggingface.co/docs/diffusers/using-diffusers/reusing_seeds
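The idea behind reproducible batched seeds can be sketched as follows (a hedged example, not code from the release; the pipe call is illustrative): pass one torch.Generator per image so that image i depends only on its own seed, regardless of batch size.

```python
import torch

# One generator per image in the batch, each with its own fixed seed.
seeds = [0, 1, 2, 3]
generators = [torch.Generator("cpu").manual_seed(s) for s in seeds]

# In a pipeline this would look like (illustrative, `pipe` not defined here):
#   images = pipe([prompt] * len(seeds), generator=generators).images

# Demonstrate reproducibility: redrawing with freshly seeded generators
# yields identical noise for each batch element.
noise_a = [torch.randn(2, 2, generator=g) for g in generators]
generators = [torch.Generator("cpu").manual_seed(s) for s in seeds]
noise_b = [torch.randn(2, 2, generator=g) for g in generators]
assert all(torch.equal(a, b) for a, b in zip(noise_a, noise_b))
```

Because each element of the batch has an independent generator, you can rerun a single seed in isolation to tweak one image without regenerating the whole batch.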

:memo: Changelog

  • Remove spurious arg in training scripts by @pcuenca in #1644
  • dreambooth: fix #1566: maintain fp32 wrapper when saving a checkpoint to avoid crash when running fp16 by @timh in #1618
  • Allow k pipeline to generate > 1 images by @pcuenca in #1645
  • Remove unnecessary offset in img2img by @patrickvonplaten in #1653
  • Remove unnecessary kwargs in depth2img by @maruel in #1648
  • Add text encoder conversion by @lawfordp2017 in #1559
  • VersatileDiffusion: fix input processing by @LukasStruppek in #1568
  • tensor format ort bug fix by @prathikr in #1557
  • Deprecate init image correctly by @patrickvonplaten in #1649
  • fix bug if we don't do_classifier_free_guidance by @MKFMIKU in #1601
  • Handle missing global_step key in scripts/convert_original_stable_diffusion_to_diffusers.py by @Cyberes in #1612
  • [SD] Make sure scheduler is correct when converting by @patrickvonplaten in #1667
  • [Textual Inversion] Do not update other embeddings by @patrickvonplaten in #1665
  • Added Community pipeline for comparing Stable Diffusion v1.1-4 checkpoints by @suvadityamuk in #1584
  • Fix wrong type checking in convert_diffusers_to_original_stable_diffusion.py by @apolinario in #1681
  • [Version] Bump to 0.11.0.dev0 by @patrickvonplaten in #1682
  • Dreambooth: save / restore training state by @pcuenca in #1668
  • Disable telemetry when DISABLE_TELEMETRY is set by @w4ffl35 in #1686
  • Change one-step dummy pipeline for testing by @patrickvonplaten in #1690
  • [Community pipeline] Add github mechanism by @patrickvonplaten in #1680
  • Dreambooth: use warnings instead of logger in parse_args() by @pcuenca in #1688
  • manually update train_unconditional_ort by @prathikr in #1694
  • Remove all local telemetry by @anton-l in #1702
  • Update main docs by @patrickvonplaten in #1706
  • [Readme] Clarify package owners by @anton-l in #1707
  • Fix the bug that torch version less than 1.12 throws TypeError by @chinoll in #1671
  • RePaint fast tests and API conforming by @anton-l in #1701
  • Add state checkpointing to other training scripts by @pcuenca in #1687
  • Improve pipeline_stable_diffusion_inpaint_legacy.py by @cyber-meow in #1585
  • apply amp bf16 on textual inversion by @jiqing-feng in #1465
  • Add examples with Intel optimizations by @hshen14 in #1579
  • Added a README page for docs and a "schedulers" page by @yiyixuxu in #1710
  • Accept latents as optional input in Latent Diffusion pipeline by @daspartho in #1723
  • Fix ONNX img2img preprocessing and add fast tests coverage by @anton-l in #1727
  • Fix ldm tests on master by not running the CPU tests on GPU by @patrickvonplaten in #1729
  • Docs: recommend xformers by @pcuenca in #1724
  • Nightly integration tests by @anton-l in #1664
  • [Batched Generators] This PR adds generators that are useful to make batched generation fully reproducible by @patrickvonplaten in #1718
  • Fix ONNX img2img preprocessing by @peterto in #1736
  • Fix MPS fast test warnings by @anton-l in #1744
  • Fix/update the LDM pipeline and tests by @anton-l in #1743
  • kakaobrain unCLIP by @williamberman in #1428
  • [fix] pipeline_unclip generator by @williamberman in #1751
  • unCLIP docs by @williamberman in #1754
  • Correct help text for scheduler_type flag in scripts. by @msiedlarek in #1749
  • Add resnet_time_scale_shift to VD layers by @anton-l in #1757
  • Add attention mask to uclip by @patrickvonplaten in #1756
  • Support attn2==None for xformers by @anton-l in #1759
  • [UnCLIPPipeline] fix num_images_per_prompt by @patil-suraj in #1762
  • Add CPU offloading to UnCLIP by @anton-l in #1761
  • [Versatile] fix attention mask by @patrickvonplaten in #1763
  • [Revision] Don't recommend using revision by @patrickvonplaten in #1764
  • [Examples] Update train_unconditional.py to include logging argument for Wandb by @ash0ts in #1719
  • Transformers version req for UnCLIP by @anton-l in #1766

Fetched April 7, 2026