v0.28.0: Marigold, PixArt Sigma, AnimateDiff SDXL, InstantStyle, VQGAN Training Script, and more
Diffusion models are known for their abilities in the space of generative modeling. This release of diffusers introduces the first official pipeline (Marigold) for discriminative tasks such as depth estimation and surface normals estimation!
Starting this release, we will also highlight the changes and features from the library that make it easy to integrate community checkpoints, features, and so on. Read on!
Marigold
Proposed in Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation, Marigold introduces a diffusion model and an associated fine-tuning protocol for monocular depth estimation. It can also be extended to perform surface normals estimation.
(Image taken from the official repository)
The code snippet below shows how to use this pipeline for depth estimation:
import diffusers
import torch

# Load the Marigold depth estimation pipeline in half precision.
pipe = diffusers.MarigoldDepthPipeline.from_pretrained(
    "prs-eth/marigold-depth-lcm-v1-0", variant="fp16", torch_dtype=torch.float16
).to("cuda")

image = diffusers.utils.load_image("https://marigoldmonodepth.github.io/images/einstein.jpg")
depth = pipe(image)

# Visualize the predicted depth map and save it.
vis = pipe.image_processor.visualize_depth(depth.prediction)
vis[0].save("einstein_depth.png")

# Export the raw prediction as a 16-bit PNG for downstream use.
depth_16bit = pipe.image_processor.export_depth_to_16bit_png(depth.prediction)
depth_16bit[0].save("einstein_depth_16bit.png")
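Surface normals estimation follows the same pattern. Below is a minimal sketch, assuming the MarigoldNormalsPipeline and the prs-eth/marigold-normals-lcm-v0-1 checkpoint shipped alongside the depth pipeline:

import diffusers
import torch

# Assumption: the normals pipeline mirrors the depth pipeline's API.
pipe = diffusers.MarigoldNormalsPipeline.from_pretrained(
    "prs-eth/marigold-normals-lcm-v0-1", variant="fp16", torch_dtype=torch.float16
).to("cuda")

image = diffusers.utils.load_image("https://marigoldmonodepth.github.io/images/einstein.jpg")
normals = pipe(image)

# Visualize the predicted surface normals and save them.
vis = pipe.image_processor.visualize_normals(normals.prediction)
vis[0].save("einstein_normals.png")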
Check out the API documentation here. We also have a detailed guide about the pipeline here.
Thanks to @toshas, one of the authors of Marigold, who contributed this in #7847.
from_single_file 🌀
We have further refactored from_single_file to align its logic more closely with the from_pretrained method. The biggest benefit of doing this is that it allows us to expand single-file loading support beyond Stable Diffusion-like pipelines and models. It also makes it easier to load models that are saved and shared in their original format.
Some of the changes introduced in this refactor:
- Pipeline and model configuration is now inferred from the checkpoint. For example, a Stable Diffusion 1.5-style checkpoint uses the runwayml/stable-diffusion-v1-5 repository to configure the model components and pipeline.
- To override the default configuration, use the config argument and pass in either a path to a local model repo or a repo id on the Hugging Face Hub:
pipe = StableDiffusionPipeline.from_single_file("...", config=<model repo id or local repo path>)
- We are deprecating model configuration arguments for the from_single_file method in pipelines, such as num_in_channels, scheduler_type, image_size, and upcast_attention. This is an anti-pattern that we supported in previous versions of the library, when we assumed it would only be relevant to Stable Diffusion-based models. However, given that there is demand to support other model types, we feel it is necessary for single-file loading behavior to adhere to the conventions set in our other loading methods. Configuring individual model components through a pipeline loading method is not something we support in from_pretrained, and therefore we will be deprecating support for this behavior in from_single_file as well.
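If you need a customized component, one pattern is to configure it yourself and pass it to the pipeline, as from_pretrained already allows. The snippet below is a minimal sketch; it assumes from_single_file accepts component overrides the way from_pretrained does, and the checkpoint path is illustrative:

import torch
from diffusers import StableDiffusionPipeline, UNet2DConditionModel

# Configure the component directly (repo and subfolder layout as in SD 1.5)...
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)

# ...and hand it to the pipeline instead of tweaking it through loading kwargs.
pipe = StableDiffusionPipeline.from_single_file(
    "path/to/checkpoint.safetensors",  # illustrative local checkpoint path
    unet=unet,
    torch_dtype=torch.float16,
)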
PixArt Sigma
PixArt Sigma is the successor to PixArt Alpha. It is capable of directly generating images at 4K resolution, and it can produce images of markedly higher fidelity and improved alignment with text prompts. It comes with a massive sequence length of 300 (for reference, PixArt Alpha has a maximum sequence length of 120)!
(Image taken from the project website.)
import torch
from diffusers import PixArtSigmaPipeline
# You can replace the checkpoint id with "PixArt-alpha/PixArt-Sigma-XL-2-512-MS" too.
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS", torch_dtype=torch.float16
)
# Enable memory optimizations.
pipe.enable_model_cpu_offload()
prompt = "A small cactus with a happy face in the Sahara desert."
image = pipe(prompt).images[0]
📃 Refer to the documentation here to learn more about PixArt Sigma.
Thanks to @lawrence-cj, one of the authors of PixArt Sigma, who contributed this in #7857.
AnimateDiff SDXL
@a-r-r-o-w contributed the Stable Diffusion XL (SDXL) version of AnimateDiff in #6721. Note that this is currently an experimental feature, as only a beta release of the motion adapter checkpoint is available.
import torch
from diffusers.models import MotionAdapter
from diffusers import AnimateDiffSDXLPipeline, DDIMScheduler
from diffusers.utils import export_to_gif
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-sdxl-beta", torch_dtype=torch.float16)
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
scheduler = DDIMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    clip_sample=False,
    beta_schedule="linear",
    steps_offset=1,
)
pipe = AnimateDiffSDXLPipeline.from_pretrained(
    model_id,
    motion_adapter=adapter,
    scheduler=scheduler,
    torch_dtype=torch.float16,
    variant="fp16",
)
# enable_model_cpu_offload() returns None, so call it on the pipeline
# instead of chaining it to the constructor.
pipe.enable_model_cpu_offload()
# enable memory savings
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
output = pipe(
    prompt="a panda surfing in the ocean, realistic, high quality",
    negative_prompt="low quality, worst quality",
    num_inference_steps=20,
    guidance_scale=8,
    width=1024,
    height=1024,
    num_frames=16,
)
frames = output.frames[0]
export_to_gif(frames, "animation.gif")
📜 Refer to the documentation to learn more.
@UmerHA contributed support for controlling the scales of different LoRA blocks in a granular manner in #7352. Depending on the LoRA checkpoint being used, this granular control can significantly impact the quality of the generated outputs. The following code block shows how this feature can be used during inference:
...
adapter_weight_scales = { "unet": { "down": 0, "mid": 1, "up": 0} }
pipe.set_adapters("pixel", adapter_weight_scales)
image = pipe(
    prompt, num_inference_steps=30, generator=torch.manual_seed(0)
).images[0]
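The scales can be nested further, down to individual transformer blocks. A minimal sketch of the general structure, reusing the "pixel" adapter from above (the exact values are illustrative):

# Scales can be given per part ("down"/"mid"/"up"), per block within a part,
# or as a list with one entry per transformer in a block. Unspecified
# components keep a scale of 1.0. Values here are illustrative.
adapter_weight_scales = {
    "text_encoder": 0.5,
    "unet": {
        "down": 0.9,                      # all blocks in the down-part
        "mid": 1.0,
        "up": {
            "block_0": 0.6,               # all transformers in up-block 0
            "block_1": [0.4, 0.8, 1.0],   # one scale per transformer
        },
    },
}
pipe.set_adapters("pixel", adapter_weight_scales)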
✍️ Refer to our documentation for more details and a full-fledged example.
InstantStyle
This more granular control of scale can be extended to IP-Adapters too. @DannHuang contributed support for InstantStyle, i.e., granular control of IP-Adapter scales, in #7668. The following code block shows how this feature can be used when performing inference with IP-Adapters:
...
scale = {
    "down": {"block_2": [0.0, 1.0]},      # layout-controlling block
    "up": {"block_0": [0.0, 1.0, 0.0]},   # style-controlling block
}
pipeline.set_ip_adapter_scale(scale)
This way, one can generate images that follow only the style or the layout of the image prompt, with significantly improved diversity. This is achieved by activating the IP-Adapter only in specific parts of the model.
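For a complete call, the image prompt is passed via ip_adapter_image. A minimal sketch, assuming pipeline is an SDXL pipeline with an IP-Adapter already loaded through load_ip_adapter, and with an illustrative style-image path:

from diffusers.utils import load_image

# Reference image whose style should be transferred (path is illustrative).
style_image = load_image("path/or/url/to/style_image.png")

image = pipeline(
    prompt="a cat, masterpiece, best quality, high quality",
    ip_adapter_image=style_image,
    negative_prompt="lowres, bad anatomy, worst quality, low quality",
    guidance_scale=5.0,
    num_inference_steps=30,
).images[0]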
Check out the documentation here.
ControlNet-XS
ControlNet-XS was introduced in ControlNet-XS by Denis Zavadski and Carsten Rother. It is based on the observation that the control model in the original ControlNet can be made much smaller and still produce good results. ControlNet-XS generates images comparable to a regular ControlNet, but it is 20-25% faster (see the benchmark with StableDiffusion-XL) and uses ~45% less memory.
ControlNet-XS is supported for both Stable Diffusion and Stable Diffusion XL.
Thanks to @UmerHA for contributing ControlNet-XS in #5827 and #6772.
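A minimal sketch for the SDXL variant is below. The adapter checkpoint id and the canny image path are assumptions; substitute the ControlNet-XS checkpoint you want to use:

import torch
from diffusers import StableDiffusionXLControlNetXSPipeline, ControlNetXSAdapter
from diffusers.utils import load_image

# Checkpoint id is illustrative; use a ControlNet-XS adapter trained for SDXL.
controlnet = ControlNetXSAdapter.from_pretrained(
    "UmerHA/Testing-ConvNetXS-SDXL-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetXSPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The conditioning image should be a preprocessed canny edge map.
canny_image = load_image("path/or/url/to/canny_edge_map.png")
image = pipe(
    "aerial view of a futuristic city, photorealistic", image=canny_image
).images[0]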
Custom Timesteps
We introduced custom timesteps support for some of our pipelines and schedulers. You can now set your scheduler with a list of arbitrary timesteps. For example, you can use the Align Your Steps (AYS) timesteps schedule to achieve very good results with only 10 denoising steps.
import torch
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
from diffusers.schedulers import AysSchedules
sampling_schedule = AysSchedules["StableDiffusionXLTimesteps"]
pipe = StableDiffusionXLPipeline.from_pretrained(
    "SG161222/RealVisXL_V4.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, algorithm_type="sde-dpmsolver++")
prompt = "A cinematic shot of a cute little rabbit wearing a jacket and doing a thumbs up"
image = pipe(prompt=prompt, timesteps=sampling_schedule).images[0]
Check out the documentation here.
device_map in Pipelines 🧪
We have introduced experimental support for device_map in our pipelines. This feature becomes relevant when you have multiple accelerators across which to distribute the components of a pipeline. Currently, we support only a “balanced” device_map. However, we plan to support other device-mapping strategies relevant to diffusion models in the future.
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    device_map="balanced",
)
image = pipeline("a dog").images[0]
In cases where you are limited to low-VRAM accelerators, you can still use device_map to benefit from them. Below, we simulate a situation where we have access to two GPUs, each with only 1GB of VRAM (through the max_memory argument).
from diffusers import DiffusionPipeline
import torch
max_memory = {0:"1GB", 1:"1GB"}
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    use_safetensors=True,
    device_map="balanced",
    max_memory=max_memory,
)
image = pipeline("a dog").images[0]
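To inspect how the components were distributed across the accelerators, you can look at the resulting device map. A minimal sketch, assuming the pipeline exposes an hf_device_map attribute like models loaded with accelerate:

# Maps each component name to the device it was assigned to,
# e.g. {"unet": 0, "text_encoder": 1, ...} (the exact layout varies).
print(pipeline.hf_device_map)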
📜 Refer to the documentation to learn more about it.
VQGAN Training Script
VQGAN, proposed in Taming Transformers for High-Resolution Image Synthesis, is a crucial component in the modern generative image modeling toolbox. Once trained, its encoder can be leveraged to compute general-purpose tokens from input images.
Thanks to @isamu-isozaki, who contributed a script and related utilities to train VQGANs in #5483. For details, refer to the official training directory.
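As a sketch of what those tokens are: the trained encoder plus quantizer map an image to discrete codebook indices. The snippet below assumes a diffusers-style VQModel checkpoint and the VectorQuantizer return signature (quantized latents, codebook loss, and a tuple whose last element holds the indices); the checkpoint path is illustrative:

import torch
from diffusers import VQModel

# Point this at your trained VQGAN checkpoint (path is illustrative).
vqgan = VQModel.from_pretrained("path/to/trained-vqgan")
images = torch.randn(1, 3, 256, 256)  # a batch of normalized images

with torch.no_grad():
    latents = vqgan.encode(images).latents
    # The quantizer returns the quantized latents, a codebook loss, and a
    # tuple whose last element holds the discrete token indices.
    quantized, _, (_, _, token_indices) = vqgan.quantize(latents)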
VideoProcessor Class
Similar to the VaeImageProcessor class, we have introduced a VideoProcessor to help make the preprocessing and postprocessing of videos easier and a little more streamlined across the pipelines. Refer to the documentation to learn more.
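A minimal sketch of the intended round trip, assuming the class mirrors the VaeImageProcessor preprocess/postprocess pattern:

from PIL import Image
from diffusers.video_processor import VideoProcessor

video = [Image.new("RGB", (512, 512)) for _ in range(16)]  # dummy frames

# Assumption: constructed with the VAE scale factor, like VaeImageProcessor.
video_processor = VideoProcessor(vae_scale_factor=8)

# List of PIL frames -> batched torch tensor in model space.
video_tensor = video_processor.preprocess_video(video)

# Model-space tensor -> frames ready for export_to_gif/export_to_video.
frames = video_processor.postprocess_video(video_tensor, output_type="pil")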
Starting with this release, we provide guides and tutorials to help users get started with some of the most frequently used tasks in image and video generation. For this release, we have a series of three guides about outpainting with different techniques.
Official Callbacks
We introduced official callbacks that you can conveniently plug into your pipeline. For example, you can turn off classifier-free guidance after a chosen fraction of the denoising steps with SDXLCFGCutoffCallback.
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.callbacks import SDXLCFGCutoffCallback
callback = SDXLCFGCutoffCallback(cutoff_step_ratio=0.4)
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
prompt = "a sports car at the road, best quality, high quality, high detail, 8k resolution"
out = pipeline(
    prompt=prompt,
    num_inference_steps=25,
    callback_on_step_end=callback,
)
Read more on our documentation 📜
from_pipe API
Starting with these release notes, we will highlight new community pipelines! More and more of our pipelines were added as community pipelines first and graduated to official pipelines once people started using them a lot. We do not require community pipelines to follow diffusers' coding style, so contributing one is the easiest way to contribute to diffusers 😊
We also introduced a from_pipe API that is very useful for community pipelines that share checkpoints with our official pipelines and improve generation quality in some way. :) You can use from_pipe(...) to load many community pipelines without additional memory requirements. With this API, you can easily switch between different pipelines to apply different techniques.
Read more about from_pipe API in our documentation 📃.
Here are four new community pipelines since our last release.
BoxDiff lets you use bounding-box coordinates for more controlled generation. Here is an example of how you can apply this technique to a Stable Diffusion pipeline you have already created (i.e., pipe_sd in the example below):
pipe_box = DiffusionPipeline.from_pipe(
    pipe_sd,
    custom_pipeline="pipeline_stable_diffusion_boxdiff",
)
pipe_box.enable_model_cpu_offload()
# The prompt should mention each of the phrases below; this one is illustrative.
prompt = "Aurora over a reindeer in a meadow, with a lake and a mountain in the distance"
phrases = ["aurora", "reindeer", "meadow", "lake", "mountain"]
boxes = [[1,3,512,202], [75,344,421,495], [1,327,508,507], [2,217,507,341], [1,135,509,242]]
boxes = [[x / 512 for x in box] for box in boxes]  # normalize box coordinates to [0, 1]
generator = torch.Generator(device="cpu").manual_seed(42)
images = pipe_box(
    prompt,
    boxdiff_phrases=phrases,
    boxdiff_boxes=boxes,
    boxdiff_kwargs={
        "attention_res": 16,
        "normalize_eot": True,
    },
    num_inference_steps=50,
    generator=generator,
).images
Check out this community pipeline here.
HD-Painter can enhance inpainting pipelines with improved prompt faithfulness and higher-resolution generation (up to 2K). You can switch from BoxDiff to HD-Painter like this:
pipe = DiffusionPipeline.from_pipe(
    pipe_box,
    custom_pipeline="hd_painter",
)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
prompt = "wooden boat"
init_image = load_image("https://raw.githubusercontent.com/Picsart-AI-Research/HD-Painter/main/__assets__/samples/images/2.jpg")
mask_image = load_image("https://raw.githubusercontent.com/Picsart-AI-Research/HD-Painter/main/__assets__/samples/masks/2.png")
image = pipe(prompt, init_image, mask_image, use_rasg=True, use_painta=True, generator=torch.manual_seed(12345)).images[0]
Check out this community pipeline here.
Differential Diffusion enables customization of the amount of change per pixel or per image region. It’s very effective in inpainting and outpainting.
pipeline = DiffusionPipeline.from_pipe(
    pipe_sdxl,
    custom_pipeline="pipeline_stable_diffusion_xl_differential_img2img",
).to("cuda")
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, use_karras_sigmas=True)
prompt = "a green pear"
negative_prompt = "blurry"
# `image` (the input image) and `mask` (the per-region change map) are
# assumed to be defined earlier; see the community pipeline docs.
image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=7.5,
    num_inference_steps=25,
    original_image=image,
    image=image,
    strength=1.0,
    map=mask,
).images[0]
Check out this community pipeline here.
FRESCO, introduced in FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation, enables zero-shot video-to-video translation. Learn more about it here.
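Like the other community pipelines above, it can be loaded through from_pipe. The custom_pipeline id below ("fresco_v2v") is an assumption; check the community pipelines directory for the exact name and call signature:

# Sketch only: load the FRESCO community pipeline from an existing
# Stable Diffusion pipeline (the pipeline id is an assumption).
pipe_fresco = DiffusionPipeline.from_pipe(
    pipe_sd,
    custom_pipeline="fresco_v2v",
)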
All commits
- distutils by @sayakpaul in #7455
- [IP-Adapter] Fix IP-Adapter Support and Refactor Callback for StableDiffusionPanoramaPipeline by @standardAI in #7262
- str_to_bool definition in testing utils by @DN6 in #7461
- [Docs] Fix typos by @standardAI in #7451
- test_lora_layers_peft.py by @UmerHA in #7394
- ConsistencyDecoderVAE by @standardAI in #7290
- test_lora_fuse_nan on mps by @UmerHA in #7481
- final_sigma_zero to UniPCMultistep by @Beinsezii in #7517
- time_context) by @KimbingNg in #7268
- from_pipe method to DiffusionPipeline by @yiyixuxu in #7241
- rescale_betas_zero_snr by @Beinsezii in #7531
- test_freeu_enabled on MPS by @UmerHA in #7570
- transformer_2d forward logic into meaningful conditions. by @sayakpaul in #7489
- libsndfile1-dev and libgl1 from workflows by @sayakpaul in #7543
- device_map support to pipelines by @sayakpaul in #6857
- logger.warn with logger.warning by @Sai-Suraj-27 in #7643
- is_cosxl_edit arg in SDXL ip2p. by @sayakpaul in #7650
- optimization by @WentianZhang-ML in #7639
- ruff configuration to avoid deprecated configuration warning by @Sai-Suraj-27 in #7637
- optimization. by @WentianZhang-ML in #7698
- type annotations for compatability with python 3.8 by @Sai-Suraj-27 in #7648
- @classmethod by @Sai-Suraj-27 in #7653
- ModelMixin by @sayakpaul in #6396
- is_sequential_cpu_offload by @yiyixuxu in #7788
- resume_download deprecation by @Wauplin in #7843
- from_single_file logic with from_pretrained by @DN6 in #7496
- _optional_components in StableCascadeCombinedPipeline by @yiyixuxu in #7894
- timesteps and sigmas by @yiyixuxu in #7817
- contributing.md file by @Sai-Suraj-27 in #7638
- save_pretrained logic for compatibility by @rebel-kblee in #7821
- diffusers-cli env by @standardAI in #7403
- added_cond_kwargs when using IP-Adapter in StableDiffusionXLControlNetInpaintPipeline by @detkov in #7924
- isinstance calls by @Sai-Suraj-27 in #7710
- cross_attention_kwargs to StableDiffusionInstructPix2PixPipeline by @AlexeyZhuravlev in #7961
- docstrings according to the Google Style Guide by @Sai-Suraj-27 in #7717
- freedesktop_os_release() in diffusers cli for Python >=3.10 by @DN6 in #8235
- resume_download deprecation V2 by @Wauplin in #8267
- from_single_file docs by @DN6 in #8268
- raise messages by @standardAI in #8272

Significant community contributions
The following contributors have made significant changes to the library over the last release:
@standardAI
- [IP-Adapter] Fix IP-Adapter Support and Refactor Callback for StableDiffusionPanoramaPipeline (#7262)
- [Docs] Fix typos (#7451)
- ConsistencyDecoderVAE (#7290)
- diffusers-cli env (#7403)
- raise messages (#8272)

@UmerHA
- test_lora_layers_peft.py (#7394)
- test_lora_fuse_nan on mps (#7481)
- test_freeu_enabled on MPS (#7570)