
v0.6.0

August 25, 2023 (TRL)

DDPO for diffusion models

We are excited to welcome DDPO, the first algorithm in TRL that applies RLHF to diffusion models in order to refine their generations. Read more about it in the docs.
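To give a rough sense of the kind of objective DDPO optimizes, here is a minimal sketch of a PPO-style clipped importance-sampling loss applied to per-step log-probabilities of a denoising trajectory. It is an illustration under stated assumptions, not TRL's implementation: the function names `normalize_rewards` and `ddpo_clipped_loss`, the default `clip_range`, and the toy numbers are all hypothetical, and a real run would obtain the log-probabilities from a diffusion pipeline and the rewards from a reward model.

```python
import math


def normalize_rewards(rewards, eps=1e-8):
    """Turn raw image rewards into zero-mean, unit-variance advantages."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def ddpo_clipped_loss(log_probs, old_log_probs, advantages, clip_range=0.2):
    """Mean clipped surrogate loss over a batch of denoising steps.

    log_probs:     log-probability of each sampled step under the current model
    old_log_probs: log-probability of the same step under the sampling model
    advantages:    normalized reward of the final image each step belongs to
    """
    losses = []
    for lp, old_lp, adv in zip(log_probs, old_log_probs, advantages):
        # Importance ratio between the current and the sampling policy.
        ratio = math.exp(lp - old_lp)
        clipped_ratio = min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
        # Take the more pessimistic of the two surrogate losses.
        unclipped = -adv * ratio
        clipped = -adv * clipped_ratio
        losses.append(max(unclipped, clipped))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical toy batch: three denoising steps, one per sampled image.
    advantages = normalize_rewards([0.1, 0.5, 0.9])
    loss = ddpo_clipped_loss([-1.0, -0.8, -1.2], [-1.1, -0.9, -1.0], advantages)
    print(f"loss = {loss:.4f}")
```

Clipping the ratio keeps a single update from moving the model too far from the one that generated the samples, which is the usual motivation for this form of objective.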

| Before | After DDPO finetuning |
| --- | --- |
| <div style="text-align: center"><img src="https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/pre_squirrel.png"/></div> | <div style="text-align: center"><img src="https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/post_squirrel.png"/></div> |
| <div style="text-align: center"><img src="https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/pre_starfish.png"/></div> | <div style="text-align: center"><img src="https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/post_starfish.png"/></div> |

Bug fixes and other enhancements

The release also comes with multiple bug fixes reported and/or led by the community. Check out the commit history below.

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/trl/compare/v0.5.0...v0.6.0

Fetched April 7, 2026