
v0.9.6

v0.9.6 release


We are excited to introduce the v0.9.6 release, which brings many new features and algorithms. The highlights are as follows:

  • Support for SimPO by @fe1ixxu, a reference-free method that also regularizes output length. To use this loss, set loss_type="simpo" and cpo_alpha=0 in the CPOConfig and pass it to the CPOTrainer.

    <img width="880" alt="image" src="https://github.com/huggingface/trl/assets/5555347/87551147-3f58-4c6a-9a78-70b513dea76e">
  • Added AlignProp by @mihirp1998, a method for finetuning Stable Diffusion models using reward gradients.

  • Added Efficient Exact Optimization (EXO) by @haozheji.
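
To make the "reference-free" and "regularizes output length" properties of SimPO concrete, here is a minimal pure-Python sketch of the SimPO objective for a single preference pair. It is an illustration only, not the code used by CPOTrainer: the function name simpo_loss and the default values for beta and gamma are assumptions chosen for the example.

```python
import math


def simpo_loss(chosen_logps, rejected_logps, beta=2.0, gamma=1.0):
    """Illustrative SimPO loss for one (chosen, rejected) pair.

    chosen_logps / rejected_logps hold the per-token log-probabilities that
    the policy being trained assigns to each response. No reference model is
    involved, which is what "reference-free" means here.
    beta and gamma defaults are illustrative assumptions, not library defaults.
    """
    # Length-normalized (average) log-probability acts as the implicit reward,
    # so a response cannot score higher or lower just by being longer.
    chosen_avg = sum(chosen_logps) / len(chosen_logps)
    rejected_avg = sum(rejected_logps) / len(rejected_logps)

    # Scaled reward difference minus the target margin gamma.
    margin = beta * (chosen_avg - rejected_avg) - gamma

    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Example: the chosen response has a higher average log-probability.
loss = simpo_loss([-1.0, -1.0], [-1.5, -1.5, -1.5])
print(f"SimPO loss: {loss:.4f}")
```

Because each response's log-probability is averaged over its own token count, the rejected response above would yield the same loss whether it had three tokens or thirty, as long as its per-token average stayed at -1.5.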

We also included many important fixes and improvements, such as fixing prints in the CLI with GCP containers by @alvarobartt. Enjoy the release!

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/trl/compare/v0.9.4...v0.9.6

Fetched April 7, 2026