v0.7.8: Unsloth tag, DPO fixes, PEFT support for DDPO

January 9, 2024 · TRL

Unsloth tag for xxxTrainer

When a model is trained with the Unsloth library, the unsloth tag is now added automatically when the model is pushed to the Hub.
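The tagging behaviour can be pictured with a minimal sketch. This is not TRL's implementation: the helper name collect_hub_tags is hypothetical, and the detection signal (a non-None unsloth_version attribute on the model config) is an assumption made for illustration only.

```python
from types import SimpleNamespace


def collect_hub_tags(model, base_tags):
    """Return the tags to attach on push, adding "unsloth" when detected.

    Hypothetical helper, not part of TRL. Assumes an Unsloth-patched model
    exposes a non-None `unsloth_version` attribute on its config.
    """
    tags = list(base_tags)  # copy so the caller's list is not mutated
    config = getattr(model, "config", None)
    detected = getattr(config, "unsloth_version", None) is not None
    if detected and "unsloth" not in tags:
        tags.append("unsloth")
    return tags


# Stand-in objects for illustration; real models would come from a training run.
patched = SimpleNamespace(config=SimpleNamespace(unsloth_version="2024.1"))
plain = SimpleNamespace(config=SimpleNamespace())

print(collect_hub_tags(patched, ["trl", "sft"]))  # ['trl', 'sft', 'unsloth']
print(collect_hub_tags(plain, ["trl", "sft"]))    # ['trl', 'sft']
```

The point of the sketch is that the tag is derived from the model itself, so users do not need to pass anything extra when pushing.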

DPO fixes

Several important DPO fixes have been introduced to address the issue reported at https://twitter.com/jon_durbin/status/1743575483365699809 and to make DPO faster.

DDPO + PEFT

DDPO now supports PEFT.

Other fixes

New Contributors

Full Changelog: https://github.com/huggingface/trl/compare/v0.7.7...v0.7.8
