
v0.7.5

v0.7.5: IPO & KTO & cDPO loss, `DPOTrainer` enhancements, automatic tags for `xxxTrainer`

TRL, December 22, 2023


Important enhancements for DPOTrainer

This release adds several features to DPOTrainer in TRL:

  • IPO loss, for better generalization of the DPO algorithm
  • KTO and cDPO losses
  • Support for passing pre-computed logits to DPOTrainer
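To make the difference between these losses concrete, here is a minimal sketch written from the published DPO, cDPO, and IPO formulas. It is not TRL's implementation: the function `preference_loss` and its arguments are hypothetical, the names `loss_type` and `label_smoothing` are illustrative assumptions, and KTO is left out because it also depends on batch-level statistics.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    # Branch on the sign so exp() never receives a large positive argument.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def preference_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
    loss_type: str = "sigmoid",    # hypothetical switch: "sigmoid" or "ipo"
    label_smoothing: float = 0.0,  # > 0 turns the sigmoid loss into cDPO
) -> float:
    """Per-example preference loss for one (chosen, rejected) pair."""
    # How much more the policy prefers the chosen answer than the
    # reference model does, measured in log-probability.
    h = (policy_chosen_logp - policy_rejected_logp) - (
        ref_chosen_logp - ref_rejected_logp
    )

    if loss_type == "sigmoid":
        # DPO when label_smoothing == 0. cDPO assumes a fraction
        # `label_smoothing` of the preference labels are flipped.
        return (
            -(1.0 - label_smoothing) * log_sigmoid(beta * h)
            - label_smoothing * log_sigmoid(-beta * h)
        )
    if loss_type == "ipo":
        # IPO regresses the margin toward 1 / (2 * beta) instead of
        # pushing it to grow without bound.
        return (h - 1.0 / (2.0 * beta)) ** 2
    raise ValueError(f"unsupported loss_type: {loss_type!r}")
```

With the sigmoid loss, a larger margin `h` always lowers the loss, whereas the IPO loss is minimized at a finite margin of `1 / (2 * beta)`. A positive `label_smoothing` penalizes very large margins as well, which is the idea behind cDPO.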

Automatic xxxTrainer tagging on the Hub

TRL trainers now automatically add the corresponding tag (trl-sft, trl-dpo, or trl-ddpo) when pushing models to the Hub.

unsloth 🤝 TRL

We encourage users to try the unsloth library for faster LLM fine-tuning with PEFT and TRL's SFTTrainer and DPOTrainer.

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/trl/compare/v0.7.4...v0.7.5

Fetched April 7, 2026