v0.8.2: ORPO & CPO Trainer / Vision LLMs support for `SFTTrainer`, KTO fixes
This release includes two new trainers: ORPO (from KAIST) and CPO.
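As a rough illustration of how the new trainers might be wired up, the sketch below builds pairwise-preference records and passes them to `ORPOTrainer`. It assumes the `prompt` / `chosen` / `rejected` dataset layout and the `tokenizer=` keyword used by TRL around this release. The helper names (`make_preference_record`, `build_orpo_trainer`), the output directory, and the hyperparameter values are hypothetical placeholders, not recommendations from the release.

```python
from typing import Dict, List


def make_preference_record(prompt: str, chosen: str, rejected: str) -> Dict[str, str]:
    """Build one pairwise-preference example (hypothetical helper, not part of TRL)."""
    if chosen == rejected:
        # A pair with identical completions carries no preference signal.
        raise ValueError("chosen and rejected completions must differ")
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


def build_orpo_trainer(model, tokenizer, records: List[Dict[str, str]]):
    """Wire the records into an ORPOTrainer (hypothetical helper).

    Imports are done lazily so the record helper above can be used
    without `trl` or `datasets` installed.
    """
    from datasets import Dataset
    from trl import ORPOConfig, ORPOTrainer

    # Placeholder values; tune them for a real run.
    args = ORPOConfig(
        output_dir="orpo-sketch",
        beta=0.1,
        max_length=512,
        max_prompt_length=128,
    )
    return ORPOTrainer(
        model=model,
        args=args,
        train_dataset=Dataset.from_list(records),
        tokenizer=tokenizer,
    )


records = [
    make_preference_record(
        prompt="What is the capital of France?",
        chosen="Paris",
        rejected="Lyon",
    )
]
```

With a causal LM and its tokenizer loaded, `build_orpo_trainer(model, tokenizer, records).train()` would start training. `CPOTrainer` with `CPOConfig` is expected to follow the same pattern, though that should be checked against the TRL documentation for the installed version.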
The release also adds support for Vision LLMs such as Llava in `SFTTrainer`; see https://github.com/huggingface/trl/blob/main/examples/scripts/vsft_llava.py for more details.
`use_cache=False` in `{ORPO,CPO}Trainer.concatenated_forward` by @alvarobartt in https://github.com/huggingface/trl/pull/1478
`input_ids` instead by @alvarobartt in https://github.com/huggingface/trl/pull/1516

You can now use `SFTTrainer` to fine-tune VLLMs such as Llava!
See: https://github.com/huggingface/trl/blob/main/examples/scripts/vsft_llava.py for more details
Many fixes were introduced for the KTOTrainer:
`RichProgressCallback` by @eggry in https://github.com/huggingface/trl/pull/1496

Full Changelog: https://github.com/huggingface/trl/compare/v0.8.1...v0.8.2