v0.8.0: KTOTrainer, TRL CLIs, QLoRA + FSDP!
We recently introduced the KTOTrainer, which runs the KTO (Kahneman-Tversky Optimization) algorithm on LLMs!
Run SFT, DPO and chat with your aligned model directly from the terminal:
SFT:
trl sft --model_name_or_path facebook/opt-125m --dataset_name imdb --output_dir opt-sft-imdb
DPO:
trl dpo --model_name_or_path facebook/opt-125m --dataset_name trl-internal-testing/Anthropic-hh-rlhf-processed --output_dir opt-sft-hh-rlhf
Chat:
trl chat --model_name_or_path Qwen/Qwen1.5-0.5B-Chat
Read more about the CLIs in the relevant documentation section, or pass --help to any command for more details.
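The SFT and DPO commands above can also be collected into a small script, which helps when the same base model is reused across runs. The following is only a sketch: `MODEL`, `DRY_RUN`, and `run_cmd` are hypothetical names introduced here and are not part of TRL. By default the script only prints the commands, so it can be checked without TRL installed.

```shell
#!/usr/bin/env bash
# Sketch: wrap the SFT and DPO commands from these notes in one script.
# MODEL, DRY_RUN and run_cmd are hypothetical helpers, not part of TRL.
set -euo pipefail

MODEL="facebook/opt-125m"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it as given.
run_cmd() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Same flags and values as the SFT and DPO commands shown above.
run_cmd trl sft --model_name_or_path "$MODEL" --dataset_name imdb --output_dir opt-sft-imdb
run_cmd trl dpo --model_name_or_path "$MODEL" --dataset_name trl-internal-testing/Anthropic-hh-rlhf-processed --output_dir opt-sft-hh-rlhf
```

Running the script with `DRY_RUN=0` would execute the commands instead of printing them. That assumes the `trl` command is available on the PATH.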
model --> model_name_or_path by @lvwerra in https://github.com/huggingface/trl/pull/1452
SFTTrainer now supports FSDP + QLoRA
[SFTTrainer] Add eval_packing by @younesbelkada in https://github.com/huggingface/trl/pull/1369
force_use_ref_model for power users by @younesbelkada in https://github.com/huggingface/trl/pull/1367
[RewardModeling] Fix RM script for PEFT by @younesbelkada in https://github.com/huggingface/trl/pull/1393
Full Changelog: https://github.com/huggingface/trl/compare/v0.7.11...v0.8.0