v0.8.0: KTOTrainer, TRL CLIs, QLoRA + FSDP!

March 19, 2024 · TRL
New Trainer: KTOTrainer:

We recently introduced the KTOTrainer, which lets you run KTO (Kahneman-Tversky Optimization) on LLMs!
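KTO learns from unpaired feedback: each example is a single completion marked as desirable or undesirable, rather than a chosen/rejected pair. The sketch below converts paired preference data into that unpaired layout. The column names `prompt`, `completion` and `label` are an assumption based on the TRL documentation, and `pairs_to_kto` is a hypothetical helper, not part of TRL.

```python
from typing import Dict, List


def pairs_to_kto(pairs: List[Dict[str, str]]) -> Dict[str, list]:
    """Split paired preference rows into unpaired KTO-style rows.

    Each input row has "prompt", "chosen" and "rejected" keys. Each output
    row is one (prompt, completion, label) triple, where label is True for
    a desirable completion and False for an undesirable one.
    The output column names are an assumption, see the note above.
    """
    out: Dict[str, list] = {"prompt": [], "completion": [], "label": []}
    for row in pairs:
        # The chosen answer becomes a desirable example.
        out["prompt"].append(row["prompt"])
        out["completion"].append(row["chosen"])
        out["label"].append(True)
        # The rejected answer becomes an undesirable example.
        out["prompt"].append(row["prompt"])
        out["completion"].append(row["rejected"])
        out["label"].append(False)
    return out


paired = [
    {
        "prompt": "What is the capital of France?",
        "chosen": "Paris.",
        "rejected": "I don't know.",
    },
]
kto_data = pairs_to_kto(paired)
print(kto_data["label"])
```

The resulting dictionary can be wrapped with `datasets.Dataset.from_dict(kto_data)` before it is handed to a trainer. Check the KTOTrainer documentation for the exact format your TRL version expects.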

TRL Command Line Interfaces (CLIs):

Run SFT, DPO and chat with your aligned model directly from the terminal:

SFT:

trl sft --model_name_or_path facebook/opt-125m --dataset_name imdb --output_dir opt-sft-imdb

DPO:

trl dpo --model_name_or_path facebook/opt-125m --dataset_name trl-internal-testing/Anthropic-hh-rlhf-processed --output_dir opt-sft-hh-rlhf 

Chat:

trl chat --model_name_or_path Qwen/Qwen1.5-0.5B-Chat

Read more about the CLIs in the relevant documentation section, or run any command with --help for more details.
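If you want to launch these commands from a script instead of typing them, one option is to assemble the argument list in Python. This is a minimal sketch: `build_trl_command` and `run_if_available` are hypothetical helpers, and the only flags used are the ones shown in the SFT command above.

```python
import shlex
import shutil
import subprocess
from typing import Dict, List


def build_trl_command(subcommand: str, options: Dict[str, str]) -> List[str]:
    """Build the argv list for `trl <subcommand> --key value ...`."""
    cmd = ["trl", subcommand]
    for key, value in options.items():
        cmd += [f"--{key}", str(value)]
    return cmd


def run_if_available(cmd: List[str]) -> int:
    """Run the command if the `trl` executable is on PATH; return its exit code."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("the `trl` executable was not found on PATH")
    return subprocess.run(cmd, check=False).returncode


sft_cmd = build_trl_command(
    "sft",
    {
        "model_name_or_path": "facebook/opt-125m",
        "dataset_name": "imdb",
        "output_dir": "opt-sft-imdb",
    },
)
# Print the shell-equivalent command for inspection; nothing is executed here.
print(shlex.join(sft_cmd))
```

Calling `run_if_available(sft_cmd)` would start the actual training run, so the sketch leaves that step to the caller.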

FSDP + QLoRA:

SFTTrainer now supports training with FSDP and QLoRA combined.
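The release notes do not spell out the configuration, so the following is only a sketch of the quantization settings commonly paired with FSDP, based on the transformers and PEFT documentation rather than on this release. The assumption is that the 4-bit weights are stored in the same dtype used for compute (`bnb_4bit_quant_storage`), so that FSDP can shard them alongside the other parameters. `QLORA_FSDP_KWARGS` and `build_quantization_config` are hypothetical names.

```python
# Assumed QLoRA settings for use under FSDP (not taken from the release notes).
# The storage dtype is set equal to the compute dtype so the quantized weights
# share a dtype with the rest of the model parameters.
QLORA_FSDP_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",
    "bnb_4bit_quant_storage": "bfloat16",
}


def build_quantization_config():
    """Create a BitsAndBytesConfig from the settings above.

    The import is deferred so this module can be loaded and inspected
    on a machine without transformers installed.
    """
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**QLORA_FSDP_KWARGS)
```

The returned object would typically be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained` before the model is given to SFTTrainer. The FSDP side is configured separately, for example through an accelerate config file.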

Other fixes

New Contributors

Full Changelog: https://github.com/huggingface/trl/compare/v0.7.11...v0.8.0