* GRPOTrainer and SFTTrainer by @I-l-l-I in https://github.com/huggingface/trl/pull/3337
* TextEnvironment and tools by @lewtun in https://github.com/huggingface/trl/pull/3389
* formatting_func is used with completion_only_loss by @LeonEricsson in https://github.com/huggingface/trl/pull/3385
* setup.py to setup.cfg and make rich an optional dep by @qgallouedec in https://github.com/huggingface/trl/pull/3403
* trl env on xpu by @yao-matrix in https://github.com/huggingface/trl/pull/3438
* base_url parameter for vLLM client initialization by @re-imagined in https://github.com/huggingface/trl/pull/3324
* keep_end leading to zero'd out samples by @LeonEricsson in https://github.com/huggingface/trl/pull/3398

Full Changelog: https://github.com/huggingface/trl/compare/v0.17.0...v0.18.0
Fetched April 7, 2026