- token_type_ids in DPOTrainer by @aweers in https://github.com/huggingface/trl/pull/4285
- RichProgressCallback enhancement by @qgallouedec in https://github.com/huggingface/trl/pull/4245
- chat_template_kwargs in apply_chat_template by @cmpatino in https://github.com/huggingface/trl/pull/4233
- token_type_ids in DataCollatorForVisionLanguageModeling by @qgallouedec in https://github.com/huggingface/trl/pull/4190
- RewardTrainer refactor by @qgallouedec in https://github.com/huggingface/trl/pull/4093
- clone_chat_template by @qgallouedec in https://github.com/huggingface/trl/pull/4097
- vllm_enable_sleep_mode to RLOO Trainer by @sergiopaniego in https://github.com/huggingface/trl/pull/4107
- batchmean reduce op in GKDTrainer's loss by @cmpatino in https://github.com/huggingface/trl/pull/4105
- get_high_entropy_mask by @akakakakakaa in https://github.com/huggingface/trl/pull/4041
- DPOConfig.padding_value in favour of pad_token_id by @qgallouedec in https://github.com/huggingface/trl/pull/4006
- BestOfNSampler by @qgallouedec in https://github.com/huggingface/trl/pull/4291
- AlignPropTrainer, DDPOTrainer and IterativeSFTTrainer by @qgallouedec in https://github.com/huggingface/trl/pull/4068
- trl.experimental Submodule by @August-murr in https://github.com/huggingface/trl/pull/4073
- make_parser function in multiple scripts by @qgallouedec in https://github.com/huggingface/trl/pull/4050
- set to list of tags by @qgallouedec in https://github.com/huggingface/trl/pull/4092
- image_split_sizes in favour of image_grid_thw by @qgallouedec in https://github.com/huggingface/trl/pull/4111
- backend parameter from GuidedDecodingParams by @qgallouedec in https://github.com/huggingface/trl/pull/4123
- max_batch_tokens, num_blocks and block_size from generation kwargs by @qgallouedec in https://github.com/huggingface/trl/pull/4065
- _generate by @qgallouedec in https://github.com/huggingface/trl/pull/4114
- image_split_sizes in favour of image_grid_thw by @qgallouedec in https://github.com/huggingface/trl/pull/4156
- _generate for GRPO with replay buffer by @qgallouedec in https://github.com/huggingface/trl/pull/4158
- <Tip> with new markdown syntax by @qgallouedec in https://github.com/huggingface/trl/pull/4161
- require_bitsandbytes by @qgallouedec in https://github.com/huggingface/trl/pull/4137
- _generate in GRPO/RLOO: list of ints instead of tensors by @qgallouedec in https://github.com/huggingface/trl/pull/4146
- trainer.tokenizer by trainer.processing_class by @qgallouedec in https://github.com/huggingface/trl/pull/4185
- sft example script by @sergiopaniego in https://github.com/huggingface/trl/pull/4197
- Optional from processing_class in PPOTrainer by @sergiopaniego in https://github.com/huggingface/trl/pull/4212
- sft_video_llm example by @qgallouedec in https://github.com/huggingface/trl/pull/4214
- trl-internal-testing/tiny-DbrxForCausalLM by @qgallouedec in https://github.com/huggingface/trl/pull/4213
- _generate in GRPO/RLOO: Use prompt_ids from generation by @qgallouedec in https://github.com/huggingface/trl/pull/4152
- _generate in GRPO/RLOO: Rely on generator for prompt truncation by @qgallouedec in https://github.com/huggingface/trl/pull/4153

Full Changelog: https://github.com/huggingface/trl/compare/v0.23.0...v0.24.0
Fetched April 7, 2026