v0.29.1

What's Changed

Handle mm_token_type_ids in SFT/GRPO/RLOO to fix IndexError by @albertvillanova in https://github.com/huggingface/trl/pull/5178
Fix prepare_multimodal_messages to support tool_calls and tool role by @alvarobartt in https://github.com/huggingface/trl/pull/5212
Fix type for model_init_kwargs when passed as CLI JSON string by @albertvillanova in https://github.com/huggingface/trl/pull/5230
Decouple rollout dispatch from vLLM backend in GRPO _generate_single_turn by @albertvillanova in https://github.com/huggingface/trl/pull/5122
Simplify logic for structured outputs across vLLM versions by @albertvillanova in https://github.com/huggingface/trl/pull/5215
Add support for raw ids in prompts in vLLM client and server by @qgallouedec in https://github.com/huggingface/trl/pull/5225
Add VLM support when passing raw token IDs to vLLM client by @qgallouedec in https://github.com/huggingface/trl/pull/5227
Move rollout_func from _generate_single_turn to _generate by @qgallouedec in https://github.com/huggingface/trl/pull/5232
[GRPO/RLOO] Tokenize before vLLM generation call by @qgallouedec in https://github.com/huggingface/trl/pull/5238
Support JSON string parsing of teacher_model_init_kwargs in MiniLLMConfig by @albertvillanova in https://github.com/huggingface/trl/pull/5259
[GRPO/RLOO] Unify tokenization across all generation backends in _generate_single_turn by @qgallouedec in https://github.com/huggingface/trl/pull/5239
[GRPO/RLOO] Extract tokenize prompts from _generate_single_turn by @qgallouedec in https://github.com/huggingface/trl/pull/5240
[CPO/ORPO] Fix handling of different length chosen/rejected prompts. by @davmels in https://github.com/huggingface/trl/pull/4639
Fix type for teacher_model_init_kwargs when passed as CLI JSON string by @albertvillanova in https://github.com/huggingface/trl/pull/5258
Fix support for model_init_kwargs in GKD/GOLD when passed as CLI JSON string by @albertvillanova in https://github.com/huggingface/trl/pull/5266
Fix mm_token_type_ids silently dropped in DPO VLM training by @albertvillanova in https://github.com/huggingface/trl/pull/5279
Fix support for model_init_kwargs in MiniLLM when passed as CLI JSON string by @albertvillanova in https://github.com/huggingface/trl/pull/5274
Fix GRPOTrainer attribute access for vLLM model config by @falcondai in https://github.com/huggingface/trl/pull/5302
[GRPO] Fix re-tokenization bug in tool-calling loop by concatenating token IDs by @qgallouedec in https://github.com/huggingface/trl/pull/5242

New Contributors

@davmels made their first contribution in https://github.com/huggingface/trl/pull/4639
@falcondai made their first contribution in https://github.com/huggingface/trl/pull/5302

Full Changelog: https://github.com/huggingface/trl/compare/v0.29.0...v0.29.1

What's Changed

New Contributors

More from Hugging Face

From other products

From other products

More from Hugging Face