releases.shpreview

v0.29.0

February 25, 2026TRLView original ↗
$npx -y @buildinternet/releases show rel_sd0_rtFVZKLv1UVYmn474

Features

Add environment_factory to GRPOTrainer

GRPOTrainer now accepts an environment_factory argument, allowing users to specify a custom environment class for training. This enables more flexible and diverse training scenarios by letting users define their own environments with specific dynamics and reward structures.

from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

dataset = Dataset.from_dict({
    "prompt": [[{"role": "user", "content": f"Increment the counter by {i}."}] for i in range(1, 7)]
})

def reward_func(environments, **kwargs):
    return [env.counter for env in environments]

class IncrementEnv:
    def reset(self):
        self.counter = 0

    def increment(self, step: int) -> int:
        """
        Increment the internal counter.

        Args:
            step: Value to add to the counter.

        Returns:
            The updated counter value.
        """
        self.counter += step
        return self.counter

trainer = GRPOTrainer(
    model="Qwen/Qwen3-0.6B",
    args=GRPOConfig(chat_template_kwargs={"enable_thinking": False}),
    train_dataset=dataset,
    reward_funcs=reward_func,
    environment_factory=IncrementEnv,
)
trainer.train()

by @qgallouedec in https://github.com/huggingface/trl/pull/5093

Skills

TRL introduces agent-native CLI Integration: trl-training, a first-class Agent Skill that exposes TRL’s training workflows (SFT, DPO, GRPO, etc.) in a structured, agent-readable format. The skill is packaged directly with the trl library and can be installed via the CLI:

# Install into the project's agent directory (default scope=project), by agent name: claude, codex, opencode
trl skills install trl-training --target <agent>

This enables AI agents to safely and reproducibly execute TRL training workflows using a well-defined interface.

Skills can be installed at the project or global scope, and support explicit targets and overwrite controls.

Other

Fixes

Documentation and Examples

Deprecations

CI Improvements

Miscellaneous

Refactor CLI

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/trl/compare/v0.28.0...v0.29.0

Fetched April 7, 2026