A new loss_type="chunked_nll" option drastically reduces peak activation memory in SFT by avoiding the full [batch × seq × vocab] logits tensor. Ignored-label tokens are dropped before the lm_head matmul, and the cross-entropy is computed over the remaining tokens in checkpointed chunks (default chunk_size=256, the sweet spot consistent across model sizes and sequence lengths).
from trl import SFTConfig, SFTTrainer
trainer = SFTTrainer(
model="Qwen/Qwen3-4B",
args=SFTConfig(loss_type="chunked_nll"),
train_dataset=dataset,
)
trainer.train()
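For intuition, here is a minimal sketch of the idea (not TRL's actual implementation): drop the ignored-label positions before the lm_head matmul, then compute the cross-entropy over the surviving tokens in activation-checkpointed chunks, so the full [tokens x vocab] logits tensor is never materialized.

import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint

def chunked_nll(hidden_states, labels, lm_head_weight, chunk_size=256):
    # hidden_states: [batch, seq, hidden]; labels: [batch, seq] with -100 for ignored positions
    hidden = hidden_states.flatten(0, 1)
    labels = labels.flatten()
    keep = labels != -100
    hidden, labels = hidden[keep], labels[keep]  # drop ignored tokens before the big matmul

    def chunk_loss(h, y):
        logits = h @ lm_head_weight.T  # [chunk_size, vocab], never the full sequence
        return F.cross_entropy(logits, y, reduction="sum")

    total = hidden.new_zeros(())
    for start in range(0, hidden.shape[0], chunk_size):
        h, y = hidden[start:start + chunk_size], labels[start:start + chunk_size]
        total = total + checkpoint(chunk_loss, h, y, use_reentrant=False)
    return total / max(labels.numel(), 1)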
Peak GPU memory, AdamW fp32:
| Model | Hardware | Seq | nll | chunked_nll |
|---|---|---|---|---|
| Qwen3-1.7B + LoRA | 1×H100 80GB | 2048 | 47.9 GB | 12.3 GB (3.9× less) |
| Qwen3-4B | 1×H100 80GB | 16384 | OOM | 63.8 GB |
| Qwen3-14B | 8×H100 FSDP2 | 16384 | 58.9 GB | 38.9 GB (1.5× less) |
| Qwen3-32B | 8×H100 FSDP2 | 8192 | OOM | 71.2 GB |
End-to-end, chunked NLL is consistently as fast or faster than nll — and it unlocks sequence lengths that don't fit at all under the standard path.
The chunked path also supports VLMs (https://github.com/huggingface/trl/pull/5684).
by @qgallouedec in https://github.com/huggingface/trl/pull/5575, https://github.com/huggingface/trl/pull/5676 and https://github.com/huggingface/trl/pull/5684
A new trl.experimental.openreward adapter plugs any environment speaking the Open Reward Standard (ORS) protocol into any TRL trainer accepting an environment_factory (GRPOTrainer, AsyncGRPOTrainer). One identifier wires all three trainer slots — dataset, factory, reward_func:
from trl import GRPOConfig, GRPOTrainer
from trl.experimental.openreward import OpenRewardEnv
env = OpenRewardEnv("Eigent/SETA") # or "http://localhost:8000"
trainer = GRPOTrainer(
model="Qwen/Qwen3-4B",
args=GRPOConfig(...),
train_dataset=env.dataset,
environment_factory=env.factory,
reward_funcs=env.reward_func,
)
Tools are bound dynamically from JSON Schema at construction (no per-env wrapper code), and env.dataset autoderives task lists from the ORS task endpoints. The same code path works for envs hosted on the OpenReward platform, self-hosted on any container service, or running locally on localhost. A SETA training example is included.
by @adithya-s-k in https://github.com/huggingface/trl/pull/5696
Unit tests don't catch trainer-level numerical drift (gradient-accumulation normalization bugs, attention-impl divergence between eager, FA2, and kernels): such bugs silently shift the loss trajectory, and users only notice when their run no longer reproduces. (Cf. last year's transformers grad-accum bug, or the "We found two bugs in DeepSpeed" paper.)
A new opt-in pytest -m invariant suite asserts the loss / grad_norm trajectory of short end-to-end SFT/DPO runs against committed reference snapshots, with equivalence classes for configs that should produce identical trajectories (e.g. pdb=1, gas=8 ≡ default; eager ≡ FA2 ≡ kernels). Hardware-pinned to H100 80GB, real pretrained model, full_determinism, fixed seed. Initial coverage: 2 trainers × 2 invariance axes (grad-accum, attn-impl) × gradient-checkpointing equivalence.
by @qgallouedec in https://github.com/huggingface/trl/pull/5686, https://github.com/huggingface/trl/pull/5688 and https://github.com/huggingface/trl/pull/5689
Three new pure helpers in trl.trainer.utils for measuring training efficiency:
- compute_flops_per_token(config, seq_len) — handles dense and MoE (Mixtral, Qwen3-MoE, DeepSeek-V2)
- compute_mfu(flops_per_token, tps, world_size, peak_flops) — Model FLOPs Utilization as a percentage
- adjusted_mfu(mfu, config, seq_len) — non-causal → causal-corrected (Llama / DS Ulysses convention)

by @AmineDiro in https://github.com/huggingface/trl/pull/5698
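A usage sketch based on the signatures listed above (the import path follows the release note; the peak-FLOPs value and tokens/sec figure are assumptions, with 989e12 being the H100 bf16 dense peak):

from transformers import AutoConfig
from trl.trainer.utils import compute_flops_per_token, compute_mfu, adjusted_mfu

config = AutoConfig.from_pretrained("Qwen/Qwen3-4B")
flops_per_token = compute_flops_per_token(config, seq_len=4096)

# tokens/sec comes from your own training logs
mfu = compute_mfu(flops_per_token, tps=12_000, world_size=8, peak_flops=989e12)
print(f"MFU: {mfu:.1f}%")
print(f"adjusted MFU: {adjusted_mfu(mfu, config, seq_len=4096):.1f}%")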
GRPO's Liger-kernel integration is updated for Liger 0.8.0: delta two-sided clipping, use_bias_correction_kl, and SAPO/VESPO parameters are now forwarded into LigerFusedLinearGRPOLoss. The previous delta + use_liger_kernel guard is removed — both can be combined.
by @kashif in https://github.com/huggingface/trl/pull/5690
A new loss_type="sigmoid_norm" option for DPOConfig implements the per-token (length-normalized) DPO loss used by Tülu 3 / OLMo (paper §5.1.2 eq. 6) to mitigate length bias.
from trl import DPOConfig, DPOTrainer
trainer = DPOTrainer(
model="Qwen/Qwen3-4B",
args=DPOConfig(loss_type="sigmoid_norm"),
train_dataset=dataset,
)
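For reference, a minimal sketch of the length-normalized objective, assuming the usual Tülu 3 formulation (per-token-averaged log-ratios rather than sequence-summed ones); not TRL's exact implementation:

import torch.nn.functional as F

def sigmoid_norm_loss(pi_chosen_logps, pi_rejected_logps,
                      ref_chosen_logps, ref_rejected_logps,
                      chosen_len, rejected_len, beta=0.1):
    # length-normalize the log-ratios before the usual DPO sigmoid loss
    chosen_ratio = (pi_chosen_logps - ref_chosen_logps) / chosen_len
    rejected_ratio = (pi_rejected_logps - ref_rejected_logps) / rejected_len
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()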
by @BrownianNotion in https://github.com/huggingface/trl/pull/5406
Four more model families gain training-compatible chat templates with {% generation %} markers (assistant-only loss masking) and/or response schemas (tool-calling parsing):
- {% generation %} markers by @qgallouedec in https://github.com/huggingface/trl/pull/5675
- get_training_chat_template now also accepts a processor (not just a tokenizer) — useful for VLMs (https://github.com/huggingface/trl/pull/5560).
Another batch of alignment PRs this cycle. KTO and DPO are now structurally aligned across PEFT handling, model initialization, training-arg grouping, ref-logp precomputation, and metric handling — promotion of KTO out of experimental is imminent.
PRs (all by @albertvillanova): #5659, #5660, #5661, #5679, #5701, #5702, #5703, #5704, #5705, #5714.
- parallelism_config with cp_size>1 or sp_size>1 in GRPO/RLOO — fail fast at config init with a clear error instead of mid-training crash. By @kashif in https://github.com/huggingface/trl/pull/5699
- model_accepts_loss_kwargs=False in DPO and Reward by @albertvillanova in https://github.com/huggingface/trl/pull/5710
- _tokenizer attribute in experimental trainers by @albertvillanova in https://github.com/huggingface/trl/pull/5566
- peft_config handling in core / experimental trainers by @albertvillanova in https://github.com/huggingface/trl/pull/5673 and https://github.com/huggingface/trl/pull/5674
- isinstance with is_peft_model / drop redundant is_peft_available by @albertvillanova in https://github.com/huggingface/trl/pull/5682 and https://github.com/huggingface/trl/pull/5683
- parse_response by @qgallouedec in https://github.com/huggingface/trl/pull/5561
- OffloadActivations.__exit__ now syncs the compute/offload streams and clears the stash dictionaries, preventing orphaned offload tensors from leaking onto a dead stream (~0.2 GiB/step accumulation observed during QLoRA vision training before the fix). By @butterwecksolutions in https://github.com/huggingface/trl/pull/5694 and https://github.com/huggingface/trl/pull/5700
- DistillationTrainer by @k1064190 in https://github.com/huggingface/trl/pull/5594
- GKDTrainer: fix return_outputs in the Liger kernel path by @roycho96 in https://github.com/huggingface/trl/pull/4688
- GKDTrainer: fix seq-KD wasted teacher forward by @roycho96 in https://github.com/huggingface/trl/pull/5726
- GKDTrainer: fix Liger fused JSD path computing wrong loss by @roycho96 in https://github.com/huggingface/trl/pull/5731
- peft_config to core / experimental trainers by @albertvillanova in https://github.com/huggingface/trl/pull/5664 and https://github.com/huggingface/trl/pull/5665
- peft_config type hint in experimental trainers by @albertvillanova in https://github.com/huggingface/trl/pull/5666
- DistillationTrainer by @cmpatino in https://github.com/huggingface/trl/pull/5615
- Qwen3-4B-Instruct-2507 by @qgallouedec in https://github.com/huggingface/trl/pull/5586
- Qwen/Qwen3-30B-A3B by @qgallouedec in https://github.com/huggingface/trl/pull/5716
- DistillationTrainer by @cmpatino in https://github.com/huggingface/trl/pull/5615
- {% generation %} markers for Cohere2 chat template by @qgallouedec in https://github.com/huggingface/trl/pull/5675
- get_training_chat_template by @qgallouedec in https://github.com/huggingface/trl/pull/5560
- parse_response by @qgallouedec in https://github.com/huggingface/trl/pull/5561

Full Changelog: https://github.com/huggingface/trl/compare/v1.3.0...v1.4.0
Full Changelog: https://github.com/huggingface/pytorch-image-models/compare/v1.0.26...v1.0.27
You can now manage Space secrets and environment variables directly from the command line with two new hf spaces subgroups: secrets and variables. Use hf spaces secrets to add, list, and delete write-only secrets, and hf spaces variables to add, list, and delete readable environment variables. Both add commands support multiple -s/-e flags and --secrets-file/--env-file for loading from dotenv files. On the Python side, HfApi.get_space_secrets() returns secret metadata (key, description, updated timestamp) without ever revealing values.
# List secrets (values are write-only — only keys and timestamps are shown)
$ hf spaces secrets ls username/my-space
# Add secrets
$ hf spaces secrets add username/my-space -s OPENAI_API_KEY=sk-...
$ hf spaces secrets add username/my-space --secrets-file .env.secrets
# Delete a secret (confirmation prompt, use --yes to skip)
$ hf spaces secrets delete username/my-space OPENAI_API_KEY --yes
# List, add, and delete variables (values are readable)
$ hf spaces variables ls username/my-space
$ hf spaces variables add username/my-space -e MODEL_ID=gpt2 -e MAX_TOKENS=512
$ hf spaces variables delete username/my-space MAX_TOKENS --yes
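On the Python side, the same operations are available through HfApi. A sketch: get_space_secrets is the new metadata-only getter described above, while add_space_secret and delete_space_secret are existing HfApi methods.

from huggingface_hub import HfApi

api = HfApi()
api.add_space_secret("username/my-space", key="OPENAI_API_KEY", value="sk-...")
for secret in api.get_space_secrets("username/my-space"):
    print(secret)  # key, description, updated timestamp; values are never returned
api.delete_space_secret("username/my-space", key="OPENAI_API_KEY")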
📚 Documentation: CLI guide · Manage your Space
hf buckets cp now supports rsync-style trailing slash semantics when copying folders. A trailing / on the source path copies only the folder's contents to the destination, while omitting it nests the folder itself — matching the behavior you'd expect from rsync. This makes it possible to flatten directory structures during copies, which was not possible before. Additionally, copy_files now raises an explicit EntryNotFoundError when the source path resolves to no files, instead of silently succeeding with zero operations.
# Without trailing slash: "logs" dir is nested => dst/logs/...
$ hf buckets cp hf://buckets/username/src-bucket/logs hf://buckets/username/dst/
# With trailing slash: only contents of "logs" are copied => dst/...
$ hf buckets cp hf://buckets/username/src-bucket/logs/ hf://buckets/username/dst/
📚 Documentation: Buckets guide · CLI guide
- hf skills upgrade -> hf skills update by @hanouticelina in #4176 — hf skills upgrade no longer exists; use hf skills update instead.
- out.status() by @hanouticelina in #4171 — status updates (spinners/progress) on hf extensions install and hf spaces dev-mode are now suppressed when using --format json, --quiet, or --format agent.
- hf datasets leaderboard by @Wauplin in #4174
- hf update when already on latest version by @julien-c in #4177
- hf skills to bucket by @hanouticelina in #4175

DeepSeek-V4 is the next-generation MoE (Mixture of Experts) language model from DeepSeek that introduces several architectural innovations over DeepSeek-V3. The architecture replaces Multi-head Latent Attention (MLA) with a hybrid local + long-range attention design, swaps residual connections for Manifold-Constrained Hyper-Connections (mHC), and bootstraps the first few MoE layers with a static token-id → expert-id hash table. This implementation covers DeepSeek-V4-Flash, DeepSeek-V4-Pro, and their -Base pretrained variants, which share the same architecture but differ in width, depth, expert count and weights.
Links: Documentation | Paper
Gemma 4 Assistant is a small, text-only model that enables speculative decoding for Gemma 4 models using the Multi-Token Prediction (MTP) method and associated candidate generator. The model shares the same Gemma4TextModel backbone as other Gemma 4 models but uses KV sharing throughout the entire model, allowing it to reuse the KV cache populated by the target model and skip the pre-fill phase entirely. This architecture includes cross-attention to make the most of the target model's context, allowing the assistant to accurately predict more drafted tokens per drafting round.
Links: Documentation
Granite Speech Plus is a variant of Granite Speech that enhances the projector by consuming the concatenation of the encoder's final hidden states with an arbitrary subset of its intermediate hidden states along the feature dimension. It is a multimodal speech-to-text model that can transcribe audio, provide speaker annotation and word level timestamps by responding to text prompts. The model inherits the same architecture components as Granite Speech including the speech encoder, query transformer projector, language model, and optional LoRA adapter.
Links: Documentation
Granite Vision 4.1 is a vision-language model from IBM Research designed for enterprise-grade document data extraction. It specializes in chart extraction (Chart2CSV, Chart2Summary, Chart2Code), table extraction (JSON, HTML, OTSL), and semantic key-value pair extraction. The model builds on LLaVA-NeXT with architectural innovations including SigLIP2 Vision Encoder, Window Q-Former Projectors, and DeepStack Feature Injection with 8 vision-to-LLM injection points.
Links: Documentation
EXAONE 4.5 is the first open-weight vision language model developed by LG AI Research, integrating a dedicated visual encoder into the existing EXAONE 4.0 framework to expand multimodal capabilities. The model features 33 billion parameters in total, including 1.2 billion parameters from the vision encoder, and achieves competitive performance in general benchmarks while outperforming similar-sized models in document understanding and Korean contextual reasoning. It builds on EXAONE 4.0 with key enhancements including an expanded vocabulary of 153,600 tokens, support for up to 256K token context windows, and a Multi-Token Prediction (MTP) mechanism.
Links: Documentation | Paper | Blog Post
PP-FormulaNet-L and PP-FormulaNet_plus-L are lightweight models designed for formula recognition, focusing on accurately detecting and recognizing mathematical formulas in documents and natural scenes and converting them to structured text. The models come from Baidu's PaddleOCR family and can be used for image-to-text tasks.
Links: Documentation
Apex integration has been removed from the library (including RMSNorm usage in T5 and related models), so users relying on Apex for mixed precision or fused ops should migrate to PyTorch's native equivalents instead.
Fixed tokenizer mapping issues for DeepSeek R1 distilled (Qwen2) and DeepSeek OCR models, and resolved a significant performance regression in PreTrainedTokenizer.convert_ids_to_tokens where skip_special_tokens=True was rebuilding the special token set on every iteration, resulting in a ~300x speedup for that code path.
- concurrency to PR CI workflow file (pr-ci-caller.yml) (#45786) by @ydshieh in [#45786]
- text_config in AutoModelFor*.from_config (#45770) by @jamesbraza in [#45770]
- OAI Privacy Filter] Add integration test (#45725) by @vasqu in [#45725]

The following contributors have made significant changes to the library over the last release:
LLaDA2 is a family of discrete diffusion language models that generate text through block-wise iterative refinement. Instead of autoregressive token-by-token generation, LLaDA2 starts with a fully masked sequence and progressively unmasks tokens by confidence over multiple refinement steps.
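Conceptually, the refinement loop looks like the following illustrative sketch (not the LLaDA2 implementation; model is assumed to be a callable mapping token ids to per-position logits):

import torch

def iterative_unmask(model, mask_id, length, steps=8):
    tokens = torch.full((1, length), mask_id, dtype=torch.long)
    per_step = max(length // steps, 1)
    for _ in range(steps):
        logits = model(tokens)                      # [1, length, vocab]
        probs, preds = logits.softmax(-1).max(-1)   # per-position confidence and argmax
        conf = probs.masked_fill(tokens != mask_id, float("-inf"))
        idx = conf.topk(per_step, dim=-1).indices   # most confident still-masked positions
        tokens.scatter_(1, idx, preds.gather(1, idx))
    tokens[tokens == mask_id] = preds[tokens == mask_id]  # fill any leftovers on the final pass
    return tokens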
NucleusMoE-Image is a 17B-parameter model with 2B active parameters, trained with efficiency at its core. Our novel architecture highlights the scalability of a sparse MoE architecture for image generation.
Thanks to @sippycoder for the contribution.
ERNIE-Image is a powerful and highly efficient image generation model with 8B parameters.
Thanks to @HsiaWinter for the contribution.
LongCat-AudioDiT is a text-to-audio diffusion model from Meituan LongCat.
Thanks to @RuixiangMa for the contribution.
ACE-Step 1.5 generates variable-length stereo audio at 48 kHz (10 seconds to 10 minutes) from text prompts and optional lyrics. The full system pairs a Language Model planner with a Diffusion Transformer (DiT) synthesizer; this pipeline wraps the DiT half of that stack, and consists of three components: an AutoencoderOobleck VAE that compresses waveforms into 25 Hz stereo latents, a Qwen3-based text encoder for prompt and lyric conditioning, and an AceStepTransformer1DModel DiT that operates in the VAE latent space using flow matching.
Thanks to @ChuxiJ for the contribution.
Make your Flux.2 decoding faster with this new small decoder model from Black Forest Labs. You can check it out here. It was contributed by @huemin-art in this PR.
We added modular support for LTX-2 and Hunyuan 1.5.
- ring_anything as a new CP backend
- lru_cache warnings during torch.compile by @jiqing-feng in #13384
- --with_prior_preservation by @chenyangzhu1 in #13396
- 0.8.0-rc.0 by @McPatate in #13470
- trust_remote_code by @hlky in #13448

The following contributors have made significant changes to the library over the last release:
- trust_remote_code (#13448)

This release adds three new CLI capabilities for exploring Hub content. hf models card, hf datasets card, and hf spaces card fetch the README of any repo and print it to stdout, with --metadata (YAML frontmatter as JSON) and --text (prose only) flags for splitting the card into its structured and unstructured parts. Calling hf models ls <repo_id>, hf datasets ls <repo_id>, or hf spaces ls <repo_id> now switches from listing repos to listing files inside that repo, with --tree, -R, -h, and --revision options mirroring the existing hf buckets ls behavior. And hf datasets leaderboard <dataset_id> surfaces model scores submitted to a benchmark dataset, making it easy to compare models by score from the terminal.
# Get model card metadata as JSON
hf models card google/gemma-4-31B-it --metadata --format json
# List files in a model repo (tree view with sizes)
hf models ls meta-llama/Llama-3.2-1B-Instruct --tree -h
# Show top 5 models on SWE-bench
hf datasets leaderboard SWE-bench/SWE-bench_Verified --limit 5
📚 Documentation: CLI guide
- hf datasets leaderboard by @hanouticelina in #4154

Three new hf spaces subcommands bring full lifecycle control to the terminal. hf spaces pause and hf spaces restart stop or rebuild a Space (with --factory-reboot for a clean rebuild), and hf spaces settings lets you configure sleep time and hardware in one call. A companion hf spaces hardware command lists all available hardware flavors with pricing, so you can discover options before changing settings. Pause and restart include a confirmation prompt (-y to skip) since they tear down the running container.
# Pause a Space when not in use (not billed while paused)
hf spaces pause username/my-space
# Upgrade to a GPU and set the sleep time
hf spaces settings username/my-space --hardware t4-medium --sleep-time 3600
# List available hardware options
hf spaces hardware
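The same lifecycle operations are also available from Python via HfApi, if you'd rather script them (a sketch):

from huggingface_hub import HfApi, SpaceHardware

api = HfApi()
api.pause_space("username/my-space")
api.restart_space("username/my-space", factory_reboot=False)
api.request_space_hardware("username/my-space", hardware=SpaceHardware.T4_MEDIUM)
api.set_space_sleep_time("username/my-space", sleep_time=3600)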
📚 Documentation: CLI guide — Spaces
- hf spaces hardware command by @Wauplin in #4169
- --hardware flag to hf spaces settings by @davanstrien in #4163

hf update replaces the auto-update prompt

The blocking interactive Y/n auto-update prompt at CLI startup is gone. It was catching too many non-interactive contexts (CI runners, Homebrew post-install hooks, Jupyter notebooks) and hanging automation. In its place, a single yellow stderr warning suggests running hf update — a new command that detects how hf was installed (Homebrew, standalone installer, or pip) and runs the right upgrade command. Set HF_HUB_DISABLE_UPDATE_CHECK=1 to silence the startup check entirely, for example in offline CI.
hf update
📚 Documentation: CLI guide — Updating
- hf update + drop interactive update prompt by @Wauplin in #4131

The --format, --json, and -q / --quiet flags are now handled globally by the CLI framework instead of being declared individually on each command. This means every hf command automatically accepts them — no more per-command --format boilerplate, and the flags are properly documented in a dedicated "Formatting options" section in every --help page. --format auto (the default) picks human for interactive terminals and agent when invoked by an AI agent, making CLI output automatically suitable for both people and tools.
# JSON output for scripting
hf models ls --search bert --limit 2 --json | jq '.[].id'
# IDs only, one per line
hf collections ls --owner nvidia -q
📚 Documentation: CLI guide — Output formatting
hf:// URI parsing

A new parse_hf_uri function and HfUri dataclass provide a single source of truth for parsing hf://... strings across the library. Whether you reference a model, dataset, space, bucket, or file inside a repo, the parser handles all valid URI shapes — type prefixes, revisions, and paths — and rejects invalid ones with clear error messages. A companion parse_hf_mount / HfMount handles volume mount specifications (hf://...:/mnt:ro). Both are pure string parsers (no network calls) and round-trippable via .to_uri().
from huggingface_hub import parse_hf_uri, parse_hf_mount
parse_hf_uri("hf://datasets/namespace/my-dataset@refs/pr/3/train.json")
# HfUri(type='dataset', id='namespace/my-dataset', revision='refs/pr/3', path_in_repo='train.json')
parse_hf_mount("hf://buckets/my-org/my-bucket/sub/dir:/mnt:ro")
# HfMount(source=HfUri(type='bucket', id='my-org/my-bucket', ...), mount_path='/mnt', read_only=True)
📚 Documentation: HF URIs reference
Local scripts uploaded by hf jobs uv run are now stored in a {namespace}/jobs-artifacts bucket and mounted into the job container at /data instead of being base64-encoded into an environment variable. The old bash -c + xargs + base64 -d pipeline was fragile and required manual shell quoting. Bucket transport is simpler, easier to debug, and supports write-back: jobs can persist output artifacts to /data/ since the mount is read-write. The base64 transport path has been fully removed with no fallback.
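Since the mount is read-write, a job script can persist outputs simply by writing under /data. A small sketch of the write-back behavior described above (the paths below are illustrative):

from pathlib import Path

out_dir = Path("/data/outputs")
out_dir.mkdir(parents=True, exist_ok=True)
(out_dir / "metrics.json").write_text('{"loss": 0.123}')  # persisted to the jobs-artifacts bucket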
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.12.1...v1.12.2
Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing different decoder layers to have different query-head counts while sharing the same KV cache shape, and implements a sigmoid MoE router with auxiliary-loss-free load balancing that uses element-wise sigmoid of gate logits plus learned per-expert bias for router scoring.
Links: Documentation
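A conceptual sketch of the router scoring described above (illustrative only, not the released implementation): the learned per-expert bias only shifts which experts are selected, while the combine weights come from the unbiased sigmoid scores, which is how auxiliary-loss-free balancing schemes typically work.

import torch

def route(gate_logits, expert_bias, top_k=8):
    scores = torch.sigmoid(gate_logits)                       # [tokens, n_experts]
    _, topk_idx = (scores + expert_bias).topk(top_k, dim=-1)  # bias steers selection for load balance
    topk_scores = scores.gather(-1, topk_idx)                 # combine weights use the unbiased scores
    weights = topk_scores / topk_scores.sum(-1, keepdim=True)
    return topk_idx, weights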
DEIMv2 (DETR with Improved Matching v2) is a real-time object detection model that extends DEIM with DINOv3 features and spans eight model sizes from X to Atto for diverse deployment scenarios. It uses a Spatial Tuning Adapter (STA) for larger variants to convert DINOv3's single-scale output into multi-scale features, while ultra-lightweight models employ pruned HGNetv2 backbones. The unified design achieves superior performance-cost trade-offs, with DEIMv2-X reaching 57.8 AP with only 50.3M parameters and DEIMv2-S being the first sub-10M model to exceed 50 AP on COCO.
Links: Documentation | Paper
Several attention-related bugs were fixed across multiple models, including a cross-attention cache type error in T5Gemma2 for long inputs, incorrect cached forward behavior in Qwen3.5's gated-delta-net linear attention, and a crash in GraniteMoeHybrid when no Mamba layers are present. Attention function dispatch was also updated to align with the latest model implementations.
There was a bug in AutoTokenizer that caused the wrong tokenizer class to be initialized. This caused regressions in models like DeepSeek R1.
Continuous batching generation received several fixes and improvements, including correcting KV deduplication and memory estimation for long sequences (16K+), and removing misleading warnings about num_return_sequences and other unsupported features that were incorrectly firing even when functionality worked correctly. Documentation for per-request sampling parameters was also added.
Improved kernel support by fixing configuration reading and error handling for FP8 checkpoints (e.g., Qwen3.5-35B-A3B-FP8), enabling custom expert kernels registered from the HF Hub to be properly loaded, and resolving an incompatibility that prevented Gemma3n and Gemma4 from using the rotary kernel.
- x_clip: 8 failed test cases (#45394) by @kaixuanliu in [#45394]
- NameError: PeftConfigLike triggered by PreTrainedModel.__init_subclass__ (#45658) by @qgallouedec in [#45658]
- clean_up_tokenization for BPE tokenizers in PreTrainedTokenizerFast (#44915) by @maxsloef-goodfire in [#44915]
- supports_gradient_checkpointing to NemotronHPreTrainedModel (#45625) by @sergiopaniego in [#45625]
- problem_type="single_label_classification" with num_labels=1 (#45611) by @gaurav0107 in [#45611]
- AttributeError on s_aux=None in flash_attention_forward (#45589) by @jamesbraza in [#45589]

The following contributors have made significant changes to the library over the last release:
Full Changelog: https://github.com/huggingface/datasets/compare/4.8.4...4.8.5
tokenizers 0.23.1 is the first proper stable release in the 0.23 line — 0.23.0 only ever shipped as rc0 because the release pipeline itself was broken (Node side hadn't shipped multi-platform binaries since 2023, Python side was on pyo3 0.27 without free-threaded support). 0.23.1 is the version where everything actually goes out the door together: full Node multi-platform wheels for the first time in years, Python 3.14 (regular and free-threaded 3.14t), full type hints for every Python class, and a stack of measurable perf wins on the BPE / added-vocab hot paths.
There is no functional 0.23.0 published — we tag 0.23.1 directly so users don't accidentally pull a never-shipped version.
requires-python = ">=3.10"; 3.9 users stay on 0.22.x.add_tokens normalizes content at insertion (#1995) — re-saved tokenizer.json may differ in the added_tokens block. Existing files load unchanged.Any now return real types; mypy --strict may surface previously-hidden errors. Stub layout also moved from tokenizers/<sub>/__init__.pyi to tokenizers/<sub>.pyi. This breaks the surface of some of the processors like RobertaProcessign's __init__ .PyResult<T> because of Arc<RwLock<Tokenizer>>; a poisoned lock surfaces as PyException instead of a panic.Run with cargo bench --bench <name> -- --save-baseline v0_22_2 on v0.22.2, then --baseline v0_22_2 on v0.23.1. Numbers are point-in-time wall clock on a single laptop; relative deltas are what matters, absolute numbers will differ on CI hardware.
bench: improve added_vocab_deserialize to reflect real-world workloads (#2000) is now representative of how transformers actually loads tokenizer.json files. The combined effect of daachorse for the matching automaton plus the normalize-on-insert refactor is enormous on this workload:
| benchmark | v0.22.2 | v0.23.1 | change |
|---|---|---|---|
| 100k tokens, special, no norm | ~410 ms | 248 ms | −40% |
| 100k tokens, non-special, no norm | ~7.1 s | 273 ms | −96% |
| 100k tokens, special, NFKC | ~395 ms | 235 ms | −40% |
| 100k tokens, non-special, NFKC | ~7.4 s | 290 ms | −96% |
| 400k tokens, special, no norm | ~15 s | 980 ms | −94% |
Real-world impact: loading a Llama-3-style tokenizer with a large set of added tokens dropped from "noticeable pause" to "instant".
| benchmark | v0.22.2 | v0.23.1 | change |
|---|---|---|---|
| BPE GPT2 encode batch, no cache | 530 ms | 446 ms | −16% |
| BPE GPT2 encode batch (cached) | 690 ms | 685 ms | noise |
| BPE GPT2 encode (single) | 1.95 s | 1.94 s | noise |
| BPE Train (small) | 32.6 ms | 31.5 ms | −3% |
| BPE Train (big) | 1.01 s | 988 ms | −2% |
The BPE per-thread cache PR (#2028) shows much larger wins on highly-parallel workloads (+47–62% at 88+ threads on a server box, per the PR's own measurements on Vera). Single-thread batch numbers above are flat or slightly improved because cache-hit overhead was already low without contention.
| benchmark | v0.22.2 | v0.23.1 | change |
|---|---|---|---|
| llama3-encode (single) | 2.10 s | 2.02 s | −4% |
| llama3-batch | 438 ms | 408 ms | −7% |
| llama3-offsets | 410 ms | 395 ms | −4% |
Right-direction truncation no longer pre-tokenizes past max_length. The new truncation_benchmark doesn't exist on v0.22.2 so there's no apples-to-apples here, but the PR's own measurements on the same machine showed −20–28% across a range of max_length values for right-truncation; left-truncation unchanged.
- BPE::Builder::build no longer formats strings in a hot loop (#2010) — ~45% faster Tokenizer::from_file on Llama-3 in the PR's profile.

The tokenizer.json format is forward-compatible: existing files load on 0.23 unchanged. Two things to know if you re-save:
- added_tokens entries created via add_tokens(..., normalized=True) will have their content normalized at save time — see breaking-change note above.
- tokenizer.train(...) no longer keeps a redundant added_tokens/special_tokens Vec separate from the added_tokens_map_r. Public API surface unchanged; only the internal struct shape moved.

bench: improve added_vocab_deserialize to reflect real-world workloads (#2000) lands a more realistic micro-benchmark for this surface; if you're tracking deserialize perf in your own CI, the new bench is the one to compare against.
Dedicated wheels for python3.14t (the free-threaded build introduced in PEP 703). The wheel:
- declares Py_MOD_GIL_NOT_USED, so importing tokenizers does not force the GIL back on.
- is built without the abi3 cargo feature (free-threaded Python doesn't expose the limited API).
- wraps the inner state in an Arc<RwLock<Tokenizer>> so concurrent setters and encoders don't race PyO3's per-pyclass borrow check.

A new stress-test module tests/test_freethreaded.py exercises N-encoder × M-setter races on a single Tokenizer and asserts no RuntimeError: Already borrowed, no RwLock poisoning, and that sys._is_gil_enabled() is False post-import.
For the regular CPython wheel everything is unchanged.
The npm package now ships 13 platforms (macOS x64/arm64/universal, Windows x64/i686/arm64, Linux x64/arm64/armv7 in both glibc and musl, Android arm64/armv7) — previous workflows only built 3 of those, leaving Apple Silicon / Linux ARM / Alpine users with package-not-found errors since 2023 (#1365, #1703, #1922). Fixed via #1970 + #2034, which also bumps @napi-rs/cli to v3 and switches cross-builds to cargo-zigbuild.
Every class in the python bindings now ships proper .pyi stubs — Tokenizer, AddedToken, Encoding, every decoder / model / normalizer / pre-tokenizer / processor / trainer. Editors and type checkers (mypy, pyright, ty) see real signatures with types and docstrings instead of falling back to Any.
The stubs are generated automatically from the compiled extension via tools/stub-gen (Rust binary using pyo3-introspection). Re-running make style regenerates them; CI guards against regenerated-vs-checked-in drift. If the generator ever returns 0 docstrings (e.g. because the [patch.crates-io] pin in .cargo/config.toml falls out of sync with the pyo3 dep version), it now hard-aborts with a precise diagnostic instead of silently emitting bare-bones stubs.
>>> from tokenizers import Tokenizer
>>> # IDEs now resolve every method, every kwarg, every return type
>>> Tokenizer.from_pretrained("bert-base-cased")
⚠️ As called out in breaking changes: stricter type info means previously-hidden type errors in user code may now surface under mypy --strict.
- models.Unigram now exposes alpha and nbest_size for subword regularization (parity with Google's implementation, #1994). Closes long-standing requests #730 and #849.
- Tokenizer (#1958) — useful for long-lived caches that don't want to keep tokenizers alive.
- ci_benchmark against the stored baseline and posts a comparison chart to the PR.
- EncodingVisualizer: unclosed annotation span fixed (#1911), HTML escape applied to output (#1937).
- __copy__ / __deepcopy__ (#1930).
- to_vec() from slice (#1964).
- wget / norvig URL with HF Hub downloads in test data fetch (#2018).
- uv support in the Python Makefile (#1977).

Thanks to everyone who shipped commits between v0.22.2 and v0.23.1:
@ArthurZucker, @finnagin, @gordonmessmer, @jberg5, @kennethsible, @llukito, @MayCXC, @McPatate, @michaelfeil, @mrkm4ntr, @musicinmybrain, @ngoldbaum, @OhashiReon, @paulinebm, @podarok, @rtrompier, @sebpop, @Shivam-Bhardwaj, @threexc, @wheynelau, @xanderlent — plus @dependabot and @hf-security-analysis for keeping pins fresh.
Full Changelog: https://github.com/huggingface/tokenizers/compare/v0.22.2...v0.23.1
TRL v1.3 ships training support for the new Qwen 3.6 family (Qwen/Qwen3.6-27B, Qwen/Qwen3.6-35B-A3B). Qwen 3.6 reuses the Qwen3_5Moe* architecture but ships a slightly different chat template (adds a preserve_thinking flag, tweaks tool-arg stringification), so exact-string template matching needed updates across the stack.
What landed:
- qwen3_6.jinja (verbatim from upstream) and qwen3_6_training.jinja (prefix-preserving + {% generation %} markers for assistant_only_loss=True)
- qwen3_5_schema for tool-call parsing — output format unchanged
- tiny-Qwen3_5MoeForConditionalGeneration-3.6 (with MoE-specific shrinking)
- test_(train|training)_vlm cases

from trl import SFTConfig, SFTTrainer
trainer = SFTTrainer(
model="Qwen/Qwen3.6-27B",
args=SFTConfig(assistant_only_loss=True), # works out of the box
train_dataset=dataset,
)
trainer.train()
Tool-calling agent training also works end-to-end via the existing Qwen 3.5 response schema:
from trl import GRPOConfig, GRPOTrainer
def multiply(a: int, b: int) -> int:
"""
Multiplies two integers.
Args:
a: The first integer.
b: The second integer.
Returns:
The product of the two integers.
"""
return a * b
trainer = GRPOTrainer(
model="Qwen/Qwen3.6-27B",
reward_funcs=my_reward_fn,
args=GRPOConfig(...),
train_dataset=dataset,
tools=[multiply],
)
trainer.train()
by @qgallouedec in https://github.com/huggingface/trl/pull/5642
A new experimental TPOTrainer implements Triple Preference Optimization, which augments DPO with a reference (gold) completion alongside the chosen/rejected pair. The paper reports gains of 7–19 points over DPO/SimPO on Arena-Hard, MixEval-Hard, MMLU-Pro, and GSM8K, with less data.
from datasets import load_dataset
from trl.experimental.tpo import TPOConfig, TPOTrainer

trainer = TPOTrainer(
    model="Qwen/Qwen3-0.6B",
    args=TPOConfig(output_dir="Qwen3-0.6B-TPO"),
    train_dataset=load_dataset("tpo-alignment/triple-preference-ultrafeedback-40K", split="train"),
)
trainer.train()
by @kashif in https://github.com/huggingface/trl/pull/5506
trl vllm-serve

A new --speculative_config JSON flag exposes vLLM's speculative decoding directly through trl vllm-serve — works with native MTP heads (Qwen3 Next), Eagle3 drafts, etc. — without forking the serve script.
# Qwen3 native MTP (no extra draft model)
trl vllm-serve --model Qwen/Qwen3-Next-80B-A3B-Instruct \
--speculative_config '{"method": "qwen3_next_mtp", "num_speculative_tokens": 5}'
# Eagle3 draft model
trl vllm-serve --model Qwen/Qwen3-32B \
--speculative_config '{"model": "RedHatAI/Qwen3-32B-speculator.eagle3", "method": "eagle3", "num_speculative_tokens": 3}'
by @Ofir408 in https://github.com/huggingface/trl/pull/5605
Twelve more alignment PRs this cycle, bringing KTOTrainer and DPOTrainer essentially into structural parity. Notable shifts include moving completion assembly out of _prepare_dataset into a new DataCollatorForKTO, inlining the two-pass tokenization into a single pass, removing BOS/EOS handling, and supporting IterableDataset and dict eval_dataset. The goal — promoting KTO out of experimental and into stable — is now within reach for an upcoming release.
PRs (all by @albertvillanova): #5582, #5578, #5579, #5583, #5587, #5599, #5601, #5600, #5606, #5612, #5632, #5635
{% generation %} training chat templates

Three more model families gain training-compatible chat templates with {% generation %} markers, so assistant_only_loss=True works out of the box:
- maybe_apply_chat_template by @albertvillanova in https://github.com/huggingface/trl/pull/5567
- is_chat_template_prefix_preserving by @qgallouedec in https://github.com/huggingface/trl/pull/5558
- forward_masked_logits by @qgallouedec in https://github.com/huggingface/trl/pull/5626
- _tokenizer as trainer attribute by @albertvillanova in https://github.com/huggingface/trl/pull/5489
- PreTrainedTokenizerBase for tokenizer type hints by @qgallouedec in https://github.com/huggingface/trl/pull/5629
- async_reward_X to async_X by @qgallouedec in https://github.com/huggingface/trl/pull/5616
- attention_mask instead of label != -100), and wrong cross-rank aggregation (unweighted mean instead of sum/count). The reported entropy under completion_only_loss=True and sequence parallelism is now correct. Same fix applied to DPO entropy logging. By @qgallouedec in https://github.com/huggingface/trl/pull/5620
- AsyncGRPOTrainer's processing_class to AsyncRolloutWorker by @xuanduy04 in https://github.com/huggingface/trl/pull/5538
- generate_tiny_models for gpt-oss by @albertvillanova in https://github.com/huggingface/trl/pull/5622
- TestSupportsToolCalling for improved coverage by @qgallouedec in https://github.com/huggingface/trl/pull/5537
- is_chat_template_prefix_preserving by @qgallouedec in https://github.com/huggingface/trl/pull/5558
- TestSupportsToolCalling for improved coverage by @qgallouedec in https://github.com/huggingface/trl/pull/5537
- {% generation %} markers for training chat template by @casinca in https://github.com/huggingface/trl/pull/5519
- forward_masked_logits by @qgallouedec in https://github.com/huggingface/trl/pull/5626
- PreTrainedTokenizerBase for tokenizer type hints by @qgallouedec in https://github.com/huggingface/trl/pull/5629
- async_reward_X to async_X by @qgallouedec in https://github.com/huggingface/trl/pull/5616

Full Changelog: https://github.com/huggingface/trl/compare/v1.2.0...v1.3.0
hf buckets commands

All hf buckets commands now use the unified --format [auto|human|agent|json|quiet] flag and the out singleton for consistent, scriptable output. The previous --quiet and --format table|json flags have been replaced by a single --format option that works across create, list, info, delete, rm, move, and cp. Success messages use out.result(), detail views use out.dict(), and listings use out.table() with proper empty-results handling — making the buckets CLI consistent with the rest of the hf command suite.
# Quiet mode: print only bucket IDs
hf buckets list --format quiet
# JSON output for scripting
hf buckets create my-bucket --format json
# Agent-friendly structured output
hf buckets info username/my-bucket --format agent
📚 Documentation: Buckets guide · CLI guide
You can now filter buckets by name when listing them, both from the Python API and the CLI. Pass search="checkpoint" to list_buckets() or --search "checkpoint" to hf buckets list to find buckets matching a name pattern, without having to list and filter client-side.
# Filter buckets by name
hf buckets list --search "checkpoint"
# Filter buckets by name in Python
for bucket in list_buckets(search="checkpoint"):
    print(bucket.id)
📚 Documentation: Buckets guide · CLI guide
- pi agent by @hanouticelina in #4125
- mainSize to ExpandDatasetProperty_T by @Wauplin in #4136

Qwen 3.5 and 3.6 MoE (text-only) were broken when used with FP8. They should now work again with this fix :saluting_face:
Full Changelog: https://github.com/huggingface/transformers/compare/v5.6.1...v5.6.2
- tools to TextGenerationPipeline in #1655
- inputMetadata API for simplified internals in #1657

Full Changelog: 4.1.0...4.2.0
- past_key_values via pipeline function) in #1638
- q1, q1f16, q2, and q2f16 data types in #1647

Full Changelog: 4.0.0...4.1.0
Flash attention path was broken! Sorry everyone for this one 🤗
OpenAI Privacy Filter is a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. It is intended for high-throughput data sanitization workflows where teams need a model that they can run on-premises that is fast, context-aware, and tunable. The model labels an input sequence in a single forward pass, then decodes coherent spans with a constrained Viterbi procedure, predicting probability distributions over 8 privacy-related output categories for each input token.
Links: Documentation
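A hedged usage sketch, assuming the checkpoint loads as a standard transformers token-classification model (the model id below is illustrative, and the pipeline's generic span aggregation stands in for the model's own constrained Viterbi decoding):

from transformers import pipeline

pii = pipeline("token-classification", model="openai/privacy-filter", aggregation_strategy="simple")
print(pii("Contact Jane Doe at jane.doe@example.com or +1 555 010 0200"))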
- Privacy Filter] Add model (#45580) by @vasqu in #45580

Qianfan-OCR is a 4B-parameter end-to-end document intelligence model developed by Baidu that performs direct image-to-text conversion without traditional multi-stage OCR pipelines. It supports a broad range of prompt-driven tasks including structured document parsing, table extraction, chart understanding, document question answering, and key information extraction, all within one unified model. The model features a unique "Layout-as-Thought" capability that generates structured layout representations before producing final outputs, making it particularly effective for complex documents with mixed element types.
Links: Documentation | Paper
SAM3-LiteText is a lightweight variant of SAM3 that replaces the heavy SAM3 text encoder (353M parameters) with a compact MobileCLIP-based text encoder optimized through knowledge distillation, while keeping the SAM3 ViT-H image encoder intact. This reduces text encoder parameters by up to 88% while maintaining segmentation performance comparable to the original model. The model enables efficient vision-language segmentation by addressing the redundancy found in text prompting for segmentation tasks.
Links: Documentation | Paper
SLANet and SLANet_plus are lightweight models designed for table structure recognition, focusing on accurately recognizing table structures in documents and natural scenes. The model improves accuracy and inference speed by adopting a CPU-friendly lightweight backbone network PP-LCNet, a high-low-level feature fusion module CSP-PAN, and a feature decoding module SLA Head that aligns structural and positional information. SLANet was developed by Baidu PaddlePaddle Vision Team as part of their table structure recognition solutions.
Links: Documentation
The internal rotary_fn is no longer registered as a hidden kernel function, so any code referencing self.rotary_fn(...) within an Attention module will break and must be updated to call the function directly instead.
- Kernels] Fix kernel function registration (#45420) by @vasqu

The transformers serve command received several enhancements, including a new /v1/completions endpoint for legacy text completion, multimodal support for audio and video inputs, improved tool-calling via parse_response, proper forwarding of tool_calls/tool_call_id fields, a 400 error on model mismatch when the server is pinned to a specific model, and fixes for the response API. Documentation was also updated to cover new serving options such as --compile and --model-timeout.
- transformers serve (#44558) by @rain-1 in [#44558]
- transformers serve is pinned (#45443) by @qgallouedec in [#45443]
- parse_response (#45485) by @SunMarc in [#45485]
- tool_calls/tool_call_id in processor inputs (#45418) by @qgallouedec in [#45418]

Several vision-related bug fixes were applied in this release, including correcting Qwen2.5-VL temporal RoPE scaling for still images, fixing missing/mismatched image processor backends for Emu3 and BLIP, resolving modular image processor class duplication, and preventing accelerate from incorrectly splitting vision encoders in PeVideo/PeAudioVideo models. Image loading performance was also improved by leveraging torchvision's native decode_image in the torchvision backend, yielding up to ~17% speedup over PIL-based loading.
- decode_image to load images in the torchvision backend (#45195) by @yonigozlan in [#45195]

Fixed several bugs affecting distributed training, including silently wrong results or NaN loss with Expert Parallelism, NaN weights on non-rank-0 FSDP processes, and a resize failure in PP-DocLayoutV3; additionally added support for loading adapters with Tensor Parallelism, added MoE to the Gemma4 TP plan, and published documentation for TP training.
Fixed a docstring typo in streamer classes, resolved a Kimi-K2.5 tokenizer regression and _patch_mistral_regex AttributeError, and patched a streaming generation crash for Qwen3VLProcessor caused by incorrect _tokenizer attribute access. Additional housekeeping included moving the GPT-SW3 instruct tokenizer to an internal testing repo and fixing a global state leak in the tokenizer registry during tests.
- Tokenizers] Move gpt sw3 tokenizer out (#45404) by @vasqu in [#45404]
- test_processors (#45318) by @tarekziade in [#45318]

Cache handling was improved for Gemma4 and Gemma3n models by dissociating KV state sharing from the Cache class, ensuring KV states are always shared regardless of whether a Cache is used. Additionally, the image cache for Paddle models was updated to align with the latest API.
Audio models gained vLLM compatibility through targeted fixes across several model implementations, while reliability improvements were also made including exponential back-off retries for audio file downloads, a crash fix in the text-to-speech pipeline when generation configs contain None values, and corrected test failures for Kyutai Speech-To-Text.
- text-to-speech pipeline crash when generation config contains None values (#45107) by @jiqing-feng in [#45107]
- Privacy Filter] Add model (#45580) by @vasqu in [#45580]
- pass (inherits from DSV3 MoE) (#45572) by @casinca in [#45572]
- DeepseekV3MoE and remote official implementation (#45441) by @casinca in [#45441]
- prepare_decoder_input_ids_from_labels (#45516) by @Tokarak in [#45516]
- TextToAudioPipeline missing <bos> token (#45525) by @jiqing-feng in [#45525]
- Conversion Mapping] Small fixups (#45483) by @vasqu in [#45483]
- get_image_size method (#45461) by @JiauZhang in [#45461]
- fix] Always early return for non-Mistral models in _patch_mistral_regex (#45444) by @tomaarsen in [#45444]
- fix] Make Qwen2_5OmniProcessor warning a lot less noisy via warning_once (#45455) by @tomaarsen in [#45455]
- step3_vl to MODELS_WITH_INCORRECT_HUB_TOKENIZER_CLASS (#45449) by @hmellor in [#45449]
- fix] PEFT integration fixes preventing save/load & integration (#45428) by @tomaarsen in [#45428]
- apply_chat_template crash on tool_call messages without content (#45348) by @qgallouedec in [#45348]
- trackio integration to use Buckets and "freeze" Space after training (#45329) by @abidlabs in [#45329]
- cohere_asr: fix device issue for test_model_parallel_beam_search (#45214) by @kaixuanliu in [#45214]
- [transformers] prefix in non-verbose mode (#45316) by @zucchini-nlp in [#45316]
- pi0 model (#45011) by @kaixuanliu in [#45011]
- grouped_mm (#45001) by @Sai-Suraj-27 in [#45001]
- Wav2Vec2Config.vocab_size type to allow None (#45108) by @jiqing-feng in [#45108]
- hasattr(torch.backends.cudnn, "conv") to conftest.py (#45263) by @ydshieh in [#45263]
- SmolVLM video processor resize using wrong interpolation after backend refactor (#45258) by @ydshieh in [#45258]
- Qwen2IntegrationTest (#45268) by @ydshieh in [#45268]
- torch.backends.cudnn.conv.fp32_precision explicitly. (#45248) by @ydshieh in [#45248]
- torch 2.11 (#45243) by @ydshieh in [#45243]
- get_test_info.py (related to tiny model creation) (#45238) by @ydshieh in [#45238]
- test_register_result_handler (#45188) by @SunMarc in [#45188]

The following contributors have made significant changes to the library over the last release:
- Privacy Filter] Add model (#45580)
- Conversion Mapping] Small fixups (#45483)
- Kernels] Fix kernel function registration (#45420)
- Tokenizers] Move gpt sw3 tokenizer out (#45404)
- transformers serve (#44558)
- test_processors (#45318)
- hasattr(torch.backends.cudnn, "conv") to conftest.py (#45263)
- SmolVLM video processor resize using wrong interpolation after backend refactor (#45258)
- Qwen2IntegrationTest (#45268)
- torch.backends.cudnn.conv.fp32_precision explicitly. (#45248)
- torch 2.11 (#45243)
- get_test_info.py (related to tiny model creation) (#45238)

SSDTrainer — Simple Self-Distillation

A new experimental SSDTrainer implements the method described in Embarrassingly Simple Self-Distillation Improves Code Generation. SSD samples completions from the model itself at a training-time temperature/truncation setting, then fine-tunes on those raw, unverified samples with standard cross-entropy loss. No reward model, verifier, teacher model, or RL: just prompts and the model.
from datasets import Dataset
from trl.experimental.ssd import SSDConfig, SSDTrainer
dataset = Dataset.from_dict({
    "prompt": [
        [{"role": "user", "content": "Write a function to add two numbers."}],
        [{"role": "user", "content": "Write a function to check if a number is prime."}],
    ],
})

trainer = SSDTrainer(
    model="Qwen/Qwen3-4B-Instruct",
    args=SSDConfig(
        output_dir="ssd-model",
        temperature=0.6,  # T_train from the paper
        top_k=20,
        top_p=0.95,
        learning_rate=5e-6,
    ),
    train_dataset=dataset,
)
trainer.train()
by @kashif in https://github.com/huggingface/trl/pull/5505
GRPOTrainerWhen tool calls produce more tokens than max_completion_length allows, GRPOTrainer now rolls back the tool messages/images added in the current iteration instead of trying to truncate them. This removes ~80 lines of fragile, image-boundary-aware bookkeeping in favor of a ~15-line snapshot-and-rollback. Since overlong samples almost always get rewarded as failures anyway, the learning signal is effectively unchanged — but the code is dramatically simpler and no longer needs per-VLM-family vision-token lookup tables.
by @qgallouedec in https://github.com/huggingface/trl/pull/5521
Continuing the effort from v1.1:
- {% generation %} markers, enabling assistant-only loss masking for DeepSeek-V3 models. By @RudrenduPaul in https://github.com/huggingface/trl/pull/5527

As a result of a tightened detection (see fixes below), the list of templates reported as tool-calling capable is now correct — notably, the basic Llama 3 template is no longer falsely classified as tool-calling capable.
A major cleanup sweep keeps KTOTrainer and DPOTrainer in lockstep, same initialization patterns, same config surface, same precompute behavior:
- precompute_ref_batch_size to KTO (https://github.com/huggingface/trl/pull/5530)
- ref_model initialization (https://github.com/huggingface/trl/pull/5534)
- None args (https://github.com/huggingface/trl/pull/5531)
- generate_during_eval (https://github.com/huggingface/trl/pull/5551)
- ref_model when precompute_ref_log_probs is set in DPO/KTO (https://github.com/huggingface/trl/pull/5542)

All by @albertvillanova.
- prepare_multimodal_messages by @albertvillanova in https://github.com/huggingface/trl/pull/5474
- prepare_multimodal_messages by @albertvillanova in https://github.com/huggingface/trl/pull/5508
- supports_tool_calling falsely accepting templates that drop assistant tool_calls by @qgallouedec in https://github.com/huggingface/trl/pull/5517
- add_response_schema for VLM processors — the schema was being set on the outer processor instead of the inner tokenizer, so it had no effect. This also collapses a handful of __init__/decode-gate workarounds. By @qgallouedec in https://github.com/huggingface/trl/pull/5520
- use_transformers_paged in GRPOConfig and RLOOConfig (and remove entirely from experimental OnlineDPOConfig, GOLDConfig, SelfDistillationConfig). Will be removed from the remaining configs in v2.0.0. In a small A/B benchmark (Qwen3-0.6B GRPO), the paged path is ~20% slower and uses ~6x more peak VRAM than the default; it's also superseded by transformers continuous batching. By @qgallouedec in https://github.com/huggingface/trl/pull/5544
- chat_templates/README by @qgallouedec in https://github.com/huggingface/trl/pull/5545
- supports_tool_calling falsely accepting templates that drop assistant tool_calls by @qgallouedec in https://github.com/huggingface/trl/pull/5517
- use_transformers_paged by @qgallouedec in https://github.com/huggingface/trl/pull/5544
- add_response_schema for VLM processors by @qgallouedec in https://github.com/huggingface/trl/pull/5520
- chat_templates/README by @qgallouedec in https://github.com/huggingface/trl/pull/5545

Full Changelog: https://github.com/huggingface/trl/compare/v1.1.0...v1.2.0
Every Gradio Space now auto-serves an /agents.md endpoint, a machine-readable API description that AI agents can read and call directly. Point your coding agents (like Claude Code, Codex, or Pi) at it and they figure out how to use the Space without any setup.
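To see what an agent sees, you can fetch the endpoint directly (the Space URL below is illustrative):

import requests

print(requests.get("https://username-my-space.hf.space/agents.md").text)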