dimensions & encoding_format parameter to InferenceClient for output embedding size #3671 by @mishig25
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.3.0...v1.3.1
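For illustration, a minimal sketch of the v1.3.1 parameters mentioned above, assuming they are exposed on feature_extraction (the model name and values here are placeholders, not from the release notes):
import os
from huggingface_hub import InferenceClient

client = InferenceClient(api_key=os.environ["HF_TOKEN"])
# `dimensions` controls the output embedding size; `encoding_format` is the
# other new parameter (accepted values depend on the model/provider)
embedding = client.feature_extraction(
    "Hello world",
    model="Qwen/Qwen3-Embedding-8B",  # placeholder embedding model
    dimensions=128,
)
print(len(embedding))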
hf models, hf datasets, hf spaces Commands
The CLI has been reorganized with dedicated commands for Hub discovery, while hf repo stays focused on managing your own repositories.
New commands:
# Models
hf models ls --author=Qwen --limit=10
hf models info Qwen/Qwen-Image-2512
# Datasets
hf datasets ls --filter "format:parquet" --sort=downloads
hf datasets info HuggingFaceFW/fineweb
# Spaces
hf spaces ls --search "3d"
hf spaces info enzostvs/deepsite
This organization mirrors the Python API (list_models, model_info, etc.), keeps the hf <resource> <action> pattern, and is extensible for future commands like hf papers or hf collections.
hf models/hf datasets/hf spaces commands by @hanouticelina in #3669
You can now install the transformers CLI alongside the huggingface_hub CLI using the standalone installer scripts.
# Install hf CLI only (default)
curl -LsSf https://hf.co/cli/install.sh | bash -s
# Install both hf and transformers CLIs
curl -LsSf https://hf.co/cli/install.sh | bash -s -- --with-transformers
# Install hf CLI only (default)
powershell -c "irm https://hf.co/cli/install.ps1 | iex"
# Install both hf and transformers CLIs
powershell -c "irm https://hf.co/cli/install.ps1 | iex" -WithTransformers
Once installed, you can use the transformers CLI directly:
transformers serve
transformers chat openai/gpt-oss-120b
New hf jobs stats command to monitor your running jobs in real-time, similar to docker stats. It displays a live table with CPU, memory, network, and GPU usage.
>>> hf jobs stats
JOB ID CPU % NUM CPU MEM % MEM USAGE NET I/O GPU UTIL % GPU MEM % GPU MEM USAGE
------------------------ ----- ------- ----- -------------- --------------- ---------- --------- ---------------
6953ff6274100871415c13fd 0% 3.5 0.01% 1.3MB / 15.0GB 0.0bps / 0.0bps 0% 0.0% 0.0B / 22.8GB
A new HfApi.fetch_job_metrics() method is also available:
>>> for metrics in fetch_job_metrics(job_id="6953ff6274100871415c13fd"):
... print(metrics)
{
    "cpu_usage_pct": 0,
    "cpu_millicores": 3500,
    "memory_used_bytes": 1306624,
    "memory_total_bytes": 15032385536,
    "rx_bps": 0,
    "tx_bps": 0,
    "gpus": {
        "882fa930": {
            "utilization": 0,
            "memory_used_bytes": 0,
            "memory_total_bytes": 22836000000
        }
    },
    "replica": "57vr7"
}
The direction parameter in list_models, list_datasets, and list_spaces is now deprecated and ignored; sorting is always descending.
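A quick sketch of an equivalent call (results come back in descending order):
from huggingface_hub import list_models

# no `direction` argument needed anymore: most-downloaded models come first
for model in list_models(sort="downloads", limit=5):
    print(model.id)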
- direction in list repos methods by @hanouticelina in #3630
- create_repo returning wrong repo_id by @hanouticelina in #3634
The following contributors have made significant changes to the library over the last release:
- create_repo returning wrong repo_id by @hanouticelina in #3634
- self.endpoint in job-related APIs for custom endpoint support by @PredictiveManish in #3653
- hf-xet requirements mismatch by @0Falli0 in #3239
- @dataclass_transform decorator to dataclass_with_extra by @charliermarsh in #3639
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.2.3...v1.2.4
Patch release for #3618 by @Wauplin.
When creating a new repo, we should default to private=None instead of private=False. This was already the case when using the API but not when using the CLI - a bug likely introduced when switching to Typer. When defaulting to None, the repo is created as public unless the organization has configured repos to be "private by default" (the check happens server-side, so it shouldn't be hardcoded client-side).
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.2.2...v1.2.3
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.2.1...v1.2.2
We've improved how the huggingface_hub library handles rate limits from the Hub. When you hit a rate limit, you'll now see clear, actionable error messages telling you exactly how long to wait and how many requests you have left.
HfHubHTTPError: 429 Too Many Requests for url: https://huggingface.co/api/models/username/reponame.
Retry after 55 seconds (0/2500 requests remaining in current 300s window).
When a 429 error occurs, the SDK automatically parses the RateLimit header to extract the exact number of seconds until the rate limit resets, then waits precisely that duration before retrying. This applies to file downloads (i.e. Resolvers), uploads, and paginated Hub API calls (list_models, list_datasets, list_spaces, etc.).
More info about Hub rate limits in the docs 👉 here.
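The retry happens automatically, but if retries are exhausted you can still catch the error yourself. A minimal sketch:
from huggingface_hub import model_info
from huggingface_hub.errors import HfHubHTTPError

try:
    info = model_info("username/reponame")
except HfHubHTTPError as err:
    # raised once automatic retries are exhausted; the message includes
    # the parsed rate limit info shown above
    print(err)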
- Parse rate limit headers for better 429 error messages by @hanouticelina in #3570
- Use rate limit headers for smarter retry in http backoff by @hanouticelina in #3577
- Harmonize retry behavior for metadata fetch and HfFileSystem by @hanouticelina in #3583
- Add retry for preupload endpoint by @hanouticelina in #3588
- Use default retry values in pagination by @hanouticelina in #3587
Daily Papers endpoint: You can now programmatically access Hugging Face's daily papers feed. You can filter by week, month, or submitter, and sort by publication date or trending.
from huggingface_hub import list_daily_papers
for paper in list_daily_papers(date="2025-12-03"):
    print(paper.title)
# DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
# ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
# MultiShotMaster: A Controllable Multi-Shot Video Generation Framework
# Deep Research: A Systematic Survey
# MG-Nav: Dual-Scale Visual Navigation via Sparse Spatial Memory
...
- Add daily papers endpoint by @BastienGimbert in #3502
- Add more parameters to daily papers by @Samoed in #3585
Offline mode helper: we recommend using huggingface_hub.is_offline_mode() to check whether offline mode is enabled instead of checking HF_HUB_OFFLINE directly.
- Add offline_mode helper by @Wauplin in #3593
- Rename utility to is_offline_mode by @Wauplin in #3598
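Usage is a one-liner:
from huggingface_hub import is_offline_mode

if is_offline_mode():
    print("Offline mode is enabled; only cached files will be used.")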
Inference Endpoints: You can now configure scaling metrics and thresholds when deploying endpoints.
feat(endpoints): scaling metric and threshold by @oOraph in #3525
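A sketch of what this could look like at deployment time. Note: the scaling_metric and scaling_threshold parameter names below are assumptions inferred from the PR title, not confirmed API; check the Inference Endpoints docs for the exact signature.
from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    "my-endpoint",
    repository="openai/gpt-oss-20b",
    framework="pytorch",
    accelerator="gpu",
    instance_size="x1",
    instance_type="nvidia-a10g",
    region="us-east-1",
    vendor="aws",
    scaling_metric="pendingRequests",  # hypothetical parameter name
    scaling_threshold=10,              # hypothetical parameter name
)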
Exposed utilities: RepoFile and RepoFolder are now available at the root level for easier imports.
Expose RepoFile and RepoFolder at root level by @Wauplin in #3564
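For example, when walking a repo tree:
from huggingface_hub import HfApi, RepoFile, RepoFolder

api = HfApi()
for entry in api.list_repo_tree("gpt2"):
    # entries are RepoFile or RepoFolder instances
    if isinstance(entry, RepoFile):
        print("file:", entry.path)
    elif isinstance(entry, RepoFolder):
        print("folder:", entry.path)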
OVHcloud AI Endpoints was added as an official Inference Provider in v1.1.5. OVHcloud provides European-hosted, GDPR-compliant model serving for your AI applications.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
    api_key=os.environ["HF_TOKEN"],
)
completion = client.chat.completions.create(
    model="openai/gpt-oss-20b:ovhcloud",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
)
print(completion.choices[0].message)
Add OVHcloud AI Endpoints as an Inference Provider by @eliasto in #3541
We also added support for automatic speech recognition (ASR) with Replicate, so you can now transcribe audio files easily.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
    provider="replicate",
    api_key=os.environ["HF_TOKEN"],
)
output = client.automatic_speech_recognition("sample1.flac", model="openai/whisper-large-v3")
[Inference Providers] Add support for ASR with Replicate by @hanouticelina in #3538
The truncation_direction parameter in InferenceClient.feature_extraction (and its async counterpart) now uses lowercase values ("left"/"right" instead of "Left"/"Right") for consistency with other specs.
[Inference] Use lowercase left/right truncation direction parameter by @Wauplin in #3548
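For example:
from huggingface_hub import InferenceClient

client = InferenceClient()
embedding = client.feature_extraction(
    "Today is a sunny day",
    model="sentence-transformers/all-MiniLM-L6-v2",
    truncation_direction="right",  # lowercase now
)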
HfFileSystem: A new top-level hffs alias makes working with the filesystem interface more convenient.
>>> from huggingface_hub import hffs
>>> with hffs.open("datasets/fka/awesome-chatgpt-prompts/prompts.csv", "r") as f:
... print(f.readline())
"act","prompt"
"An Ethereum Developer","Imagine you are an experienced Ethereum developer tasked..."
- [HfFileSystem] Add top level hffs by @lhoestq in #3556
- [HfFileSystem] Add expand_info arg by @lhoestq in #3575
Paginated results when listing user access requests: list_pending_access_requests, list_accepted_access_requests, and list_rejected_access_requests now return an iterator instead of a list. This allows lazy loading of results for repositories with a large number of access requests. If you need a list, wrap the call with list(...).
Paginated results in list_user_access by @Wauplin in #3535
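If you still need an actual list, materialize the iterator explicitly:
from huggingface_hub import HfApi

api = HfApi()
# results are now lazily paginated
pending = list(api.list_pending_access_requests("username/reponame"))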
- num_workers by @Qubitium in #3532
- HfApi download utils by @schmrlng in #3531
- whoami by @Wauplin in #3568
- repo_type_and_id_from_hf_id by @pulltheflower in #3507
- list_repo_tree in snapshot_download by @hanouticelina in #3565
- hf login example to hf auth login by @alisheryeginbay in #3590
- FileNotFoundError in CLI update check by @hanouticelina in #3574
- HfHubHTTPError reduce error by adding factory function by @owenowenisme in #3579
- constants.HF_HUB_ETAG_TIMEOUT as timeout for get_hf_file_metadata by @krrome in #3595
- huggingface_hub as dependency for hf by @Wauplin in #3527
The following contributors have made significant changes to the library over the last release:
- HfApi download utils (#3531)
[HfFileSystem] Add top level hffs by @lhoestq in #3556.
Example:
>>> from huggingface_hub import hffs
>>> with hffs.open("datasets/fka/awesome-chatgpt-prompts/prompts.csv", "r") as f:
... print(f.readline())
... print(f.readline())
"act","prompt"
"An Ethereum Developer","Imagine you are an experienced Ethereum developer tasked..."
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.1.6...v1.1.7
This release includes multiple bug fixes:
- list_repo_tree in snapshot_download #3565 by @hanouticelina
- HfHubHTTPError reduce error by adding factory function #3579 by @owenowenisme
- FileNotFoundError in CLI update check #3574 by @hanouticelina
- tiny-agents CLI #3573 by @Wauplin
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.1.5...v1.1.6
OVHcloud AI Endpoints is now an official Inference Provider on Hugging Face! 🎉 OVHcloud delivers fast, production-ready inference on secure, sovereign, fully 🇪🇺 European infrastructure - combining advanced features with competitive pricing.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
    api_key=os.environ["HF_TOKEN"],
)
completion = client.chat.completions.create(
    model="openai/gpt-oss-20b:ovhcloud",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
)
print(completion.choices[0].message)
More snippets examples in the provider documentation 👉 here.
Installing the CLI is now much faster: thanks to @Boulaouaney, the installer now uses uv for faster package installation.
This release also includes the following bug fixes:
- HF_DEBUG environment variable in #3562 by @hanouticelina
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.1.3...v1.1.4
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.1.0...v1.1.3
⚡ This release significantly improves the file download experience by making it faster and cleaning up the terminal output.
snapshot_download is now always multi-threaded, leading to significant performance gains. We removed a previous limitation, as Xet's internal resource management ensures we can parallelize downloads safely without resource contention. A sample benchmark showed this made the download much faster!
Additionally, the output for snapshot_download and hf download CLI is now much less verbose. Per file logs are hidden by default, and all individual progress bars are combined into a single progress bar, resulting in a much cleaner output.
snapshot_download and hf download by @Wauplin in #3523
🆕 WaveSpeedAI is now an official Inference Provider on Hugging Face! 🎉 WaveSpeedAI provides fast, scalable, and cost-effective model serving for creative AI applications, supporting text-to-image, image-to-image, text-to-video, and image-to-video tasks. 🎨
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
    provider="wavespeed",
    api_key=os.environ["HF_TOKEN"],
)
video = client.text_to_video(
    "A cat riding a bike",
    model="Wan-AI/Wan2.2-TI2V-5B",
)
More snippets examples in the provider documentation 👉 here.
We also added support for image-segmentation task for fal, enabling state-of-the-art background removal with RMBG v2.0.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
    provider="fal-ai",
    api_key=os.environ["HF_TOKEN"],
)
output = client.image_segmentation("cats.jpg", model="briaai/RMBG-2.0")
image-segmentation for fal by @hanouticelina in #3521
Following the complete revamp of the Hugging Face CLI in v1.0, this release builds on that foundation by adding powerful new features and improving accessibility.
hf PyPI Package
To make the CLI even easier to access, we've published a new, minimal PyPI package: hf. This package installs the hf CLI tool and is perfect for quick, isolated execution with modern tools like uvx.
# Run the CLI without installing it
> uvx hf auth whoami
⚠️ Note: This package is for the CLI only. Attempting to import hf in a Python script will correctly raise an ImportError.
A big thank you to @thorwhalen for generously transferring the hf package name to us on PyPI. This will make the CLI much more accessible for all Hugging Face users. 🤗
hf CLI to PyPI by @Wauplin in #3511
A new command group, hf endpoints, has been added to deploy and manage your Inference Endpoints directly from the terminal.
This provides "one-liners" for deploying, deleting, updating, and monitoring endpoints. The CLI offers two clear paths for deployment: hf endpoints deploy for standard Hub models and hf endpoints catalog deploy for optimized Model Catalog configurations.
> hf endpoints --help
Usage: hf endpoints [OPTIONS] COMMAND [ARGS]...
Manage Hugging Face Inference Endpoints.
Options:
--help Show this message and exit.
Commands:
catalog Interact with the Inference Endpoints catalog.
delete Delete an Inference Endpoint permanently.
deploy Deploy an Inference Endpoint from a Hub repository.
describe Get information about an existing endpoint.
ls Lists all Inference Endpoints for the given namespace.
pause Pause an Inference Endpoint.
resume Resume an Inference Endpoint.
scale-to-zero Scale an Inference Endpoint to zero.
update Update an existing endpoint.
A new command, hf cache verify, has been added to check your cached files against their checksums on the Hub. This is a great tool to ensure your local cache is not corrupted and is in sync with the remote repository.
> hf cache verify --help
Usage: hf cache verify [OPTIONS] REPO_ID
Verify checksums for a single repo revision from cache or a local directory.
Examples:
- Verify main revision in cache: `hf cache verify gpt2`
- Verify specific revision: `hf cache verify gpt2 --revision refs/pr/1`
- Verify dataset: `hf cache verify karpathy/fineweb-edu-100b-shuffle --repo-type dataset`
- Verify local dir: `hf cache verify deepseek-ai/DeepSeek-OCR --local-dir /path/to/repo`
Arguments:
REPO_ID The ID of the repo (e.g. `username/repo-name`). [required]
Options:
--repo-type [model|dataset|space]
The type of repository (model, dataset, or
space). [default: model]
--revision TEXT Git revision id which can be a branch name,
a tag, or a commit hash.
--cache-dir TEXT Cache directory to use when verifying files
from cache (defaults to Hugging Face cache).
--local-dir TEXT If set, verify files under this directory
instead of the cache.
--fail-on-missing-files Fail if some files exist on the remote but
are missing locally.
--fail-on-extra-files Fail if some files exist locally but are not
present on the remote revision.
--token TEXT A User Access Token generated from
https://huggingface.co/settings/tokens.
--help Show this message and exit.
hf cache verify by @hanouticelina in #3461
Managing your local cache is now easier. The hf cache ls command has been enhanced with two new options:
- --sort: Sort your cache by accessed, modified, name, or size. You can also specify the order (e.g., modified:asc to find the oldest files).
- --limit: Get just the top N results after sorting (e.g., --limit 10).
# List top 10 most recently accessed repos
> hf cache ls --sort accessed --limit 10
# Find the 5 largest repos you haven't used in over a year
> hf cache ls --filter "accessed>1y" --sort size --limit 5
Finally, we've patched the CLI installer script to fix a bug for zsh users. The installer now works correctly across all common shells.
We've fixed a bug in HfFileSystem where the instance cache would break when using multiprocessing with the "fork" start method.
Thanks to @BastienGimbert for translating the README to French 🇫🇷 🤗
and thanks to @didier-durand for fixing multiple typos in the library! 🤗
update-inference-types workflow by @hanouticelina in #3516
The following contributors have made significant changes to the library over the last release:
In the huggingface_hub v1.0 release, we removed our dependency on aiohttp in favor of httpx, but forgot to remove it from the huggingface_hub[inference] extra dependencies in setup.py. This patch release removes it, which removes the inference extra as well.
The internal method _import_aiohttp was unused, so it has been removed as well.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v1.0.0...v1.0.1
Check out our blog post announcement!
The huggingface_hub library now uses httpx instead of requests for HTTP requests. This change was made to improve performance and to support both synchronous and asynchronous requests the same way. We therefore dropped both requests and aiohttp dependencies.
The get_session and hf_raise_for_status utilities still exist and respectively return an httpx.Client and process an httpx.Response object. An additional get_async_client utility has been added for async logic.
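A minimal sketch of the synchronous utilities (get_async_client works analogously for async code):
from huggingface_hub.utils import get_session, hf_raise_for_status

session = get_session()  # returns an httpx.Client
response = session.get("https://huggingface.co/api/models/gpt2")
hf_raise_for_status(response)  # processes the httpx.Response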
The exhaustive list of breaking changes can be found here.
- hf_raise_for_status on async stream + tests by @Wauplin in #3442
- git_vs_http guide by @Wauplin in #3357
huggingface_hub 1.0 marks a complete transformation of our command-line experience. We've reimagined the CLI from the ground up, creating a tool that feels native to modern ML workflows while maintaining the simplicity the community loves.
huggingface-cli
This release marks the end of an era with the complete removal of the huggingface-cli command. The new hf command (introduced in v0.34.0) takes its place with a cleaner, more intuitive design that follows a logical "resource-action" pattern. This breaking change simplifies the user experience and aligns with modern CLI conventions - no more typing those extra 11 characters!
huggingface-cli entirely in favor of hf by @Wauplin in #3404
hf CLI Revamp
The new CLI introduces a comprehensive set of commands for repository and file management that expose powerful HfApi functionality directly from the terminal:
> hf repo --help
Usage: hf repo [OPTIONS] COMMAND [ARGS]...
Manage repos on the Hub.
Options:
--help Show this message and exit.
Commands:
branch Manage branches for a repo on the Hub.
create Create a new repo on the Hub.
delete Delete a repo from the Hub.
move Move a repository from a namespace to another namespace.
settings Update the settings of a repository.
tag Manage tags for a repo on the Hub.
A dry run mode has been added to hf download, which lets you preview exactly what will be downloaded before committing to the transfer—showing file sizes, what's already cached, and total bandwidth requirements in a clean table format:
> hf download gpt2 --dry-run
[dry-run] Fetching 26 files: 100%|██████████████████████████████████████████████████████████| 26/26 [00:00<00:00, 50.66it/s]
[dry-run] Will download 26 files (out of 26) totalling 5.6G.
File Bytes to download
--------------------------------- -----------------
.gitattributes 445.0
64-8bits.tflite 125.2M
64-fp16.tflite 248.3M
64.tflite 495.8M
README.md 8.1K
config.json 665.0
flax_model.msgpack 497.8M
generation_config.json 124.0
merges.txt 456.3K
model.safetensors 548.1M
onnx/config.json 879.0
onnx/decoder_model.onnx 653.7M
onnx/decoder_model_merged.onnx 655.2M
...
The CLI now provides intelligent shell auto-completion that suggests available commands, subcommands, options, and arguments as you type - making command discovery effortless and reducing the need to constantly check --help.
The CLI now also checks for updates in the background, ensuring you never miss important improvements or security fixes. Once every 24 hours, the CLI silently checks PyPI for newer versions and notifies you when an update is available - with personalized upgrade instructions based on your installation method.
The cache management CLI has been completely revamped with the removal of hf scan cache and hf scan delete in favor of docker-inspired commands that are more intuitive. The new hf cache ls provides rich filtering capabilities, hf cache rm enables targeted deletion, and hf cache prune cleans up detached revisions.
# List cached repos
>>> hf cache ls
ID SIZE LAST_ACCESSED LAST_MODIFIED REFS
--------------------------- -------- ------------- ------------- -----------
dataset/nyu-mll/glue 157.4M 2 days ago 2 days ago main script
model/LiquidAI/LFM2-VL-1.6B 3.2G 4 days ago 4 days ago main
model/microsoft/UserLM-8b 32.1G 4 days ago 4 days ago main
Found 3 repo(s) for a total of 5 revision(s) and 35.5G on disk.
# List cached repos with filters
>>> hf cache ls --filter "type=model" --filter "size>3G" --filter "accessed>7d"
# Output in different format
>>> hf cache ls --format json
>>> hf cache ls --revisions # Replaces the old --verbose flag
# Cache removal
>>> hf cache rm model/meta-llama/Llama-2-70b-hf
>>> hf cache rm $(hf cache ls --filter "accessed>1y" -q) # Remove old items
# Clean up detached revisions
>>> hf cache prune # Removes all unreferenced revisions
Under the hood, this transformation is powered by Typer, significantly reducing boilerplate and making the CLI easier to maintain and extend with new features.
hf cache by @hanouticelina in #3439
The new cross-platform installers simplify CLI installation by creating isolated sandboxed environments without interfering with your existing Python setup or project dependencies. The installers work seamlessly across macOS, Linux, and Windows, automatically handling dependencies and PATH configuration.
# On macOS and Linux
>>> curl -LsSf https://hf.co/cli/install.sh | sh
# On Windows
>>> powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"
Finally, the [cli] extra has been removed - the CLI now ships with the core huggingface_hub package.
[cli] extra by @hanouticelina in #3451
The v1.0 release is a major milestone for the huggingface_hub library. It marks our commitment to API stability and the maturity of the library. We have made several improvements and breaking changes to make the library more robust and easier to use. A migration guide has been written to reduce friction as much as possible: https://huggingface.co/docs/huggingface_hub/concepts/migration.
We'll list all breaking changes below:
Minimum Python version is now 3.9 (instead of 3.8).
HTTP backend migrated from requests to httpx. Expect some breaking changes on advanced features and errors. The exhaustive list can be found here.
The deprecated huggingface-cli has been removed; hf (introduced in v0.34) replaces it with a clearer resource-action CLI.
huggingface-cli entirely in favor of hf by @Wauplin in #3404
The [cli] extra has been removed - the CLI now ships with the core huggingface_hub package.
[cli] extra by @hanouticelina in #3451
Long-deprecated classes like HfFolder, InferenceAPI, and Repository have been removed.
- HfFolder and InferenceAPI classes by @Wauplin in #3344
- Repository class by @Wauplin in #3346
constants.hf_cache_home has been removed. Use constants.HF_HOME instead.
use_auth_token is not supported anymore. Use token instead. Previously, using use_auth_token automatically redirected to token with a warning.
removed get_token_permission. Became useless when fine-grained tokens arrived.
removed update_repo_visibility. Use update_repo_settings instead.
removed is_write_action in all build_hf_headers methods. Not relevant since fine-grained tokens arrived.
removed write_permission arg from login method. Not relevant anymore.
renamed login(new_session) to login(skip_if_logged_in) in login methods. Not announced but hopefully very little friction. Only some notebooks to update on the Hub (will do it once released)
removed resume_download / force_filename / local_dir_use_symlinks parameters from hf_hub_download/snapshot_download (and mixins)
removed library / language / tags / task from list_models args
upload_file/upload_folder now returns a url to the commit created on the Hub as any other method creating a commit (create_commit, delete_file, etc.)
require keyword arguments on login methods
Remove any Keras 2.x and tensorflow-related code
removed hf_transfer support. hf_xet is now the default upload/download manager.
Routing for the Chat Completion API in Inference Providers is now done server-side. This saves one HTTP call and lets us centralize the logic that routes requests to the correct provider. In the future, it enables use cases like choosing the fastest or cheapest provider directly.
Also some updates in the docs:
We've added support for TypedDict to our @strict framework, which is our data validation tool for dataclasses. Typed dicts are now converted to dataclasses on-the-fly for validation, without mutating the input data. This logic is currently used by transformers to validate config files but is library-agnostic and can therefore be used by anyone. More details in this guide.
from typing import Annotated, TypedDict
from huggingface_hub.dataclasses import validate_typed_dict
def positive_int(value: int):
    if not value >= 0:
        raise ValueError(f"Value must be positive, got {value}")

class User(TypedDict):
    name: str
    age: Annotated[int, positive_int]
# Valid data
validate_typed_dict(User, {"name": "John", "age": 30})
Added a HfApi.list_organization_followers endpoint to list the followers of an organization, similar to the existing one for users.
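A quick sketch (assuming the returned objects mirror the user-followers API):
from huggingface_hub import HfApi

api = HfApi()
for follower in api.list_organization_followers("huggingface"):
    print(follower.username)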
- sentence_similarity docstring by @tolgaakar in #3374
- image-to-image by @hanouticelina in #3399
- ty quality by @hanouticelina in #3441
The following contributors have made changes to the library over the last release. Thank you!
- sentence_similarity docstring (#3374) (#3375)
This is the final minor release before v1.0.0. This release focuses on performance optimizations to HfFileSystem and adds a new get_organization_overview API endpoint.
We'll continue to release security patches as needed, but v0.37 will not happen. The next release will be 1.0.0. We’re also deeply grateful to the entire Hugging Face community for their feedback, bug reports, and suggestions that have shaped this library.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.35.0...v0.36.0
HfFileSystem
Major optimizations have been implemented in HfFileSystem:
- fs instance. This is particularly useful when streaming datasets in a distributed training environment. Each worker won't have to rebuild its cache anymore.
Listing files with .glob() has been greatly optimized:
from huggingface_hub import HfFileSystem
HfFileSystem().glob("datasets/HuggingFaceFW/fineweb-edu/data/*/*")
# Before: ~100 /tree calls (one per subdirectory)
# Now: 1 /tree call
maxdepth: do less /tree calls in glob() by @lhoestq in #3389
Minor updates:
HfApi
It is now possible to get high-level information about an organization, the same way it is already possible to do with users:
>>> from huggingface_hub import get_organization_overview
>>> get_organization_overview("huggingface")
Organization(
    avatar_url='https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png',
    name='huggingface',
    fullname='Hugging Face',
    details='The AI community building the future.',
    is_verified=True,
    is_following=True,
    num_users=198,
    num_models=164,
    num_spaces=96,
    num_datasets=1043,
    num_followers=64814
)
- sentence_similarity docstring by @tolgaakar in #3374
- ty quality by @hanouticelina in #3441
The following contributors have made changes to the library over the last release. Thank you!
- sentence_similarity docstring (#3374) (#3375)
This release includes two bug fixes:
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.35.2...v0.35.3
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.35.1...v0.35.2
New inference provider! 🔥
Z.ai is now officially an Inference Provider on the Hub. See full documentation here: https://huggingface.co/docs/inference-providers/providers/zai-org.
from huggingface_hub import InferenceClient
client = InferenceClient(provider="zai-org")
completion = client.chat.completions.create(
    model="zai-org/GLM-4.5",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print("\nThinking:")
print(completion.choices[0].message.reasoning_content)
print("\nOutput:")
print(completion.choices[0].message.content)
Thinking:
Okay, the user is asking about the capital of France. That's a pretty straightforward geography question.
Hmm, I wonder if this is just a casual inquiry or if they need it for something specific like homework or travel planning. The question is very basic though, so probably just general knowledge.
Paris is definitely the correct answer here. It's been the capital for centuries, since the Capetian dynasty made it the seat of power. Should I mention any historical context? Nah, the user didn't ask for details - just the capital.
I recall Paris is also France's largest city and major cultural hub. But again, extra info might be overkill unless they follow up. Better keep it simple and accurate.
The answer should be clear and direct: "Paris". No need to overcomplicate a simple fact. If they want more, they'll ask.
Output:
The capital of France is **Paris**.
Paris has been the political and cultural center of France for centuries, serving as the seat of government, the residence of the President (Élysée Palace), and home to iconic landmarks like the Eiffel Tower, the Louvre Museum, and Notre-Dame Cathedral. It is also France's largest city and a global hub for art, fashion, gastronomy, and history.
Misc:
- strict dataclasses https://github.com/huggingface/huggingface_hub/pull/3376
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.35.0...v0.35.1
In the v0.34.0 release, we announced Jobs, a new way to run compute on the Hugging Face Hub. In this new release, we are announcing Scheduled Jobs to run Jobs on a regular basis. Think "cron jobs running on GPU".
This comes with a fully-fledged CLI:
hf jobs scheduled run @hourly ubuntu echo hello world
hf jobs scheduled run "0 * * * *" ubuntu echo hello world
hf jobs scheduled ps -a
hf jobs scheduled inspect <id>
hf jobs scheduled delete <id>
hf jobs scheduled suspend <id>
hf jobs scheduled resume <id>
hf jobs scheduled uv run @weekly train.py
It is now possible to run a command with uv run:
hf jobs uv run --with lighteval -s HF_TOKEN lighteval endpoint inference-providers "model_name=openai/gpt-oss-20b,provider=groq" "lighteval|gsm8k|0|0"
hf jobs uv run by @lhoestq in #3303
Some other improvements have been added to the existing Jobs API for a better UX.
And finally, Jobs documentation has been updated with new examples (and some fixes):
In addition to the Scheduled Jobs, some improvements have been added to the hf CLI.
Two new partners have been integrated into Inference Providers: Scaleway and PublicAI! (as part of releases 0.34.5 and 0.34.6).
Image to video is now supported in the InferenceClient:
from huggingface_hub import InferenceClient
client = InferenceClient(provider="fal-ai")
video = client.image_to_video(
    "cat.png",
    prompt="The cat starts to dance",
    model="Wan-AI/Wan2.2-I2V-A14B",
)
The content-type header is now correctly set when sending an image or audio request (e.g. for the image-to-image task). It is inferred either from the filename or the URL provided by the user. If the user passes raw bytes directly, the content-type header has to be set manually.
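For example, when passing raw bytes you can set the header on the client yourself (a sketch; the provider and model choice are placeholders):
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="fal-ai",
    headers={"Content-Type": "image/png"},  # needed when sending raw bytes
)
with open("cat.png", "rb") as f:
    output = client.image_to_image(f.read(), prompt="Turn the cat into a tiger")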
A .reasoning field has been added to the Chat Completion output. This is used by some providers to return reasoning tokens separated from the .content stream of tokens.
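For instance (the model here is an illustrative placeholder; not all providers return reasoning tokens):
from huggingface_hub import InferenceClient

client = InferenceClient()
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
message = completion.choices[0].message
print(message.reasoning)  # reasoning tokens, if the provider returns them
print(message.content)    # the final answer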
tiny-agents now handles AGENTS.md instruction file (see https://agents.md/).
Tools filtering has also been improved to avoid loading non-relevant tools from an MCP server:
- HF_HUB_DISABLE_XET in the environment dump by @hanouticelina in #3290
- apps as a parameter to HfApi.list_models by @anirbanbasu in #3322
- ty type checker by @hanouticelina in #3294
- tycheck quality by @hanouticelina in #3320
- is_jsonable if circular reference by @Wauplin in #3348
The following contributors have made changes to the library over the last release. Thank you!
- apps as a parameter to HfApi.list_models (#3322)
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.34.5...v0.34.6
Tip: All supported PublicAI models can be found here.
Public AI Inference Utility is a nonprofit, open-source project building products and organizing advocacy to support the work of public AI model builders like the Swiss AI Initiative, AI Singapore, AI Sweden, and the Barcelona Supercomputing Center. Think of a BBC for AI, a public utility for AI, or public libraries for AI.
from huggingface_hub import InferenceClient
client = InferenceClient(provider="publicai")
completion = client.chat.completions.create(
    model="swiss-ai/Apertus-70B-Instruct-2509",
    messages=[{"role": "user", "content": "What is the capital of Switzerland?"}],
)
print(completion.choices[0].message.content)