[v0.28.0]: Third-party Inference Providers on the Hub & multiple quality of life improvements and bug fixes
The InferenceClient now supports third-party providers, offering a unified interface to run inference across multiple services while leveraging models from the Hugging Face Hub. This update lets developers switch between providers without changing their code.
A list of supported third-party providers can be found here.
Example of text-to-image inference with Replicate:
>>> from huggingface_hub import InferenceClient
>>> replicate_client = InferenceClient(
...     provider="replicate",
...     api_key="my_replicate_api_key",  # Using your personal Replicate key
... )
>>> image = replicate_client.text_to_image(
...     "A cyberpunk cat hacking neural networks",
...     model="black-forest-labs/FLUX.1-schnell",
... )
>>> image.save("cybercat.png")
Another example of chat completion with Together AI:
>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient(
... provider="together", # Use Together AI provider
... api_key="<together_api_key>", # Pass your Together API key directly
... )
>>> client.chat_completion(
... model="deepseek-ai/DeepSeek-R1",
... messages=[{"role": "user", "content": "How many r's are there in strawberry?"}],
... )
When using external providers, you can choose between two access modes: either use the provider's native API key, as shown in the examples above, or route calls through Hugging Face infrastructure (billed to your HF account):
>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient(
...     provider="fal-ai",
...     token="hf_****",  # Your Hugging Face token
... )
⚠️ Parameter availability may vary between providers; check each provider's documentation. 🔜 New providers, models, and tasks will be added iteratively. 👉 You can find a list of supported tasks per provider and more details here.
- [InferenceClient] Add third-party providers support by @hanouticelina in #2757
- Unified `prepare_request` method + class-based providers by @Wauplin in #2777
- [InferenceClient] Support proxy calls for 3rd party providers by @hanouticelina in #2781
- [InferenceClient] Add `text-to-video` task and update supported tasks and models by @hanouticelina in #2786
- Add type hints for providers by @Wauplin in #2788
- [InferenceClient] Update inference documentation by @hanouticelina in #2776
- Add text-to-video to supported tasks by @Wauplin in #2790
The following change aligns the client with server-side updates by adding the new repository properties `usedStorage` and `resourceGroup`.
- [HfApi] update list of repository properties following server side updates by @hanouticelina in #2728
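As a sketch, the new properties can be requested through the existing `expand` parameter of `HfApi.model_info()` (the repo id below is just an example; the client exposes `usedStorage` under the snake_case attribute name):

```python
from huggingface_hub import HfApi

api = HfApi()
# Request only the new "usedStorage" property for a public repo.
# The client maps it to the snake_case attribute `used_storage`
# (total bytes stored in the repo).
info = api.model_info("openai-community/gpt2", expand=["usedStorage"])
print(info.used_storage)
```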
Extends empty commit prevention to file copy operations, preserving clean version histories when no changes are made.
- [HfApi] prevent empty commits when copying files by @hanouticelina in #2730
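A minimal sketch of a file-copy commit (the repo id and paths are hypothetical); with this release, re-running it when the copy changes nothing no longer creates an empty commit:

```python
from huggingface_hub import CommitOperationCopy, HfApi

# Describe a server-side copy; building the operation needs no network.
op = CommitOperationCopy(
    src_path_in_repo="model.safetensors",
    path_in_repo="backup/model.safetensors",
)

# With a valid token, create_commit() applies the copy. If the target
# already matches the source, no empty commit is created anymore:
# api = HfApi(token="hf_****")
# api.create_commit(
#     repo_id="username/my-model",
#     operations=[op],
#     commit_message="Back up weights",
# )
print(op.src_path_in_repo, "->", op.path_in_repo)
```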
Thanks to @WizKnight, the Hindi translation is much better!
- Improved Hindi Translation in Documentation📝 by @WizKnight in #2697
The `like` endpoint has been removed to prevent misuse. You can still remove existing likes using the `unlike` endpoint.
- [HfApi] remove `like` endpoint by @hanouticelina in #2739
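For illustration, removing an existing like now looks like this (the repo id is hypothetical, and the call is commented out since it needs an authenticated token):

```python
from huggingface_hub import HfApi

api = HfApi()  # pass token="hf_****" for real, authenticated calls
# Programmatic liking is gone, but unliking an already-liked repo
# still works:
# api.unlike("username/some-model")
print(hasattr(api, "unlike"))
```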
- Mark `chat_completion()`'s `logit_bias` as UNUSED by @hanouticelina in #2724
- Fix `py.typed` to be compliant with PEP-561 again by @hanouticelina in #2752
- Fix `typing.get_type_hints` call on a `ModelHubMixin` by @aliberts in #2729
- Fix `CardData.get()` to respect default values when `None` by @hanouticelina in #2770
- Fix `RepoCard` test on Windows by @hanouticelina in #2774