Hybrid inference now GA; on-device with cloud fallback
Hybrid inference for web apps is now Generally Available (GA). Apps can run inference with an on-device model when one is available and fall back seamlessly to a cloud-hosted model otherwise. The release also includes the ability to determine which inference mode was used for a request.
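As a rough illustration of the fallback pattern described above, the sketch below prefers an on-device model, falls back to a cloud model, and reports which one served the request. Every name in it (`TextModel`, `InferenceSource`, `HybridResult`, `generateHybrid`) is hypothetical and chosen for this example; it is not the SDK's actual API, which may expose this behavior differently.

```typescript
// Hypothetical names throughout: an illustration of the pattern, not the SDK's API.
type InferenceSource = "on-device" | "cloud";

interface TextModel {
  // Reports whether this model can serve requests right now.
  isAvailable(): Promise<boolean>;
  // Produces a text response for the given prompt.
  generate(prompt: string): Promise<string>;
}

interface HybridResult {
  text: string;
  source: InferenceSource; // which inference mode actually served the request
}

async function generateHybrid(
  onDevice: TextModel,
  cloud: TextModel,
  prompt: string,
): Promise<HybridResult> {
  try {
    // Prefer the on-device model whenever it reports itself as available.
    if (await onDevice.isAvailable()) {
      return { text: await onDevice.generate(prompt), source: "on-device" };
    }
  } catch {
    // Assumption: an on-device failure is treated the same as "unavailable",
    // so the request falls through to the cloud model.
  }
  return { text: await cloud.generate(prompt), source: "cloud" };
}
```

A caller could inspect `result.source` after each request, for example to log how often the cloud fallback is used or to show a different status indicator in the UI.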
Fetched June 5, 2026


