A race condition in connection shutdown during a hot reload event occasionally left some connections in an active state instead of moving them to a terminating state. Over time, this could cause out-of-memory errors as multiple pipelines remained active.
Connections that open during shutdown now immediately terminate.
By @BrynCooke in https://github.com/apollographql/router/pull/8169
The ARM64 Docker images shipped for v2.6.0 incorrectly contained AMD64/x86 binaries due to a CI build pipeline bug. This has been remedied in v2.6.1.
_entities Apollo Error Metrics Missing Service Attribute (PR #8153)
The error counting feature introduced in v2.5.0 (PR #7712) caused a bug where _entities errors from subgraph fetches no longer included a service (subgraph or connector) attribute. As a result, the Apollo Studio UI incorrectly categorized these errors as originating from the router instead of their actual service.
This fix restores the missing service attribute.
By @rregitsky in https://github.com/apollographql/router/pull/8153
Fixed a regression introduced in v2.5.0, where WebSocket connections to subgraphs would remain open after all client subscriptions were closed. This could lead to unnecessary resource usage and connections not being properly cleaned up until a new event was received.
Previously, when clients disconnected from subscription streams, the router would correctly close client connections but would leave the underlying WebSocket connection to the subgraph open indefinitely in some cases.
By @bnjjj in https://github.com/apollographql/router/pull/8104
id field optional for WebSocket subscription connection_error messages (Issue #6138)
Fixed a Subscriptions over WebSocket issue where the router swallowed connection_error messages from subgraphs because it incorrectly required them to carry an id field. According to the graphql-transport-ws specification (one of the two transport specifications the router supports), connection_error messages only require a payload field, not an id field. The id field is now optional, which allows the underlying error message to propagate to clients when connection failures occur.
By @jeffutter in https://github.com/apollographql/router/pull/8189
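To illustrate the relaxed requirement, here is a minimal Python sketch of a parser that treats id as optional and payload as required for connection_error messages. The class and function names are hypothetical and are not the router's internal (Rust) types; only the field names come from the note above.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ConnectionErrorMessage:
    """Hypothetical container for a connection_error message."""
    payload: Any
    id: Optional[str] = None  # optional: may be absent on the wire


def parse_connection_error(raw: str) -> ConnectionErrorMessage:
    """Parse a connection_error message, requiring only the payload field."""
    msg = json.loads(raw)
    if msg.get("type") != "connection_error":
        raise ValueError("not a connection_error message")
    if "payload" not in msg:
        raise ValueError("connection_error requires a payload field")
    # A missing id is accepted and surfaces as None instead of failing the parse.
    return ConnectionErrorMessage(payload=msg["payload"], id=msg.get("id"))
```

With this shape, a message that lacks id still yields its payload, so the underlying error can be forwarded instead of being dropped.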
The Helm chart previously did not allow customization of annotations on the deployment itself (as opposed to the pods within it, which is done with podAnnotations); this can now be done with the deploymentAnnotations value.
By @glasser in https://github.com/apollographql/router/pull/8164
[!IMPORTANT] Due to a CI bug, our ARM64 Docker images published for v2.6.0 incorrectly contained AMD64/x86 artifacts. This is fixed in v2.6.1.
This change adds a new, experimental histogram to capture subgraph fetch duration for GraphOS. This will eventually be used to power subgraph-level insights in Apollo Studio.
This can be toggled on using a new boolean config flag:
telemetry:
  apollo:
    experimental_subgraph_metrics: true
The new instrument is only sent to GraphOS and is not available in 3rd-party OTel export targets. It is not currently
customizable. Users requiring a customizable alternative can use the existing http.client.request.duration
instrument, which measures the same value.
By @rregitsky in https://github.com/apollographql/router/pull/8013 and https://github.com/apollographql/router/pull/8045
The router now provides Redis cache monitoring with new metrics that help track performance, errors, and resource usage.
Connection and performance metrics:
- apollo.router.cache.redis.connections: Number of active Redis connections
- apollo.router.cache.redis.command_queue_length: Commands waiting to be sent to Redis; indicates whether Redis is keeping up with demand
- apollo.router.cache.redis.commands_executed: Total number of Redis commands executed
- apollo.router.cache.redis.redelivery_count: Commands retried due to connection issues
- apollo.router.cache.redis.errors: Redis errors by type, to help diagnose authentication, network, and configuration problems
Experimental performance metrics:
- experimental.apollo.router.cache.redis.network_latency_avg: Average network latency to Redis
- experimental.apollo.router.cache.redis.latency_avg: Average Redis command execution time
- experimental.apollo.router.cache.redis.request_size_avg: Average request payload size
- experimental.apollo.router.cache.redis.response_size_avg: Average response payload size
[!NOTE] The experimental metrics may change in future versions as we improve the underlying Redis client integration.
You can configure how often metrics are collected using the metrics_interval setting:
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["redis://localhost:6379"]
        ttl: "60s"
        metrics_interval: "1s" # Collect metrics every second (default: 1s)
By @BrynCooke in https://github.com/apollographql/router/pull/7920
The router license functionality now allows granular specification of features enabled to support current and future pricing plans.
By @DMallare in https://github.com/apollographql/router/pull/7917
This adds new custom instrument selectors for Connectors and enhances some existing selectors. The new selectors are:
- supergraph_operation_name
- supergraph_operation_kind: the supergraph operation kind (query, mutation, subscription)
- request_context
- connector_on_response_error: returns true when the response does not meet the is_successful condition. Or, if that condition is not set, returns true when the response has a non-200 status code
These selectors were modified to add additional functionality:
- connector_request_mapping_problems: adds a boolean variant that will return true when a mapping problem exists on the request
- connector_response_mapping_problems: adds a boolean variant that will return true when a mapping problem exists on the response
By @rregitsky in https://github.com/apollographql/router/pull/8045
This PR enables the jemalloc allocator on macOS by default, making it easier to do memory profiling. Previously, this was only done for Linux.
By @Velfi in https://github.com/apollographql/router/pull/8046
When the Subgraph Entity Caching feature is in use, it determines the Cache-Control HTTP response header sent to supergraph clients based on those received from subgraph servers.
In this process, Apollo Router only emits the max-age directive and not s-maxage.
This PR fixes a bug where, for a query that involved a single subgraph fetch that was not already cached, the subgraph response’s Cache-Control header would be forwarded as-is.
Instead, it now goes through the same algorithm as other cases.
By @SimonSapin in https://github.com/apollographql/router/pull/7987
The router now correctly generates query plans when using progressive override (@override with labels) on types that implement interfaces within the same subgraph. Previously, the Rust query planner would fail to generate plans for these scenarios with the error "Was not able to find any options for {}: This shouldn't have happened.", while the JavaScript planner handled them correctly.
This fix resolves planning failures when your schema uses:
The router will now successfully plan and execute queries that previously resulted in query planning errors.
By @TylerBloom in https://github.com/apollographql/router/pull/7929
The Multipart HTTP protocol for GraphQL Subscriptions distinguishes between GraphQL-level errors and fatal transport-level errors. The router previously used a heuristic to determine if a given error was fatal or not, which could sometimes cause errors to be wrongly classified. For example, if a subgraph returned a GraphQL-level error for a subscription and then immediately ended the subscription, the router might propagate this as a fatal transport-level error.
This is now fixed. Fatal transport-level errors are tagged as such when they are constructed, so the router can reliably know how to serialize errors when sending them to the client.
By @goto-bus-stop in https://github.com/apollographql/router/pull/7901
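The idea of tagging errors at construction time, instead of guessing later, can be sketched in Python as follows. This is an illustration, not the router's implementation: the SubscriptionError class and serialize_chunk function are hypothetical, and the message shapes (GraphQL-level errors nested under payload, fatal errors at the top level next to a null payload) are assumed from the multipart subscription protocol.

```python
import json
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class SubscriptionError:
    """Hypothetical error type; `fatal` is decided once, when the error is built."""
    message: str
    fatal: bool = False


def serialize_chunk(data: Any, errors: List[SubscriptionError]) -> str:
    """Serialize one multipart chunk body based on the explicit `fatal` tag."""
    fatal = [e for e in errors if e.fatal]
    if fatal:
        # Transport-level failure: no payload, errors at the top level.
        body = {"payload": None, "errors": [{"message": e.message} for e in fatal]}
    else:
        # GraphQL-level errors travel inside the payload like any GraphQL response.
        payload = {"data": data}
        if errors:
            payload["errors"] = [{"message": e.message} for e in errors]
        body = {"payload": payload}
    return json.dumps(body)
```

Because the serializer only reads the tag, a GraphQL-level error followed by an immediate end of stream is still emitted as a payload error.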
Now that we have a DockerHub account, we publish the Runtime Container to it. This change adds a reference to it in the documentation.
By @jonathanrainer in https://github.com/apollographql/router/pull/8054
Configuration can now specify different Cross-Origin Resource Sharing (CORS) rules for different origins using the cors.policies key. See the CORS documentation for details.
cors:
  policies:
    # The default CORS options work for Studio.
    - origins: ["https://studio.apollographql.com"]
    # Specific config for trusted origins
    - match_origins: ["^https://(dev|staging|www)?\\.my-app\\.(com|fr|tn)$"]
      allow_credentials: true
      allow_headers: ["content-type", "authorization", "x-web-version"]
    # Catch-all for untrusted origins
    - origins: ["*"]
      allow_credentials: false
      allow_headers: ["content-type"]
By @Velfi in https://github.com/apollographql/router/pull/7853
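As a rough mental model of how per-origin policies like the ones above could be resolved, the following Python sketch picks a policy for an incoming Origin header. It assumes that policies are checked in order and the first match wins, which is an assumption of this sketch and not confirmed router behavior; POLICIES and find_policy are hypothetical names.

```python
import re

# Hypothetical in-memory form of the cors.policies list from the example config.
POLICIES = [
    {"origins": ["https://studio.apollographql.com"]},
    {
        "match_origins": [r"^https://(dev|staging|www)?\.my-app\.(com|fr|tn)$"],
        "allow_credentials": True,
        "allow_headers": ["content-type", "authorization", "x-web-version"],
    },
    {"origins": ["*"], "allow_credentials": False, "allow_headers": ["content-type"]},
]


def find_policy(origin, policies=POLICIES):
    """Return the first policy that accepts `origin`, or None if none does.

    First-match-wins ordering is an assumption made for this illustration.
    """
    for policy in policies:
        exact = policy.get("origins", [])
        if origin in exact or "*" in exact:
            return policy
        if any(re.search(p, origin) for p in policy.get("match_origins", [])):
            return policy
    return None
```

Under these assumptions, https://www.my-app.fr resolves to the trusted policy with credentials allowed, while an unknown origin falls through to the catch-all entry.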
This PR adds the following new metrics when running the router on Linux with its default global-allocator feature:
munmap(2) or similar.
By @Velfi in https://github.com/apollographql/router/pull/7735
The router was creating invalid GraphQL responses internally, especially when subscriptions terminate. When a coprocessor is configured, it validates all responses for correctness, causing errors to be logged when the router generates invalid internal responses. This affects the reliability of subscription workflows with coprocessors.
This change fixes the handling of invalid GraphQL responses returned from coprocessors, particularly when used with subscriptions. It adds conditional response validation and improved tests to ensure correctness, and adds a response_validation configuration option at the coprocessor level to control response validation (enabled by default).
By @BrynCooke in https://github.com/apollographql/router/pull/7731
Fixes a regression introduced in v1.50.0. When multiple client subscriptions are deduped onto a single subgraph subscription in WebSocket passthrough mode, and the first client subscription closes, the Router would close the subgraph subscription. The other deduplicated subscriptions would then silently stop receiving events.
Now outgoing subscriptions to subgraphs are kept open as long as any client subscription uses them.
By @bnjjj in https://github.com/apollographql/router/pull/7879
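The corrected lifetime rule amounts to reference counting: the shared subgraph subscription stays open until its last client detaches. A minimal Python sketch of that rule follows; the SharedSubgraphSubscription class is hypothetical and only models the bookkeeping, not the router's actual WebSocket handling.

```python
class SharedSubgraphSubscription:
    """Hypothetical model of one subgraph subscription shared by deduplicated clients."""

    def __init__(self):
        self.clients = set()
        self.is_open = True

    def attach(self, client_id):
        """Register a client subscription that reuses this subgraph subscription."""
        if not self.is_open:
            raise RuntimeError("subgraph subscription already closed")
        self.clients.add(client_id)

    def detach(self, client_id):
        """Remove a client; close the subgraph side only when no clients remain."""
        self.clients.discard(client_id)
        if not self.clients:
            self.is_open = False
```

In this model, closing the first of two client subscriptions leaves is_open set to True, so the remaining client keeps receiving events.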
When a hot reload was triggered by a configuration change, the router attempted to apply the updated configuration to open subscriptions. This could cause excessive logging.
When a hot reload was triggered by a schema change, the router closed subscriptions with a SUBSCRIPTION_SCHEMA_RELOAD error. This happened before the new schema was fully active and warmed up, so clients could reconnect to the old schema, which should not happen.
To fix these issues, a configuration and a schema change now have the same behavior. The router waits for the new configuration and schema to be active, and then closes all subscriptions with a SUBSCRIPTION_SCHEMA_RELOAD/SUBSCRIPTION_CONFIG_RELOAD error, so clients can reconnect.
By @goto-bus-stop and @bnjjj in https://github.com/apollographql/router/pull/7777
When trying to remove non-UTF-8 headers from a Rhai plugin, users encountered an unhelpful error. Now, non-UTF-8 values are lossily converted to UTF-8 when accessed from Rhai. This change affects get, get_all, and remove operations.
By @Velfi in https://github.com/apollographql/router/pull/7801
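For readers unfamiliar with lossy conversion, the short Python sketch below shows the general technique: bytes that are not valid UTF-8 are replaced with U+FFFD instead of raising an error. It is an analogy in Python only; lossy_header_value is a hypothetical helper, and the exact replacement behavior inside the router is not specified in this note.

```python
def lossy_header_value(raw: bytes) -> str:
    """Decode header bytes as UTF-8, replacing invalid sequences with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


# Valid UTF-8 passes through unchanged.
print(lossy_header_value(b"application/json"))
# An invalid byte (0xE9 with no continuation bytes) becomes U+FFFD.
print(repr(lossy_header_value(b"caf\xe9")))
```

The trade-off is that the original bytes cannot be recovered from the converted string, which is why the conversion is called lossy.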
The router now correctly generates query plans when using progressive override (@override with labels) on types that implement interfaces within the same subgraph. Previously, the Rust query planner would fail to generate plans for these scenarios with the error "Was not able to find any options for {}: This shouldn't have happened.", while the JavaScript planner handled them correctly.
This fix resolves planning failures when your schema uses:
The router will now successfully plan and execute queries that previously resulted in query planning errors.
By @TylerBloom in https://github.com/apollographql/router/pull/7929
When the Persisted Queries feature is enabled, the router no longer hangs during startup when using a GraphOS account with no Persisted Queries manifest.
Remove @ from error paths (Issue #4548)
When a subgraph returns an unexpected response (i.e., not a body with at least one of errors or data), the errors surfaced by the router previously included an @ in the path, which indicates that the error applies to all elements in the array. This behavior is not defined in the GraphQL spec and is not easily parsed.
This fix expands the @ symbol to reflect all paths that the error applies to.
Consider a federated graph with two subgraphs, products and inventory, and a topProducts query which fetches a list of products from products and then fetches an inventory status for each product.
A successful response might look like:
{
  "data": {
    "topProducts": [
      {"name": "Table", "inStock": true},
      {"name": "Chair", "inStock": false}
    ]
  }
}
Prior to this change, if the inventory subgraph returns a malformed response, the router response would look like:
{
  "data": {"topProducts": [{"name": "Table", "inStock": null}, {"name": "Chair", "inStock": null}]},
  "errors": [
    {
      "message": "service 'inventory' response was malformed: graphql response without data must contain at least one error",
      "path": ["topProducts", "@"],
      "extensions": {"service": "inventory", "reason": "graphql response without data must contain at least one error", "code": "SUBREQUEST_MALFORMED_RESPONSE"}
    }
  ]
}
With this change, the response will look like:
{
  "data": {"topProducts": [{"name": "Table", "inStock": null}, {"name": "Chair", "inStock": null}]},
  "errors": [
    {
      "message": "service 'inventory' response was malformed: graphql response without data must contain at least one error",
      "path": ["topProducts", 0],
      "extensions": {"service": "inventory", "reason": "graphql response without data must contain at least one error", "code": "SUBREQUEST_MALFORMED_RESPONSE"}
    },
    {
      "message": "service 'inventory' response was malformed: graphql response without data must contain at least one error",
      "path": ["topProducts", 1],
      "extensions": {"service": "inventory", "reason": "graphql response without data must contain at least one error", "code": "SUBREQUEST_MALFORMED_RESPONSE"}
    }
  ]
}
The above examples reflect the behavior with include_subgraph_errors = true; if include_subgraph_errors is false:
{
  "data": {"topProducts": [{"name": "Table", "inStock": null}, {"name": "Chair", "inStock": null}]},
  "errors": [
    {
      "message": "Subgraph errors redacted",
      "path": ["topProducts", 0]
    },
    {
      "message": "Subgraph errors redacted",
      "path": ["topProducts", 1]
    }
  ]
}
By @carodewig in https://github.com/apollographql/router/pull/7684
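The expansion of @ into concrete indices can be sketched as a walk over the response data. The Python below is an illustration of the idea only, not the router's code: expand_path is a hypothetical helper that replaces each @ segment with every index present at that position in the data.

```python
def expand_path(data, path):
    """Expand '@' wildcards in an error path into concrete list indices.

    Returns one concrete path per element that the wildcard path covers.
    """
    # Each entry pairs the current node with the concrete path leading to it.
    nodes = [(data, [])]
    for segment in path:
        next_nodes = []
        for node, prefix in nodes:
            if segment == "@":
                if isinstance(node, list):
                    for index, child in enumerate(node):
                        next_nodes.append((child, prefix + [index]))
            elif isinstance(node, dict) and segment in node:
                next_nodes.append((node[segment], prefix + [segment]))
            elif isinstance(node, list) and isinstance(segment, int) and 0 <= segment < len(node):
                next_nodes.append((node[segment], prefix + [segment]))
        nodes = next_nodes
    return [concrete for _, concrete in nodes]


data = {"topProducts": [{"name": "Table", "inStock": None},
                        {"name": "Chair", "inStock": None}]}
print(expand_path(data, ["topProducts", "@"]))
# prints [['topProducts', 0], ['topProducts', 1]]
```

Each concrete path then gets its own copy of the error, which matches the two-error response shown in the example above.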
The APOLLO_TELEMETRY_DISABLED environment variable only disables anonymous telemetry; it was never meant to disable identifiable telemetry. This includes metrics from the fleet detection plugin.
By @DMallare in https://github.com/apollographql/router/pull/7907
This release is part of the LTS of Router v1.61. It is marked for End-of-Support on March 30, 2026.
Fixes a regression introduced in v1.50.0. When multiple client subscriptions are deduped onto a single subgraph subscription in WebSocket passthrough mode, and the first client subscription closes, the Router would close the subgraph subscription. The other deduplicated subscriptions would then silently stop receiving events.
Now outgoing subscriptions to subgraphs are kept open as long as any client subscription uses them.
By @bnjjj in https://github.com/apollographql/router/pull/7879
The router was creating invalid GraphQL responses internally, especially when subscriptions terminate. When a coprocessor is configured, it validates all responses for correctness, causing errors to be logged when the router generates invalid internal responses. This affects the reliability of subscription workflows with coprocessors.
This change fixes the handling of invalid GraphQL responses returned from coprocessors, particularly when used with subscriptions. It adds conditional response validation and improved tests to ensure correctness, and adds a response_validation configuration option at the coprocessor level to control response validation (enabled by default).
By @BrynCooke in https://github.com/apollographql/router/pull/7731
When a hot reload was triggered by a configuration change, the router attempted to apply the updated configuration to open subscriptions. This could cause excessive logging.
When a hot reload was triggered by a schema change, the router closed subscriptions with a SUBSCRIPTION_SCHEMA_RELOAD error. This happened before the new schema was fully active and warmed up, so clients could reconnect to the old schema, which should not happen.
To fix these issues, a configuration and a schema change now have the same behavior. The router waits for the new configuration and schema to be active, and then closes all subscriptions with a SUBSCRIPTION_SCHEMA_RELOAD/SUBSCRIPTION_CONFIG_RELOAD error, so clients can reconnect.
By @goto-bus-stop and @bnjjj in https://github.com/apollographql/router/pull/7777