This release is marked for LTS under the v2026.1 LTS Policy for the GraphOS Runtime. It will be supported until September 30, 2026 with patch updates.
Response caching is now Generally Available (GA) and ready for production use!
Response caching enables the router to cache subgraph query responses using Redis, improving query latency and reducing load on your underlying services. Unlike traditional HTTP caching solutions, response caching provides GraphQL-aware caching at the entity and root field level, making cached data reusable across different users and queries.
For complete documentation, configuration options, and quickstart guide, see the response caching documentation.
@cacheControl directives or Cache-Control response headers
@cacheTag and invalidate specific cache entries via HTTP endpoint when data changes
The router caches two kinds of data:
By @bnjjj in https://github.com/apollographql/router/pull/8678
Read-only queries are now sent to replica nodes when using clustered Redis. Previously, all commands were sent to the primary nodes.
This change applies to all Redis caches, including the query plan cache and the response cache.
By @carodewig in https://github.com/apollographql/router/pull/8405
The router's HTTP/2 header size limit configuration option now applies to requests using TCP and UDS (Unix domain sockets). Previously, this setting only worked for TLS connections.
By @aaronArinder in https://github.com/apollographql/router/pull/8673
Previously, the response cache invalidation endpoint was only enabled when global invalidation was enabled via response_cache.subgraph.all.invalidation.enabled. If you enabled invalidation for only specific subgraphs without enabling it globally, the invalidation endpoint wouldn't start, preventing cache invalidation requests from being processed.
The invalidation endpoint now starts if either:
response_cache.subgraph.all.invalidation.enabled: true), OR
This enables more flexible configuration where you can enable invalidation selectively for specific subgraphs:
response_cache:
  enabled: true
  invalidation:
    listen: 127.0.0.1:4000
    path: /invalidation
  subgraph:
    all:
      enabled: true
      # Global invalidation not enabled
    subgraphs:
      products:
        invalidation:
          enabled: true # Endpoint now starts
          shared_key:
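The start-up rule described above can be sketched in a few lines of Python. This is only an illustration of the "global OR any subgraph" condition: the function name and the dictionary layout (mirroring the YAML keys) are assumptions, not the router's internal code.

```python
def invalidation_endpoint_should_start(response_cache: dict) -> bool:
    """Return True if invalidation is enabled globally or for at least one subgraph.

    Hypothetical helper: the dict mirrors the YAML structure shown above.
    """
    subgraph_cfg = response_cache.get("subgraph", {})
    # Global switch: subgraph.all.invalidation.enabled
    global_enabled = (
        subgraph_cfg.get("all", {}).get("invalidation", {}).get("enabled", False)
    )
    # Per-subgraph switches: subgraph.subgraphs.<name>.invalidation.enabled
    per_subgraph_enabled = any(
        cfg.get("invalidation", {}).get("enabled", False)
        for cfg in subgraph_cfg.get("subgraphs", {}).values()
    )
    return bool(global_enabled or per_subgraph_enabled)


selective = {
    "enabled": True,
    "subgraph": {
        "all": {"enabled": True},  # global invalidation not enabled
        "subgraphs": {"products": {"invalidation": {"enabled": True}}},
    },
}
nothing_enabled = {"enabled": True, "subgraph": {"all": {"enabled": True}}}
```

With the selective configuration the helper returns True, which corresponds to the new behavior; with no invalidation enabled anywhere it returns False.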
By @bnjjj in https://github.com/apollographql/router/pull/8680
Previously, the router attempted to connect to Redis for response caching regardless of whether response caching was enabled or disabled. This caused unnecessary connection attempts and configuration errors even when the feature was explicitly disabled.
The router now ignores Redis configuration if response caching is disabled. If response caching is configured to be enabled, Redis configuration is required, and missing Redis configuration raises an error on startup:
Error: you must have a redis configured either for all subgraphs or for subgraph "products"
By @bnjjj in https://github.com/apollographql/router/pull/8684
Coprocessor context keys deleted in a previous stage no longer reappear in later stages.
By @rohan-b99 in https://github.com/apollographql/router/pull/8679
You can now customize cached responses using Rhai or coprocessors. You can also set a different private_id based on subgraph request headers.
Example Rhai script customizing private_id:
fn subgraph_service(service, subgraph) {
    service.map_request(|request| {
        if "private_id" in request.headers {
            request.context["private_id"] = request.headers["private_id"];
        }
    });
}
By @bnjjj in https://github.com/apollographql/router/pull/8652
The DIY Dockerfile now pins the Rust builder to the Bookworm variant (for example, rust:1.91.1-slim-bookworm) so the builder and runtime share the same Debian base. This prevents the image from failing at startup with /lib/x86_64-linux-gnu/libc.so.6: version 'GLIBC_2.39' not found.
This resolves a regression introduced when the rust:1.90.0 bump used a generic Rust image without specifying a Debian variant. The upstream Rust image default advanced to a newer variant with glibc 2.39, while the DIY runtime remained on Bookworm, creating a version mismatch.
By @theJC in https://github.com/apollographql/router/pull/8629
The apollo.router.operations.response_cache.fetch.error metric was out of sync with the apollo.router.cache.redis.errors metric because errors weren't being returned from the Redis client wrapper. The response caching plugin now increments the error metric as expected.
By @carodewig in https://github.com/apollographql/router/pull/8711
http.client.request.body.size metric correctly (PR #8712)
The histogram for http.client.request.body.size was using the SubgraphRequestHeader selector, looking for Content-Length before it had been set in on_request, so http.client.request.body.size wasn't recorded. The router now uses the on_response handler and stores the body size in the request context extensions.
By @rohan-b99 in https://github.com/apollographql/router/pull/8712
http.server.response.body.size metric correctly (PR #8697)
Previously, the http.server.response.body.size metric wasn't recorded because the router attempted to read from the Content-Length header before it had been set. The router now uses the size_hint of the body if it's exact.
By @rohan-b99 in https://github.com/apollographql/router/pull/8697
Interface objects can be entities, but response caching wasn't treating them that way. Interface objects are now respected as entities so they can be used as cache keys.
By @aaronArinder in https://github.com/apollographql/router/pull/8582
The router now validates propagator configuration and emits a warning log if:
By @rohan-b99 in https://github.com/apollographql/router/pull/8677
CORS configuration now supports private network access (PNA). Enable PNA for a CORS policy by specifying the private_network_access field, which supports two optional subfields: access_id and access_name.
Example configuration:
cors:
  policies:
    - origins: ["https://studio.apollographql.com"]
      private_network_access:
        access_id:
    - match_origins: ["^https://(dev|staging|www)?\\.my-app\\.(com|fr|tn)$"]
      private_network_access:
        access_id: "01:23:45:67:89:0A"
        access_name: "mega-corp device"
By @TylerBloom in https://github.com/apollographql/router/pull/8279
The router now supports configuring the maximum size for HTTP/2 header lists via the limits.http2_max_headers_list_bytes setting. This protects against excessive resource usage from clients sending large sets of HTTP/2 headers.
The default remains 16KiB. When a client sends a request with HTTP/2 headers whose total size exceeds the configured limit, the router rejects the request with a 431 (Request Header Fields Too Large) status code.
Example configuration:
limits:
  http2_max_headers_list_bytes: "48KiB"
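To get a feel for what counts toward the limit, the sketch below sizes a header list the way the HTTP/2 specification defines SETTINGS_MAX_HEADER_LIST_SIZE: name length plus value length plus 32 bytes of overhead per field, measured uncompressed. Whether the router measures requests in exactly this way is an assumption, and the function names are hypothetical.

```python
HEADER_ENTRY_OVERHEAD = 32       # per-field overhead from the HTTP/2 spec (section 6.5.2)
DEFAULT_LIMIT_BYTES = 16 * 1024  # the 16KiB default mentioned above


def header_list_size(headers):
    """Uncompressed size of a header list: name + value + 32 bytes per field."""
    return sum(
        len(name.encode()) + len(value.encode()) + HEADER_ENTRY_OVERHEAD
        for name, value in headers
    )


def status_for(headers, limit=DEFAULT_LIMIT_BYTES):
    """Return 431 when the header list exceeds the limit, else 200 (sketch only)."""
    return 431 if header_list_size(headers) > limit else 200


small_request = [(":method", "POST"), ("content-type", "application/json")]
large_request = small_request + [("cookie", "x" * 20_000)]
```

In this sketch, the large request is rejected under the default limit but accepted once the limit is raised to 48KiB (`48 * 1024`).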
By @aaronArinder in https://github.com/apollographql/router/pull/8636
The response cache key can now be customized per subgraph using the apollo::response_cache::key context entry. The new subgraphs field enables defining separate cache keys for individual subgraphs.
Subgraph-specific data takes precedence over data in the all field; the router doesn't merge them. To set common data when providing subgraph-specific data, add it to the subgraph-specific section.
Example payload:
{
  "all": 1,
  "subgraph_operation1": "key1",
  "subgraph_operation2": {
    "data": "key2"
  },
  "subgraphs": {
    "my_subgraph": {
      "locale": "be"
    }
  }
}
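The precedence rule (subgraph-specific data replaces the all data rather than merging with it) can be illustrated with a small lookup. This is a sketch under the assumption that the payload is read as plain JSON; `key_data_for` is a hypothetical helper, not a router API.

```python
def key_data_for(payload: dict, subgraph_name: str):
    """Pick the cache-key data for one subgraph.

    Subgraph-specific data wins outright; there is no merge with "all".
    """
    subgraphs = payload.get("subgraphs", {})
    if subgraph_name in subgraphs:
        return subgraphs[subgraph_name]
    return payload.get("all")


payload = {
    "all": 1,
    "subgraphs": {"my_subgraph": {"locale": "be"}},
}
```

Here `key_data_for(payload, "my_subgraph")` yields only `{"locale": "be"}`, while any other subgraph falls back to the `all` value. This is why common data has to be repeated in the subgraph-specific section.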
By @bnjjj in https://github.com/apollographql/router/pull/8543
The new response_cache_control selector enables telemetry metrics based on the computed Cache-Control header from subgraph responses.
Example configuration:
telemetry:
  exporters:
    metrics:
      common:
        service_name: apollo-router
        views:
          - name: subgraph.response.cache_control.max_age
            aggregation:
              histogram:
                buckets:
                  - 10
                  - 100
                  - 1000
                  - 10000
                  - 100000
  instrumentation:
    instruments:
      subgraph:
        subgraph.response.cache_control.max_age:
          value:
            response_cache_control: max_age
          type: histogram
          unit: s
          description: A histogram of the computed TTL for a subgraph response
By @bnjjj in https://github.com/apollographql/router/pull/8524
_redacted suffix from event attributes in apollo.router.state.change.total metric (Issue #8464)
Event names in the apollo.router.state.change.total metric no longer include the _redacted suffix. The metric now uses the Display trait instead of Debug for event names, changing values like updateconfiguration_redacted to updateconfiguration in APM platforms.
The custom behavior for UpdateLicense events is retained: the license state name is still appended.
By @rohan-b99 in https://github.com/apollographql/router/pull/8464
The router now uses the Content-Length header for GraphQL responses with known content lengths instead of transfer-encoding: chunked. Previously, the fleet_detector plugin destroyed HTTP body size hints when collecting metrics.
This extends the fix from #6538, which preserved size hints for router → subgraph requests, to also cover client → router requests and responses. Size hints now flow correctly through the entire pipeline for optimal HTTP header selection.
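The header choice described above can be sketched as follows. The tuple models a body size hint as a lower bound and an optional upper bound (an assumption modeled on Rust HTTP body size hints); the function name is hypothetical and this is not the router's implementation.

```python
from typing import Dict, Optional, Tuple


def framing_headers(size_hint: Tuple[int, Optional[int]]) -> Dict[str, str]:
    """Choose Content-Length when the body size is known exactly, else chunked."""
    lower, upper = size_hint
    if upper is not None and lower == upper:
        # Exact size: the hint survived the pipeline, so Content-Length can be set.
        return {"content-length": str(upper)}
    # Unknown or inexact size: fall back to chunked transfer encoding.
    return {"transfer-encoding": "chunked"}
```

If any layer in the pipeline replaces an exact hint with an unknown one, as the fleet_detector plugin previously did, the second branch is taken even for fully buffered responses.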
By @morriswchris in https://github.com/apollographql/router/pull/7977
apollo.router.operations.subscriptions.events metric counting (PR #8483)
The apollo.router.operations.subscriptions.events metric now increments correctly for each subscription event (excluding ping/pong/close messages). The counter call has been moved into the stream to trigger on each event.
This change also removes custom pong response handling before connection acknowledgment, which previously caused duplicate pongs because the WebSocket implementation already handles pings by default.
By @rohan-b99 in https://github.com/apollographql/router/pull/8483
Tokio- and Redis-based timeouts now use the same timeout code in apollo.router.operations.response_cache.*.error metrics. Previously, they were inadvertently given different code values.
By @carodewig in https://github.com/apollographql/router/pull/8515
The ttl parameter under redis configuration had no effect and is removed. Configure TTL at the subgraph level to control cache entry expiration:
preview_response_cache:
  enabled: true
  subgraph:
    all:
      enabled: true
      ttl: 10m # ✅ Configure TTL here
      redis:
        urls: [ "redis://..." ]
        # ❌ ttl was here previously (unused)
By @carodewig in https://github.com/apollographql/router/pull/8513
The telemetry selectors documentation now correctly reflects the active_subgraph_requests attribute.
By @faisalwaseem in https://github.com/apollographql/router/pull/8530
The FAQ now includes information about supported Redis versions and Redis key eviction setup.
By @carodewig in https://github.com/apollographql/router/pull/8624
@key fields for entity caching (PR #8367)
Entity caching now supports arrays (including arrays of objects and scalars) in complex @key fields when resolving entities by key. This improves entity matching when using complex @key fields as primary cache keys.
By @aaronArinder, @bnjjj, and @duckki in https://github.com/apollographql/router/pull/8367
The router now correctly parses scientific notation (like 1.5e10) in Rhai scripts and JSON operations. Previously, the Rhai scripting engine failed to parse these numeric formats, causing runtime errors when your scripts processed data containing exponential notation.
This fix upgrades Rhai from 1.21.0 to 1.23.6, resolving the parsing issue and ensuring your scripts handle scientific notation seamlessly.
By @BrynCooke in https://github.com/apollographql/router/pull/8528
@cacheTag directive format (PR #8496)
Composition validation no longer raises an error when using enum types in the @cacheTag directive's format argument. Previously, only scalar types were accepted.
Example:
type Query {
  testByCountry(id: ID!, country: Country!): Test
    @cacheTag(format: "test-{.id}-{.country}")
}
By @bnjjj in https://github.com/apollographql/router/pull/8496
Debugging data now includes a flag that indicates to Apollo Sandbox whether the data should be cached, preventing unnecessary local computation. This update also includes improved warnings.
By @bnjjj in https://github.com/apollographql/router/pull/8459
The debugger now displays cache tags generated from subgraph responses (in extensions). For performance reasons, these generated cache tags are only displayed when the data has been cached in debug mode.
By @bnjjj in https://github.com/apollographql/router/pull/8531
The router telemetry documentation now clarifies that OpenTelemetry's "Recommended" attributes from their development-status GraphQL semantic conventions are experimental and still evolving. Apollo recommends using required attributes instead of recommended attributes because of high cardinality, security, and performance risks with attributes like graphql.document.
Learn more in Router Telemetry.
By @abernix
The record/replay plugin no longer panics when externalizing headers with invalid UTF-8 values. Instead, the plugin writes the header keys and errors to a header_errors object for both requests and responses.
By @rohan-b99 in https://github.com/apollographql/router/pull/8485
This release is part of the LTS of Router v1.61. It is marked for End-of-Support on March 30, 2026.
[!NOTE] For more information on the impact of the fixes in this release and how your deployment might be affected or remediated, see the corresponding GitHub Security Advisory (GHSA) linked on the entries below. In both listed cases, updating to a patched Router version will resolve any vulnerabilities.
Updates the auth plugin to correctly handle access control requirements when processing polymorphic types.
When querying interface types/fields, the auth plugin was verifying only whether all implementations shared the same access control requirements. In cases where interface types/fields did not specify the same access control requirements as the implementations, this could result in unauthorized access to protected data.
The auth plugin was updated to correctly verify that all polymorphic access control requirements are satisfied by the current context.
See GHSA-x33c-7c2v-mrj9 for additional details and the associated CVE number.
By @dariuszkuc
The router auth plugin did not properly handle access control requirements when subgraphs renamed their access control directives through imports. When such renames occurred, the plugin’s @link-processing code ignored the imported directives entirely, causing access control constraints defined by the renamed directives to be ignored.
The plugin code was updated to call the appropriate functionality in the apollo-federation crate, which correctly handles both spec and imports directive renames.
See GHSA-g8jh-vg5j-4h3f for additional details and the associated CVE number.
By @sachindshinde
This release is part of the LTS of Router v1.61. It is marked for End-of-Support on March 30, 2026.
The router adds the following new metrics when running the router on Linux with its default global-allocator feature:
apollo_router_jemalloc_active: Total number of bytes in active pages allocated by the application.
apollo_router_jemalloc_allocated: Total number of bytes allocated by the application.
apollo_router_jemalloc_mapped: Total number of bytes in active extents mapped by the allocator.
apollo_router_jemalloc_metadata: Total number of bytes dedicated to metadata, which comprise base allocations used for bootstrap-sensitive allocator metadata structures and internal allocations.
apollo_router_jemalloc_resident: Maximum number of bytes in physically resident data pages mapped by the allocator, comprising all pages dedicated to allocator metadata, pages backing active allocations, and unused dirty pages.
apollo_router_jemalloc_retained: Total number of bytes in virtual memory mappings that were retained rather than being returned to the operating system via e.g. munmap(2) or similar.
By @Velfi in https://github.com/apollographql/router/pull/7735
The router now correctly generates query plans when using progressive override (@override with labels) on types that implement interfaces within the same subgraph.
Previously, the Rust query planner would fail to generate plans for these scenarios with the error "Was not able to find any options for {}: This shouldn't have happened.", while the JavaScript planner handled them correctly.
This fix resolves planning failures when your schema uses:
These will now successfully plan and execute.
By @TylerBloom in https://github.com/apollographql/router/pull/7929
WebSocket connections to subgraphs now close properly when all client subscriptions end, preventing unnecessary resource usage.
Previously, connections could remain open after clients disconnected, not being cleaned up until a new event was received. The router now tracks active subscriptions and closes the subgraph connection when the last client disconnects, ensuring efficient resource management.
By @bnjjj in https://github.com/apollographql/router/pull/8104
The router now logs interrupted WebSocket streams at trace level instead of error level.
Previously, WebSocket stream interruptions logged at error level, creating excessive noise in logs when clients disconnected normally or networks experienced transient issues. Client disconnections and network interruptions are expected operational events that don't require immediate attention.
Your logs will now be cleaner and more actionable, making genuine errors easier to spot. You can enable trace level logging when debugging WebSocket connection issues.
By @bnjjj in https://github.com/apollographql/router/pull/8344
The router now reliably terminates all connections during hot reload, preventing out-of-memory errors from multiple active pipelines.
A race condition during hot reload occasionally left connections in an active state instead of terminating. Connections that are opening during shutdown now immediately terminate, maintaining stable memory usage through hot reloads.
By @BrynCooke in https://github.com/apollographql/router/pull/8169
Available on all GraphOS plans including Free, Developer, Standard and Enterprise.
Response caching enables the router to cache GraphQL subgraph origin responses using Redis, delivering performance improvements by reducing subgraph load and query latency. Unlike traditional HTTP caching or client-side caching, response caching works at the GraphQL entity level, caching reusable portions of query responses that can be shared across different operations and users.
Response caching caches two types of data:
Benefits include:
Cache-Control headers from subgraph origins
Response caching solves traditional GraphQL caching challenges including mixed TTL requirements across a single response, personalized versus public data mixing, and high data duplication.
Configure response caching using the preview_response_cache configuration option with Redis as the cache backend. For complete setup instructions and advanced configuration, see the Response Caching documentation.
Migration from entity caching: For existing entity caching users, migration is as simple as renaming configuration options. For migration details see the Response Caching FAQ.
You can now configure different coprocessor URLs for each stage of request/response processing (router, supergraph, execution, subgraph). Each stage can specify its own url field that overrides the global default URL.
Changes:
url field to all stage configuration structs
as_service methods to accept and resolve URLs
This change maintains full backward compatibility: existing configurations with a single global URL continue to work unchanged.
By @cgati in https://github.com/apollographql/router/pull/8384
The router now automatically converts duration measurements to match the configured unit for telemetry instruments.
Previously, duration instruments always recorded values in seconds regardless of the configured unit field.
When you specify units like "ms" (milliseconds), "us" (microseconds), or "ns" (nanoseconds), the router automatically converts the measured duration to the appropriate scale.
Supported units:
"s" - seconds (default)
"ms" - milliseconds
"us" - microseconds
"ns" - nanoseconds
[!NOTE] Use this feature only when you need to integrate with an observability platform that doesn't properly translate from source time units to target time units (for example, seconds to milliseconds). In all other cases, follow the OTLP convention that you "SHOULD" use seconds as the unit.
Example:
telemetry:
  instrumentation:
    instruments:
      subgraph:
        acme.request.duration:
          value: duration
          type: histogram
          unit: ms # Values are now automatically converted to milliseconds
          description: "Metric to get the request duration in milliseconds"
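The conversion itself is plain scaling from seconds. A minimal Python sketch of the arithmetic, using the four supported unit strings (the function name is hypothetical, and raising on an unknown unit is this sketch's choice rather than documented router behavior):

```python
# Multipliers from seconds to each supported unit string.
UNIT_FACTORS = {"s": 1, "ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}


def convert_duration(seconds: float, unit: str = "s") -> float:
    """Scale a duration measured in seconds to the configured unit."""
    # An unsupported unit raises KeyError here; that is a choice made for the sketch.
    return seconds * UNIT_FACTORS[unit]
```

For example, a 0.25 second subgraph request is recorded as 250 when the instrument's unit is "ms", where previously it would have been recorded as 0.25 regardless of the unit.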
By @theJC in https://github.com/apollographql/router/pull/8415
All subgraph responses are checked and corrected to ensure alignment with the schema and query. When a misaligned value is returned, it's nullified. When enabled, errors for this nullification are now included in the errors array in the response.
Enable this feature in your router configuration:
supergraph:
  enable_result_coercion_errors: true
When enabled, the router generates validation errors with the code RESPONSE_VALIDATION_FAILED for any values that don't match the expected GraphQL type. These errors include the specific path and reason for the validation failure, helping you identify data inconsistencies between your subgraphs and schema.
While this feature improves GraphQL correctness, clients may encounter errors in responses where they previously did not, which may require consideration based on your specific usage patterns.
By @TylerBloom in https://github.com/apollographql/router/pull/8441
The apollo.router.overhead histogram provides a direct measurement of router processing overhead. This metric tracks the time the router spends on tasks other than waiting for downstream HTTP requests, including GraphQL parsing, validation, query planning, response composition, and plugin execution.
The overhead calculation excludes time spent waiting for downstream HTTP services (subgraphs and connectors), giving you visibility into the router's actual processing time versus downstream latency. This metric helps identify when the router itself is a bottleneck versus when delays are caused by downstream services.
Note: Coprocessor request time is currently included in the overhead calculation. In a future release, coprocessor time may be excluded similar to subgraphs and connectors.
telemetry:
  instrumentation:
    instruments:
      router:
        apollo.router.overhead: true
[!NOTE] Interpreting this metric is nuanced, and there is a risk of misinterpretation. See the full documentation for this metric to understand how to use it.
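One way to picture "total time minus downstream waiting" is to subtract the union of downstream request intervals from the request duration, so that concurrent subgraph calls are not double-counted. The sketch below does exactly that; it is an illustrative model only, and how the router actually accounts for overlapping requests is an assumption here.

```python
def overhead(request_start, request_end, downstream_spans):
    """Request duration minus the union of downstream wait intervals.

    downstream_spans is a list of (start, end) pairs in the same time unit.
    """
    waiting = 0
    cursor = request_start  # everything before the cursor is already accounted for
    for start, end in sorted(downstream_spans):
        start = max(start, cursor)   # skip any part that overlaps an earlier span
        end = min(end, request_end)  # ignore anything past the end of the request
        if end > start:
            waiting += end - start
            cursor = end
    return (request_end - request_start) - waiting
```

With a 100 ms request and two overlapping subgraph calls spanning 10 to 50 ms and 20 to 60 ms, the router is waiting for 50 ms in total, which leaves 50 ms attributed to overhead in this model.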
By @BrynCooke in https://github.com/apollographql/router/pull/8455
Error messages for malformed Trace IDs now include the invalid value to help with debugging. Previously, when the router received an unparseable Trace ID in incoming requests, error logs only indicated that the Trace ID was invalid without showing the actual value.
Trace IDs can be unparseable due to invalid hexadecimal characters, incorrect length, or non-standard formats. Including the invalid value in error logs makes it easier to diagnose and resolve tracing configuration issues.
By @juancarlosjr97 in https://github.com/apollographql/router/pull/8149
The router can now rename instruments via OpenTelemetry views. Details on how to use this feature can be found in the docs.
Benefits:
By @theJC in https://github.com/apollographql/router/pull/8412
Previously, schema or config reloads would always reload telemetry, dropping existing exporters and creating new ones.
Telemetry exporters are now only recreated when relevant configuration has changed.
By @BrynCooke in https://github.com/apollographql/router/pull/8328
The apollo.router.cache.redis.connections metric has been removed and replaced with the apollo.router.cache.redis.clients metric.
The connections metric was implemented with an up-down counter that would sometimes not be collected properly (it could go negative). The name connections was also inaccurate since Redis clients each make multiple connections, one to each node in the Redis pool (if in clustered mode).
The new clients metric counts the number of clients across the router via an AtomicU64 and surfaces that value in a gauge.
[!NOTE] The old metric included a kind attribute to reflect the number of clients in each pool (for example, entity caching, query planning). The new metric doesn't include this attribute; the purpose of the metric is to ensure the number of clients isn't growing unbounded (#7319).
By @carodewig in https://github.com/apollographql/router/pull/8161
When the Age header is higher than the max-age directive in Cache-Control, the router no longer caches the data because it's already expired.
For example, with these headers:
Cache-Control: max-age=5
Age: 90
The data won't be cached since Age (90) exceeds max-age (5).
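The comparison can be expressed as a small check. This Python sketch parses max-age from a Cache-Control value and compares it with Age; the function names are hypothetical, and treating a missing max-age as "not known to be expired" is a choice made for the sketch rather than documented router behavior.

```python
def parse_max_age(cache_control: str):
    """Return the max-age value in seconds, or None if the directive is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_expired_on_arrival(cache_control: str, age: str) -> bool:
    """True when Age already exceeds max-age, so the response should not be cached."""
    max_age = parse_max_age(cache_control)
    if max_age is None:
        return False  # nothing to compare against in this sketch
    return int(age) > max_age
```

For the headers above, `is_expired_on_arrival("max-age=5", "90")` is True, so the response is skipped.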
By @bnjjj in https://github.com/apollographql/router/pull/8456
File watch events during an existing hot reload no longer spam the logs. Hot reload continues as usual after the existing reload finishes.
By @goto-bus-stop in https://github.com/apollographql/router/pull/8336
@shareable mutation fields (PR #8352)
Query planning for a mutation operation that executes a @shareable mutation field at the top level could unexpectedly error when attempting to generate a plan in which that mutation field is called more than once across multiple subgraphs. Query planning now avoids generating such plans.
By @sachindshinde in https://github.com/apollographql/router/pull/8352
UpDownCounters now use RAII guards instead of manual incrementing and decrementing, ensuring they're always decremented when dropped.
This fix resolves drift in apollo.router.opened.subscriptions that occurred due to manual incrementing and decrementing.
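The guard pattern ties the decrement to scope exit so it cannot be forgotten on an early return or error path. Python has no deterministic destructors like Rust's Drop, so the sketch below swaps in a context manager as the closest analogue; the class and function names are hypothetical.

```python
from contextlib import contextmanager


class UpDownCounter:
    """Minimal stand-in for a metrics up-down counter."""

    def __init__(self):
        self.value = 0

    def add(self, delta):
        self.value += delta


@contextmanager
def counted(counter):
    """Increment on entry and always decrement on exit, even if the body raises."""
    counter.add(1)
    try:
        yield
    finally:
        counter.add(-1)


subscriptions = UpDownCounter()
with counted(subscriptions):
    inside = subscriptions.value  # the guard is active here
after = subscriptions.value       # the guard has been released
```

With manual increments and decrements, any path that skips the decrement leaves the counter permanently too high, which is the kind of drift described above.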
By @BrynCooke in https://github.com/apollographql/router/pull/8379
Rhai scripts that short-circuit the pipeline by throwing now only log an error if a response body isn't present.
For example, the following does NOT log an error:
throw #{
    status: 403,
    body: #{
        errors: [#{
            message: "Custom error with body",
            extensions: #{
                code: "FORBIDDEN"
            }
        }]
    }
};
By contrast, the following DOES log an error:
throw "An error occurred without a body";
By @BrynCooke in https://github.com/apollographql/router/pull/8364
@requires subgraph jump fetches @key from wrong subgraph (PR #8016)
During query planning, a subgraph jump added due to a @requires field may sometimes try to collect the necessary @key fields from an upstream subgraph fetch as an optimization, but it wasn't properly checking whether that subgraph had those fields. This is now fixed and resolves query planning errors with messages like "Cannot add selection of field T.id to selection set of parent type T".
By @sachindshinde in https://github.com/apollographql/router/pull/8016
The existing insert code would silently fail when trying to insert multiple values that correspond to different Redis cluster hash slots. This change corrects that behavior, raises errors when inserts fail, and adds new metrics to track Redis client health.
New metrics:
apollo.router.cache.redis.unresponsive: counter for 'unresponsive' events raised by the Redis library
  kind: Redis cache purpose (APQ, query planner, entity)
  server: Redis server that became unresponsive
apollo.router.cache.redis.reconnection: counter for 'reconnect' events raised by the Redis library
  kind: Redis cache purpose (APQ, query planner, entity)
  server: Redis server that required client reconnection
By @carodewig in https://github.com/apollographql/router/pull/8185
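For background on why multi-key inserts can fail in clustered mode: Redis Cluster maps every key to one of 16384 hash slots using CRC16, and a multi-key command is rejected when its keys fall into different slots. The sketch below computes the slot, including the hash tag rule, and groups keys by slot. Batching per slot is one possible way to avoid the failure and is not necessarily what the router does; the function names are hypothetical.

```python
import binascii
from collections import defaultdict


def key_hash_slot(key: bytes) -> int:
    """Redis Cluster slot: CRC16 (XMODEM variant) of the key, modulo 16384."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # A non-empty {...} section is a hash tag: only that part is hashed.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % 16384


def group_by_slot(keys):
    """Group keys so that each batch targets a single hash slot."""
    groups = defaultdict(list)
    for key in keys:
        groups[key_hash_slot(key)].append(key)
    return dict(groups)
```

Keys that share a hash tag, such as `{user1}:a` and `{user1}:b`, always land in the same slot, so they can be written together in one command.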
A regression introduced in v2.5.0 caused query planner construction to unnecessarily precompute metadata, leading to increased CPU and memory utilization during supergraph loading. Query planner construction now correctly avoids this unnecessary precomputation.
By @sachindshinde in https://github.com/apollographql/router/pull/8373
[!IMPORTANT] If you have enabled Entity caching, this release contains changes that necessarily alter the hashing algorithm used for the cache keys. You should anticipate additional cache regeneration cost when updating between these versions while the new hashing algorithm comes into service.
The entity cache key version has been bumped to avoid keeping invalid cached data for too long (fixed in #8456).
By @bnjjj in https://github.com/apollographql/router/pull/8458
http_client headers (PR #8349)
A new telemetry instrumentation configuration for http_client spans allows request headers added by Rhai scripts to be attached to the http_client span. The some_rhai_response_header value remains available on the subgraph span as before.
telemetry:
  instrumentation:
    spans:
      mode: spec_compliant
      subgraph:
        attributes:
          http.response.header.some_rhai_response_header:
            subgraph_response_header: "some_rhai_response_header"
      http_client:
        attributes:
          http.request.header.some_rhai_request_header:
            request_header: "some_rhai_request_header"
By @bonnici in https://github.com/apollographql/router/pull/8349
The subgraph_metrics config flag that powers the Studio Subgraph Insights feature is now promoted from preview to general availability.
The flag name has been updated from preview_subgraph_metrics to
telemetry:
  apollo:
    subgraph_metrics: true
By @david_castaneda in https://github.com/apollographql/router/pull/8392
Error messages raised during tracing and metric exports now indicate whether the error occurred when exporting to Apollo Studio or to your configured OTLP or Zipkin endpoint. For example, errors that occur when exporting Apollo Studio traces look like:
OpenTelemetry trace error occurred: [apollo traces] <etc>
while errors that occur when exporting traces to your configured OTLP endpoint look like:
OpenTelemetry trace error occurred: [otlp traces] <etc>
By @bonnici in https://github.com/apollographql/router/pull/8363
MCP's default port has changed from 5000 to 8000.
Two new deployment guides are now available for popular hosting platforms: Render and Railway.
By @the-gigi-apollo in https://github.com/apollographql/router/pull/8242
The documentation now includes a comprehensive reference for all context keys the router supports.
By @faisalwaseem in https://github.com/apollographql/router/pull/8420
Restructured the router observability and telemetry documentation to improve content discoverability and user experience. GraphOS insights documentation and router OpenTelemetry telemetry documentation are now in separate sections, with APM-specific documentation organized in dedicated folders for each APM provider (Datadog, Dynatrace, Jaeger, Prometheus, New Relic, Zipkin). This reorganization makes it easier for users to find relevant monitoring and observability configuration for their specific APM tools.
By @Robert113289 in https://github.com/apollographql/router/pull/8183
The Datadog APM guide has been expanded to cover the OpenTelemetry Collector, recommended router telemetry configuration, and out-of-the-box dashboard templates.
By @Robert113289 in https://github.com/apollographql/router/pull/8319
The documentation now states more clearly that subgraph timeouts should not be higher than the router timeout. Otherwise, the router timeout fires before the subgraph timeout does.
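As a minimal sketch of this guidance, assuming the router's traffic_shaping timeout settings, a configuration that keeps subgraph timeouts below the router timeout might look like the following. The durations and the products subgraph name are illustrative assumptions, not recommended values.

```yaml
traffic_shaping:
  router:
    timeout: 30s      # overall deadline for the client request (illustrative value)
  all:
    timeout: 20s      # default subgraph timeout, kept below the router timeout
  subgraphs:
    products:         # hypothetical subgraph name
      timeout: 10s    # per-subgraph override, also below the router timeout
```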
By @abernix in https://github.com/apollographql/router/pull/8203
ResponseErrors selector to router response (PR #7882)
The ResponseErrors selector in telemetry configurations captures router response errors, enabling you to log errors encountered at the router service layer. This selector enhances logging by allowing you to log only router errors instead of the entire router response body, reducing noise in your telemetry data.
telemetry:
  instrumentation:
    events:
      router:
        router.error:
          attributes:
            "my_attribute":
              response_errors: "$.[0]"
              # Examples: "$.[0].message", "$.[0].locations", "$.[0].extensions", etc.
By @Aguilarjaf in https://github.com/apollographql/router/pull/7882
_entities Apollo error metrics missing service attribute (PR #8153)
The error counting feature introduced in v2.5.0 caused _entities errors from subgraph fetches to no longer report a service (subgraph or connector) attribute. This incorrectly categorized these errors as originating from the router instead of their actual service in Apollo Studio.
The service attribute is now correctly included for _entities errors.
By @rregitsky in https://github.com/apollographql/router/pull/8153
A regression introduced in v2.5.0 caused WebSocket connections to subgraphs to remain open after all client subscriptions ended. This led to unnecessary resource usage and connections not being cleaned up until a new event was received.
The router now correctly closes WebSocket connections to subgraphs when clients disconnect from subscription streams.
By @bnjjj in https://github.com/apollographql/router/pull/8104
When using OTLP metrics export with delta temporality configured, UpDown counters could exhibit drift issues where counter values became inaccurate over time. This occurred because UpDown counters were incorrectly exported as deltas instead of cumulative values.
UpDown counters now export as aggregate values according to the OpenTelemetry specification.
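For context, the affected setup is an OTLP metrics exporter configured with delta temporality. A minimal sketch of such a setup, assuming the temporality option of the OTLP metrics exporter and a hypothetical collector endpoint:

```yaml
telemetry:
  exporters:
    metrics:
      otlp:
        enabled: true
        endpoint: http://localhost:4317   # hypothetical collector address
        temporality: delta                # with this fix, UpDown counters are no longer exported as deltas
```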
By @BrynCooke in https://github.com/apollographql/router/pull/8174
connection_error message handling (Issue #6138)
The router now correctly processes connection_error messages from subgraphs that don't include an id field. Previously, these messages were ignored because the router incorrectly required an id field. According to the graphql-transport-ws specification, connection_error messages only require a payload field.
The id field is now optional for connection_error messages, allowing underlying error messages to propagate to clients when connection failures occur.
By @jeffutter in https://github.com/apollographql/router/pull/8189
The Helm chart now supports customizing annotations on the deployment itself using the deploymentAnnotations value. Previously, you could only customize pod annotations with podAnnotations.
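A minimal values sketch showing the two settings side by side. The annotation keys and values are hypothetical placeholders; only the deploymentAnnotations and podAnnotations value names come from the chart.

```yaml
# values.yaml (sketch)
deploymentAnnotations:
  example.com/owner: "platform-team"   # hypothetical annotation, applied to the Deployment itself
podAnnotations:
  example.com/scrape: "true"           # hypothetical annotation, applied to the pods
```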
By @glasser in https://github.com/apollographql/router/pull/8164
An uncommon query planning error has been resolved: "Cannot add selection of field X to selection set of parent type Y that is potentially an interface object type at runtime". The router now handles __typename selections from interface object types correctly, as these selections are benign even when unnecessary.
By @duckki in https://github.com/apollographql/router/pull/8109
A race condition during hot reload that occasionally left connections in an active state instead of terminating has been fixed. This issue could cause out-of-memory errors over time as multiple pipelines remained active.
Connections that are opening during shutdown now immediately terminate.
By @BrynCooke in https://github.com/apollographql/router/pull/8169
Persisted Query metrics now include operations requested by safelisted operation body. Previously, the router only recorded metrics for operations requested by ID.
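Requests by operation body are typically possible when safelisting is enabled without requiring IDs. A minimal sketch of such a setup, assuming the router's persisted_queries safelist options:

```yaml
persisted_queries:
  enabled: true
  safelist:
    enabled: true
    require_id: false   # clients may send a safelisted operation body instead of an ID
```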
By @bonnici in https://github.com/apollographql/router/pull/8168
Apollo telemetry configuration now allows separate fine-tuning for metrics and traces batch processors. The configuration has changed from:
telemetry:
  apollo:
    batch_processor:
      scheduled_delay: 5s
      max_export_timeout: 30s
      max_export_batch_size: 512
      max_concurrent_exports: 1
      max_queue_size: 2048
To:
telemetry:
  apollo:
    tracing:
      # Config for Apollo OTLP and Apollo usage report traces
      batch_processor:
        max_export_timeout: 130s
        scheduled_delay: 5s
        max_export_batch_size: 512
        max_concurrent_exports: 1
        max_queue_size: 2048
    metrics:
      # Config for Apollo OTLP metrics.
      otlp:
        batch_processor:
          scheduled_delay: 13s # This does not apply to config gauge metrics, which have a non-configurable scheduled_delay.
          max_export_timeout: 30s
      # Config for Apollo usage report metrics.
      usage_reports:
        batch_processor:
          max_export_timeout: 30s
          scheduled_delay: 5s
          max_queue_size: 2048
The old telemetry.apollo.batch_processor configuration will be used if you don't specify these new values. The router displays the configuration being used in an info-level log message at startup.
By @bonnici in https://github.com/apollographql/router/pull/8258
The subgraph_metrics configuration flag that powers Apollo Studio's Subgraph Insights feature has been promoted from experimental to preview. The flag name has been updated from experimental_subgraph_metrics to preview_subgraph_metrics:
telemetry:
  apollo:
    preview_subgraph_metrics: true
By @rregitsky in https://github.com/apollographql/router/pull/8200