Apollo GraphQL Blog

Today, we're launching a new API keys page in GraphOS Studio, a single place where you can view and manage your SCIM, subgraph, and operator API keys directly from the UI. With built-in tools to filter by key status, cleanly rotate expiring keys with configurable buffer periods, and revoke unused keys, it empowers platform teams to maintain strict security oversight over their graph infrastructure without disrupting active development workflows.

Figure 1: API keys page in GraphOS Studio

The problem it solves

API keys are the backbone of securing access to your GraphQL infrastructure. But managing them has historically been fragmented. Subgraph keys, operator keys, and SCIM keys lived in different places, and were only available through Rover CLI and the Platform API.

If a key was compromised or a token needed to be rotated out, your options were binary: delete it and lose the audit trail, or leave it active and accept the risk. Now, you can rotate or revoke a key so that it cannot be used, but you can continue to view it in your list of API keys.

What we built

View, search, and filter API keys at a glance

SCIM, subgraph, and operator keys are now visible in GraphOS Studio. Search by name, filter by type or status, and quickly find the key you're looking for.

Figure 2: Filter API keys by type

Create, rename, and update expiration

Need a new API key for an upcoming deployment? As long as you have the appropriate permissions, you can now spin one up without leaving the browser. You can also rename existing keys and adjust their expiration dates directly from the UI, keeping your key inventory clean and descriptive as your organization grows.

Figure 3: Create a new API key

Rotate API keys without downtime

The new rotate action lets you cycle out a key while keeping it accessible for investigation or a transition period, so you can maintain continuity without sacrificing oversight. No more choosing between security and uptime.

Figure 4: Rotate API key

Revoke vs. delete API key

Sometimes you need to immediately cut off access to a key, but you're not ready to erase it entirely. Revoking a key stops it from being used right away while preserving the record, useful for incident response, audits, or investigations where visibility matters. Deleting a key remains available when you're ready to permanently remove it.

Figure 5: Revoke API key
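
The difference between rotating, revoking, and deleting can be pictured as a small state model. The sketch below is a hypothetical in-memory illustration, not GraphOS code: `ApiKeyRecord`, `isUsable`, `revokeKey`, `deleteKey`, and `rotateKey` are invented names, and it assumes the old key stays usable until the rotation buffer elapses.

```typescript
// Hypothetical in-memory model of the key lifecycle; not GraphOS code.
type KeyStatus = "active" | "revoked";

interface ApiKeyRecord {
  id: string;
  label: string;
  status: KeyStatus;
  expiresAt: Date | null; // null means no expiration
}

// A key authenticates only while it is active and unexpired.
function isUsable(key: ApiKeyRecord, now: Date): boolean {
  if (key.status !== "active") return false;
  return key.expiresAt === null || key.expiresAt.getTime() > now.getTime();
}

// Revoke: block the key immediately but keep its record visible.
function revokeKey(keys: ApiKeyRecord[], id: string): ApiKeyRecord[] {
  return keys.map((k): ApiKeyRecord => (k.id === id ? { ...k, status: "revoked" } : k));
}

// Delete: remove the record entirely.
function deleteKey(keys: ApiKeyRecord[], id: string): ApiKeyRecord[] {
  return keys.filter((k) => k.id !== id);
}

// Rotate: issue a replacement and let the old key expire after a buffer period.
// Assumption: the old key remains usable during the buffer.
function rotateKey(
  keys: ApiKeyRecord[],
  id: string,
  replacementId: string,
  bufferMs: number,
  now: Date
): ApiKeyRecord[] {
  const old = keys.filter((k) => k.id === id)[0];
  if (!old) return keys;
  const rotated = keys.map((k): ApiKeyRecord =>
    k.id === id ? { ...k, expiresAt: new Date(now.getTime() + bufferMs) } : k
  );
  return [...rotated, { id: replacementId, label: old.label, status: "active", expiresAt: null }];
}
```

In this model a revoked key still appears in the list, which is what keeps the audit trail intact, while a deleted key leaves no record behind.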

Getting started

Access to the API keys page is based on your role in GraphOS Studio. Graph admins can manage subgraph keys for their associated graphs. Org Admins have full access to all key types across the organization. Want to learn more? Check out our documentation for detailed setup instructions.

Rover CLI users, this doesn't change your workflow. Rover CLI continues to support creating, listing, deleting, and renaming subgraph and operator API keys. The Studio UI is an additive surface, not a replacement.

What's next

The API keys page is the foundation for a broader vision: a unified view of every key in your Apollo organization. Currently, graph and user API keys are available on separate key management pages, but in the future they will be available on the API keys page. You'll be able to see and manage personal, graph-level, and org-level keys side by side. We're also working on getting all of these new capabilities into Rover CLI.

v4.2

We're excited to announce the release of Apollo Client 4.2. This release brings two additional long-awaited features:

  • Type-safe default options
  • Event-based refetching

Let's dive in!

Type-safe default options

Prior to version 4.2, you had to choose between convenience and type-safety when you wanted to propagate a default option throughout your application. For example, you might set your default errorPolicy to "all" in order to render partially successful queries in your components.

This could introduce a mismatch between the runtime behavior and what TypeScript reports as the right value. Consider this useSuspenseQuery example:

const { data } = useSuspenseQuery(QUERY);
//      ^? TData

const { data } = useSuspenseQuery(QUERY, { errorPolicy: "all" });
//      ^? TData | undefined

With no explicit errorPolicy, useSuspenseQuery throws by default when an error is returned, which means data can be typed as the query data type. Passing errorPolicy: "all" changes the return type to include undefined because the server might return an error with no data.

However, changing the default errorPolicy in defaultOptions was considered unsafe because it modified the runtime behavior while the types remained the same. This could cause crashes in production that TypeScript did not catch, because you might access properties on undefined.

In 4.2, default options are now propagated through all React hooks and APIs to provide the correct type to match the runtime value. You opt in by declaring your default options using TypeScript module augmentation:

// apollo.d.ts
import "@apollo/client";

declare module "@apollo/client" {
  namespace ApolloClient {
    namespace DeclareDefaultOptions {
      interface WatchQuery {
        errorPolicy: "all";
      }
    }
  }
}

To make sure the runtime behavior matches the types, Apollo Client forces you to add a matching defaultOptions option:

new ApolloClient({
  // without this option, TypeScript reports an error
  defaultOptions: {
    watchQuery: {
      errorPolicy: "all"
    }
  }
});

With that TypeScript declaration in place, the hook now reflects the runtime value:

const { data } = useSuspenseQuery(QUERY);
//      ^? TData | undefined

This behavior extends to all query and mutation hooks and core APIs.
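
To see how a declared default can flow into a return type, here is a simplified, self-contained sketch. It does not use Apollo Client and is not how the library is implemented; `DeclaredDefaults`, `DefaultPolicy`, `DataFor`, and `readData` are hypothetical names chosen for illustration.

```typescript
// Simplified illustration only; Apollo Client's real types are more involved.
type ErrorPolicy = "none" | "ignore" | "all";

// Stand-in for the augmented interface: an app declares its default here.
interface DeclaredDefaults {
  errorPolicy: "all";
}

// Falls back to "none" when no default has been declared.
type DefaultPolicy = DeclaredDefaults extends { errorPolicy: infer P } ? P : "none";

// With "none", errors throw, so data is always present; otherwise it may be missing.
type DataFor<TData, P> = P extends "none" ? TData : TData | undefined;

// The runtime default must match the declared type, or this line fails to compile.
const runtimeDefault: DefaultPolicy = "all";

function readData<TData, P extends ErrorPolicy = DefaultPolicy>(
  result: { data?: TData; error?: Error },
  policy?: P
): DataFor<TData, P> {
  const effective: ErrorPolicy = policy ?? runtimeDefault;
  if (result.error && effective === "none") throw result.error;
  return result.data as DataFor<TData, P>;
}

// No policy passed: the declared default applies, so the type includes undefined.
const withDefault = readData({ data: { id: 1 } });

// Explicit "none": errors would throw, so the type is just { id: number }.
const strict = readData({ data: { id: 1 } }, "none");
```

The `runtimeDefault` constant plays the role that the required defaultOptions entry plays above: changing the declared type without changing the runtime value becomes a compile error.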

Learn more about declaring type-safe default options in the TypeScript guide.

Deprecation of generic arguments

To achieve type-safe default options, Apollo Client requires the use of type inference. As a result, passing generic arguments to hooks and core APIs is now deprecated.

// Generic arguments are now deprecated
useQuery<DataType, VariablesType>(QUERY)

You can still use this signature, but you won't be able to take advantage of the new type-safety. Migrate to TypedDocumentNode instead:

const QUERY: TypedDocumentNode<DataType, VariablesType> = gql``

Learn more about migrating in the migration guide.
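
Why does a typed document make explicit generics unnecessary? The sketch below illustrates the inference mechanism with a hypothetical stand-in: `TypedDoc` and `runQuery` are invented for this example and only mimic the idea of a document that carries its data and variable types. No GraphQL request is made.

```typescript
// Hypothetical stand-in for a typed document; the optional field only carries types.
interface TypedDoc<TData, TVars> {
  source: string;
  __apiType?: (variables: TVars) => TData;
}

type UserData = { user: { id: string; name: string } };
type UserVars = { id: string };

const USER_QUERY: TypedDoc<UserData, UserVars> = {
  source: "query User($id: ID!) { user(id: $id) { id name } }",
};

// Both type parameters are inferred from the document argument.
function runQuery<TData, TVars>(
  doc: TypedDoc<TData, TVars>,
  variables: TVars,
  fetcher: (source: string, variables: TVars) => unknown
): TData {
  return fetcher(doc.source, variables) as TData;
}

// No generic arguments at the call site; userResult is typed as UserData.
const userResult = runQuery(USER_QUERY, { id: "1" }, (_source, variables) => ({
  user: { id: variables.id, name: "Ada" },
}));
```

Because the types travel with the document, a hook or API that accepts it has room to layer other inferred information, such as default options, on top.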

Event-based refetching

One of our most popular requests has been window focus refetching, a feature popularized by TanStack Query that triggers automatic refetches when the browser tab regains focus. Until now, building this yourself meant a complicated mess of useEffect calls or wrapper hooks.

4.2 introduces the new RefetchEventManager, which handles refetches for you in response to events such as window focus or network reconnection. Pass a RefetchEventManager instance to the refetchEventManager option to opt in to automatic refetches:

import { RefetchEventManager, windowFocusSource } from "@apollo/client";

const client = new ApolloClient({
  // ...
  refetchEventManager: new RefetchEventManager({
    sources: {
      windowFocus: windowFocusSource,
    },
  }),
});

Anytime a user focuses the browser tab, the client automatically refetches active queries.

Queries can also opt out of a specific event refetch with the new refetchOn option:

useQuery(QUERY, { 
  // Don't refetch this query when the windowFocus event is triggered
  refetchOn: { windowFocus: false } 
});

RefetchEventManager is designed with extensibility in mind. You can register your own custom events, provide customized handlers to determine which queries should be refetched, and more.
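
As a mental model for event sources and per-query opt-outs, here is a toy manager written from scratch. It is not Apollo Client's implementation or API: `ToyRefetchManager`, `RefetchSource`, `TrackedQuery`, and `createManualSource` are hypothetical names, and the manual source exists only so the example runs outside a browser.

```typescript
// Toy model of event-driven refetching; names are hypothetical, not Apollo Client's API.
type Unsubscribe = () => void;
type RefetchSource = (notify: () => void) => Unsubscribe;

interface TrackedQuery {
  queryName: string;
  refetchOn?: Record<string, boolean>; // per-event opt-out, e.g. { windowFocus: false }
  refetch: () => void;
}

class ToyRefetchManager {
  private queries: TrackedQuery[] = [];
  private cleanups: Unsubscribe[] = [];

  constructor(sources: Record<string, RefetchSource>) {
    // Subscribe to every source once and remember how to unsubscribe.
    Object.keys(sources).forEach((eventName) => {
      const subscribe = sources[eventName];
      if (!subscribe) return;
      this.cleanups.push(subscribe(() => this.handle(eventName)));
    });
  }

  track(query: TrackedQuery): Unsubscribe {
    this.queries.push(query);
    return () => {
      this.queries = this.queries.filter((q) => q !== query);
    };
  }

  // Refetch every tracked query that has not opted out of this event.
  private handle(eventName: string): void {
    this.queries.forEach((query) => {
      if (query.refetchOn && query.refetchOn[eventName] === false) return;
      query.refetch();
    });
  }

  dispose(): void {
    this.cleanups.forEach((stop) => stop());
    this.cleanups = [];
  }
}

// A manually fired source so the example runs without a window object.
function createManualSource(): { source: RefetchSource; fire: () => void } {
  const listeners: Array<() => void> = [];
  return {
    source: (notify) => {
      listeners.push(notify);
      return () => {
        const index = listeners.indexOf(notify);
        if (index >= 0) listeners.splice(index, 1);
      };
    },
    fire: () => listeners.slice().forEach((listener) => listener()),
  };
}
```

In a browser, a source in this model would wrap something like a focus or visibilitychange listener and call `notify` when the tab regains focus.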

See the event-based refetching docs to learn more.

Wrapping up

Ready to upgrade? Install Apollo Client 4.2 today:

npm install @apollo/client@latest

For the full list of changes, check out the release notes. Questions and feedback are always welcome in the Apollo Community.

Happy querying!

We're excited to announce Apollo Kotlin 5 is now available.

Apollo Kotlin 3 was a full rewrite of Apollo Android for Kotlin multiplatform. Apollo Kotlin 4 reworked error handling, introduced the Apollo IDE plugin and semantic nullability.

Apollo Kotlin 5 is GraphQL Golden Path ready and comes with a new normalized cache, a reworked Gradle plugin, new compiler plugin APIs, agent skills, linuxX64 and watchosDeviceArm64 targets, and more.

If you're currently using Apollo Kotlin 4, you should feel right at home with Apollo Kotlin 5. Most APIs are untouched. For the others, read the migration guide for details.

For a comprehensive list of all changes, please review the full changelog or read on for a brief summary of the key updates.

GraphQL Golden Path

Apollo Kotlin aims to support the latest version of the GraphQL draft specification. Making a change to the specification draft is a long and near-irreversible process. Yet experimentation matters: it gives the community confidence that proposed changes are sound before they ship.

For this reason, Apollo Kotlin 5 supports a number of experimental GraphQL RFCs:

Make sure to join an upcoming working group and share feedback if you're using any of these.

Normalized cache

Apollo Kotlin 5 comes with a new, separately versioned, normalized cache that supports:

  • TTL
  • Garbage collection
  • Pagination
  • Binary format
  • Partial results

For more details, read the dedicated blog post.

Modernization

The Gradle plugin now uses Gratatouille classloader isolation instead of the GR8 relocation used previously. This makes the plugin more robust and easier to debug.

Apollo Kotlin 5 uses KGP 2.3, with 2.1 compatibility for JVM and Android consumers. Native and JS consumers must compile with KGP 2.3+.

If you are using Apollo Kotlin with AI agents, the Apollo Kotlin agent skill is now available. Agents can use it to discover the Apollo Kotlin best practices.

Migration path

Previous DeprecationLevel.WARNING symbols are now DeprecationLevel.ERROR. Previous DeprecationLevel.ERROR symbols are removed.

Most of the runtime APIs as well as the package name are unchanged. If your build has no Apollo deprecation warnings on v4, the upgrade should require minimal changes.

The main breaking changes are in experimental Data Builders and Apollo Compiler Plugins.

See the v5 migration guide for a complete upgrade walkthrough.

Update today

Apollo Kotlin 5 is now available on Maven Central:

plugins {
  id("com.apollographql.apollo").version("5.0.0")
}

dependencies {
  implementation("com.apollographql.apollo:apollo-runtime:5.0.0")
}

Any feedback? Let us know what you think! The team is looking forward to seeing what you build!

Today we're releasing a preview of Rust-native composition for Apollo Federation. We are close to the finish line of a multi-year effort to replatform Apollo Federation, both the query planner and the composition engine, from JavaScript/TypeScript to Rust. You can try it now by enabling the Federation Next build track in GraphOS Studio.

Why Rust

When we shipped the Rust-native query planner over a year ago, the performance and reliability improvements were immediate. Composition was the obvious next target, and the last major Federation component still running in JavaScript.

The JS composition engine relied on an embedded Deno runtime and V8 to execute. That meant garbage collection pauses and high memory consumption for large graphs. Rust eliminates that entire class of overhead, and its predictable memory model handles graphs with hundreds of subgraphs comfortably.

Beyond performance, this unifies our codebase. The query planner and composition engine now live in a single Rust monorepo. We no longer need to duplicate bug fixes across two languages, synchronize spec changes between JS and Rust implementations, or manage complex cross-language dependency chains. That simplification lets us iterate faster on Federation features going forward.

Performance

We benchmarked the Rust implementation against the JS implementation across tens of thousands of real-world production graphs. The results are significant. At the median, Rust composition is roughly 27x faster. At p95, it's about 8.5x faster. The gap narrows at higher percentiles because the largest graphs stress both implementations, but Rust still delivers a 6x improvement at p99.

These numbers are pre-GA and measure composition in isolation. Your experience in Studio may differ since composition is one step in a larger build pipeline. That said, these results are with minimal dedicated optimization work on the Rust side, so there's room to go further.

How We Built It

Composition is a multi-stage pipeline. Rewriting all of it at once would have been risky. A single bug in any stage could produce incorrect supergraph schemas. Instead, we built a hybrid pipeline that let each stage run independently in either JavaScript or Rust.

This allowed us to develop and validate one stage at a time, comparing JS and Rust output at every boundary to catch regressions before they compounded. In production, we ran the Rust implementation in shadow alongside the existing JS pipeline, processing the same graphs in parallel so we could validate correctness and performance against real-world workloads without any risk.

How We Validated It

Correctness was the bar we had to clear before anything else. A composition engine that produces fast but wrong supergraph schemas is worse than useless. It's dangerous.

We tested against tens of thousands of real-world production graphs from our corpus. For every graph, we ran composition through both the JS and Rust implementations and compared their outputs at each stage of the pipeline. This wasn't just "does it produce a schema." We compared the full output including the supergraph schema, composition hints, and error messages.

To make these comparisons meaningful, we normalized for known cosmetic differences between the implementations. The JS and Rust compilers handle whitespace in schema descriptions differently, and JS relies on automatic type coercion for default values where Rust is explicit. We accounted for these so we could focus on semantic correctness rather than surface-level formatting noise.
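
A normalize-then-compare step along these lines can be sketched as follows. This is a hypothetical illustration rather than the tooling we used: `CompositionOutput`, `normalizeDescriptions`, and `compareOutputs` are invented names, and the only normalization shown is collapsing whitespace inside block-string descriptions.

```typescript
// Hypothetical comparison helper; the normalization rule here is illustrative.
interface CompositionOutput {
  supergraphSdl: string;
  hints: string[];
  errors: string[];
}

interface Difference {
  field: keyof CompositionOutput;
  js: string;
  rust: string;
}

// Collapse whitespace inside """block string""" descriptions so that
// formatting-only differences do not register as mismatches.
function normalizeDescriptions(sdl: string): string {
  return sdl.replace(/"""([\s\S]*?)"""/g, (_match: string, body: string) =>
    `"""${body.replace(/\s+/g, " ").trim()}"""`
  );
}

// Compare schema, hints, and errors; report each field that still differs.
function compareOutputs(js: CompositionOutput, rust: CompositionOutput): Difference[] {
  const diffs: Difference[] = [];
  const jsSdl = normalizeDescriptions(js.supergraphSdl);
  const rustSdl = normalizeDescriptions(rust.supergraphSdl);
  if (jsSdl !== rustSdl) diffs.push({ field: "supergraphSdl", js: jsSdl, rust: rustSdl });

  const jsHints = js.hints.join("\n");
  const rustHints = rust.hints.join("\n");
  if (jsHints !== rustHints) diffs.push({ field: "hints", js: jsHints, rust: rustHints });

  const jsErrors = js.errors.join("\n");
  const rustErrors = rust.errors.join("\n");
  if (jsErrors !== rustErrors) diffs.push({ field: "errors", js: jsErrors, rust: rustErrors });

  return diffs;
}
```

Reporting differences per field makes it possible to separate schema mismatches from differences that only affect hints or error messages.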

At the time of this preview release, the Rust implementation produces identical results to JS on the vast majority of the corpus, and most of the remaining differences appear in hints or error messages rather than in the composed schemas themselves. A meaningful number of schema differences are cases where the Rust implementation is actually more correct. Any remaining differences are actively categorized and will be resolved before the GA release.

Try It

You can test the Rust composition preview today by enabling the Federation Next build track in GraphOS Studio. This will run your graph's compositions through the Rust pipeline.

We want to hear from you. If you encounter any differences in composition behavior, please report them through your usual support channel or file an issue on the Apollo Router GitHub repo.

What's Next

We're working toward general availability, which means resolving the remaining equivalence gaps, completing our production shadow validation, and ultimately removing the JS fallback path entirely.

We're also porting contracts, the last piece of Federation tooling still in JavaScript, to Rust. Once that's complete, the entire Federation stack will run natively.

Acknowledgments

This was a multi-year effort done right. Rewriting, validating and shipping a system that processes schemas for thousands of production graphs demands that kind of rigor.

We're grateful to everyone across the Apollo team who contributed, and to the community members whose large, complex real-world graphs continue to push us to make Federation better. Those graphs are both our hardest test cases and our strongest motivation.

Today we're honored to announce Apollo has been recognized in the 2026 Gartner Peer Insights "Voice of the Customer" for API Management for the first time. We scored 4.7 out of 5 stars from 55 verified enterprise reviewers, and 98% said they would recommend Apollo to a peer as of January 31, 2026.

"The overall experience with Apollo GraphOS has been amazing. It has provided us with good visibility on our subgraphs."

Every score in this report comes directly from customers: people who deployed the product, ran it in production, and took the time to write about the experience honestly. We are also named a Strong Performer in the report's quadrant view.

Thank You

The above numbers show the aggregate view, but what our customers actually wrote is what I want to share.

  • "Excellent at allowing our teams onboard quickly to the graph without them needing to setup everything from scratch." – Software Developer in Media, 5/5 rating
  • "The overall experience with Apollo GraphOS has been amazing. It has provided us with good visibility on our subgraphs." – Engineer in Real Estate, 5/5 rating
  • "The Apollo Connector is something I was interested in as we have a lot of REST services and we plan to leverage that for our company." – Senior Staff Software Engineer in Banking, 5/5 rating
  • "Ease of setting up applications from scratch and the latest products with Apollo Connectors for REST APIs making it a breeze." – Software Developer in Media, 5/5 rating

To every customer who took the time to share their experience: thank you. The feedback you leave shapes how we build, how we support, and how we show up for the next team working through the same problems you solved.

If you're an Apollo customer and haven't yet left a review, we'd love to hear from you on Gartner Peer Insights. Your perspective makes the product better for everyone.

Gartner® and Peer Insights™ are trademarks of Gartner, Inc. and/or its affiliates. All rights reserved. Gartner® Peer Insights™ content consists of the opinions of individual end users based on their own experiences, and should not be construed as statements of fact, nor do they represent the views of Gartner or its affiliates. Gartner does not endorse any vendor, product or service depicted in this content nor makes any warranties, expressed or implied, with respect to this content, about its accuracy or completeness, including any warranties of merchantability or fitness for a particular purpose.

Gartner, Voice of the Customer for API Management, by Peer Contributors, 31 March 2026.

Over the past few months, we've introduced several additions to the GraphOS Platform API that make it easier than ever to analyze your graph's usage and performance trends. These new Insights APIs allow for detailed metric analysis of operations, fields, and subgraphs.

Let's take a closer look at what's new and how you can start using it today.

Top Operation Report

The topOperationsReport field can be used when you're looking for a point-in-time view of the operations run against your GraphQL server. This API can return operation details like the operation name, ID, and signature, as well as the total number of times the operation has been executed within the specified time range. The results can also be filtered to only show operations executed by specific client names and versions.

You could use this to generate a weekly report showing the top 10 operations executed by a client so that you can see how the operations and request counts are changing:

query WeeklyTopOperations {
  graph(id: "your-graph") {
    variant(name: "current") {
      topOperationsReport(
        from: "2025-11-01T00:00:00Z"
        to: "2025-11-07T00:00:00Z"
        filter: { in: { clientName: ["some-client"] } }
        limit: 10
      ) {
        operationId
        name
        type
        requestCount
      }
    }
  }
}
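
For a recurring report you would likely generate the time range rather than hard-code it. The sketch below is one hedged way to do that in TypeScript: `previousWeekRange` and `buildTopOperationsQuery` are hypothetical helpers, the query text simply reuses the fields from the example above, and actually sending the request to the Platform API is left out.

```typescript
// toISOString() yields "2025-11-01T00:00:00.000Z"; drop the milliseconds
// so the value matches the timestamp format used in the examples.
function toIsoSeconds(date: Date): string {
  return date.toISOString().replace(/\.\d{3}Z$/, "Z");
}

// Returns the 7-day window ending at the most recent UTC midnight.
function previousWeekRange(now: Date): { from: string; to: string } {
  const end = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()));
  const start = new Date(end.getTime() - 7 * 24 * 60 * 60 * 1000);
  return { from: toIsoSeconds(start), to: toIsoSeconds(end) };
}

// Builds the query text. JSON.stringify quotes and escapes the string inputs.
function buildTopOperationsQuery(
  graphId: string,
  variantName: string,
  clientName: string,
  window: { from: string; to: string },
  limit: number
): string {
  return [
    "query WeeklyTopOperations {",
    `  graph(id: ${JSON.stringify(graphId)}) {`,
    `    variant(name: ${JSON.stringify(variantName)}) {`,
    "      topOperationsReport(",
    `        from: ${JSON.stringify(window.from)}`,
    `        to: ${JSON.stringify(window.to)}`,
    `        filter: { in: { clientName: [${JSON.stringify(clientName)}] } }`,
    `        limit: ${limit}`,
    "      ) {",
    "        operationId",
    "        name",
    "        type",
    "        requestCount",
    "      }",
    "    }",
    "  }",
    "}",
  ].join("\n");
}
```

Running this on a schedule and storing each week's result gives you the week-over-week comparison described above.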

You could also use this to get a list of all operation signatures that were executed within a day (assuming this is less than the max limit of 10k operations):

query AllDailyOperations {
  graph(id: "your-graph") {
    variant(name: "current") {
      topOperationsReport(
        from: "2025-11-01T00:00:00Z"
        to: "2025-11-02T00:00:00Z"
        limit: 10000
      ) {
        operationId
        name
        signature
        requestCount
      }
    }
  }
}

Insights Timeseries APIs

When you need to track how your graph's performance and usage evolve over time, you can use the timeseries operations to provide structured, time-bucketed insights. These insights are available for operations (operationInsightsTimeseriesReport), subgraphs (subgraphInsightsTimeseriesReport), and fields/enums (schemaCoordinateInsightsTimeseriesReport). These APIs allow you to monitor changes in key metrics, automate data analysis workflows, or build dashboards.

They provide results in the form of a set of metrics, grouped by a set of dimensions, split into time chunks of a particular resolution across a specified time range, and include the option to filter in or out operations or fields that have specific dimensions.

Operation Insights Timeseries API

The operationInsightsTimeseriesReport field provides metrics such as request counts, latency percentiles, and error rates, grouped by time and dimensions like graph variant, operation name and ID, or client name and version. The results can be returned in a CSV or GraphQL format depending on your needs.

This is ideal for investigating usage spikes, comparing latency between operations, or monitoring error rates after a deployment.

For example, if you notice that a particular operation named GetUserProfile is slow, you could query for its request count and p99 latency for each hour across a week:

query OperationLatencyTrend {
  graph(id: "your-graph") {
    operationInsightsTimeseriesReport(
      resolution: HOUR
      from: "2025-11-01T00:00:00Z"
      to: "2025-11-07T00:00:00Z"
      dimensions: [OPERATION_ID]
      metrics: [REQUEST_COUNT, REQUEST_LATENCY_P99_MS]
      filters: { 
        include: { 
          variantName: "current"
          operationName: ["GetUserProfile"]
        } 
      }
    ) {
      csv
      records {
        startTimestamp
        endExclusiveTimestamp
        dimensions { type value }
        metrics { type value }
      }
    }
  }
}

You could also query for the request count and error count per minute across an hour for each client and version except for an internal test client, so that you can see which clients contributed to a particularly heavy request period:

query ClientRequestTrend {
  graph(id: "your-graph") {
    operationInsightsTimeseriesReport(
      resolution: MINUTE
      from: "2025-10-01T00:00:00Z"
      to: "2025-10-01T01:00:00Z"
      dimensions: [GRAPH_VARIANT, CLIENT_NAME, CLIENT_VERSION]
      metrics: [REQUEST_WITH_ERROR_COUNT, REQUEST_COUNT]
      filters: {
        exclude: { clients: [{ clientName: "internal-test-client" }] }
      }
    ) {
      csv
      records {
        startTimestamp
        endExclusiveTimestamp
        dimensions { type value }
        metrics { type value }
      }
    }
  }
}
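
Once the records come back, a small amount of post-processing turns them into something readable. The following sketch aggregates an error rate per client. `InsightsRecord` mirrors the fields selected in the query above, `errorRateByClient` is a hypothetical helper, and it assumes metric values arrive as numbers, so adjust the parsing if the API returns them differently.

```typescript
// Record shape mirrors the fields selected in the query above.
// Assumption: metric values arrive as numbers.
interface InsightsRecord {
  startTimestamp: string;
  endExclusiveTimestamp: string;
  dimensions: { type: string; value: string | null }[];
  metrics: { type: string; value: number }[];
}

function readDimension(record: InsightsRecord, type: string): string {
  const hit = record.dimensions.filter((d) => d.type === type)[0];
  return hit && hit.value !== null ? hit.value : "(unknown)";
}

function readMetric(record: InsightsRecord, type: string): number {
  const hit = record.metrics.filter((m) => m.type === type)[0];
  return hit ? hit.value : 0;
}

// Sums requests and errored requests per client across all time buckets,
// then returns errored requests divided by total requests.
function errorRateByClient(records: InsightsRecord[]): Record<string, number> {
  const totals: Record<string, { requests: number; errors: number }> = {};
  for (const record of records) {
    const client = readDimension(record, "CLIENT_NAME");
    const entry = totals[client] ?? { requests: 0, errors: 0 };
    entry.requests += readMetric(record, "REQUEST_COUNT");
    entry.errors += readMetric(record, "REQUEST_WITH_ERROR_COUNT");
    totals[client] = entry;
  }
  const rates: Record<string, number> = {};
  for (const client of Object.keys(totals)) {
    const entry = totals[client]!;
    rates[client] = entry.requests === 0 ? 0 : entry.errors / entry.requests;
  }
  return rates;
}
```

Sorting the resulting rates in descending order points to the clients that contributed most to an error spike.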

Subgraph Insights Timeseries API

The subgraphInsightsTimeseriesReport field provides subgraph request count and latency percentiles grouped by time and dimensions like graph variant, fetch service name (the name of the subgraph or connector), client, or operation. Like all the timeseries reports, the results can be returned in a CSV or GraphQL format depending on your needs.

For example, to see a daily trend of fetches and fetches with errors to all your subgraphs and connectors across all variants of a graph, you could use the following query:

query SubgraphInsights {
  graph(id: "your-graph") {
    subgraphInsightsTimeseriesReport(
      resolution: DAY
      from: "2025-10-01T00:00:00Z"
      to: "2025-11-01T00:00:00Z"
      dimensions: [GRAPH_VARIANT, FETCH_SERVICE_NAME]
      metrics: [FETCH_COUNT, FETCH_WITH_ERRORS_COUNT]
    ) {
      records {
        startTimestamp
        dimensions {
          type
          value
        }
        metrics {
          type
          value
        }
      }
    }
  }
}

Schema Coordinate Insights Timeseries API

The schemaCoordinateInsightsTimeseriesReport field focuses on schema coordinate metrics, helping you understand how specific object fields, input object fields, or enums are being used. Again, the results can be returned in a CSV or GraphQL format depending on your needs.

Use this to track adoption of new fields or monitor error rates tied to specific parts of your graph.

For example, to check adoption of a new User.email object field and watch for errors on it across all variants, you could use:

query FieldUsageTrend {
  graph(id: "your-graph") {
    schemaCoordinateInsightsTimeseriesReport(
      resolution: HOUR
      from: "2025-11-01T00:00:00Z"
      to: "2025-11-02T00:00:00Z"
      dimensions: [VARIANT_NAME]
      metrics: [REQUEST_COUNT, ERROR_COUNT]
      filters: {
        include: { 
          coordinates: [{ 
            kind: OBJECT_FIELD, 
            namedType: "User", 
            namedAttribute: "email" 
          }] 
        }
      }
    ) {
      csv
      records {
        startTimestamp
        endExclusiveTimestamp
        dimensions { type value }
        metrics { type value }
      }
    }
  }
}

To see a monthly overview of the usage of the UserRole enum values or to check for any unused roles, you could use:

query EnumUsageOverview {
  graph(id: "your-graph") {
    schemaCoordinateInsightsTimeseriesReport(
      resolution: MONTH
      from: "2025-01-01T00:00:00Z"
      to: "2025-11-01T00:00:00Z"
      dimensions: [NAMED_TYPE, NAMED_ATTRIBUTE]
      metrics: [REQUEST_COUNT]
      filters: {
        include: {
          coordinates: [{ kind: ENUM_VALUE, namedType: "UserRole" }]
        }
      }
    ) {
      csv
      records {
        startTimestamp
        endExclusiveTimestamp
        dimensions { type value }
        metrics { type value }
      }
    }
  }
}
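
To turn that report into a list of unused roles, you can compare the returned records against the enum values defined in your schema. The sketch below is a hypothetical helper, `findUnusedEnumValues`. It takes the known values as input because it does not assume the API returns rows for values with no traffic, and it assumes metric values arrive as numbers.

```typescript
// Minimal record shape: only the fields this helper reads.
// Assumption: metric values arrive as numbers.
interface CoordinateRecord {
  dimensions: { type: string; value: string | null }[];
  metrics: { type: string; value: number }[];
}

// Returns the enum values with zero recorded requests across all records.
function findUnusedEnumValues(knownValues: string[], records: CoordinateRecord[]): string[] {
  const counts: Record<string, number> = {};
  for (const record of records) {
    const attribute = record.dimensions.filter((d) => d.type === "NAMED_ATTRIBUTE")[0];
    const requests = record.metrics.filter((m) => m.type === "REQUEST_COUNT")[0];
    if (!attribute || attribute.value === null) continue;
    counts[attribute.value] = (counts[attribute.value] ?? 0) + (requests ? requests.value : 0);
  }
  return knownValues.filter((value) => (counts[value] ?? 0) === 0);
}
```

The list of known values would come from your schema, for example the values declared on the UserRole enum.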

Wrapping Up

For more information, check the Platform API schema and sample operations for querying insights. We hope you try out these new features soon, and if you have any feedback, we'd love to hear it in our Apollo Community discussion thread.

Agent Skills reduced our AI agent's token consumption from 65,000 to 24,000 tokens and completion time from five minutes to under two — for the exact same task with the exact same agent. Here's what happened, why it matters, and what it means for how you build with AI agents.

There's a recurring question in the AI community and engineering circles right now: with frontier models becoming more capable every quarter, do AI agents actually need curated Skills and instructions? Or will raw model intelligence eventually make them unnecessary?

We decided to test it. We ran a controlled comparison and the data speaks for itself.

The experiment

Apollo MCP Server gives AI agents a secure way to access any GraphQL API. Instead of writing custom integration code, agents explore the schema, build valid operations, and fetch data through a single MCP interface. It handles schema discovery efficiently so agents don't need the entire schema in context, just the parts relevant to the current task.

We used this as our test case: ask Claude Code to set up an Apollo MCP Server connected to a public GraphQL API, then use it to fetch country data. Same model, same starting conditions, same goal. The only variable was whether the agent had access to Apollo Skills.

First, here's the agent working without Skills:

And here's the same task with Apollo Skills installed:

The results:

                     Without Skills   With Skills
Time to completion   5+ minutes       < 2 minutes
Token consumption    ~65,000          ~24,000

Both runs produced the same correct result. The difference was the path to get there.

Why the gap exists

Watch the two recordings side by side and the pattern is obvious. Without Skills, the agent spends most of its time researching: searching the web, fetching documentation, reading through pages of results, extracting relevant pieces, then attempting an approach. Often incorrectly on the first try. It backtracks, adjusts, tries again. It gets there, but the route is wasteful.

Think of it as off-roading versus a highway. Without Skills, many paths look plausible, but some lead to dead ends, outdated patterns, or unnecessary detours. Each wrong turn burns tokens, and the agent has no way to know the path was wrong until it gets there.

Skills pave the road. The agent knows what knowledge is available and when to load it. No researching. No guessing. No trial and error. When the right Skill activates, the agent reads it, understands the correct approach, and executes. Same destination, far less fuel.

That 65k-to-24k drop is a 63% reduction for the same outcome. For teams running agents at scale, that translates directly to cost savings and faster iteration cycles.
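
The arithmetic behind that figure is simple enough to check yourself. `percentReduction` is a throwaway helper written for this post, and the inputs are the approximate token counts from the table above.

```typescript
// Percentage saved when going from `before` to `after`.
function percentReduction(before: number, after: number): number {
  if (before <= 0) throw new Error("before must be positive");
  return ((before - after) / before) * 100;
}

// (65,000 - 24,000) / 65,000 = 41,000 / 65,000, which is roughly 63%.
const tokenSavings = percentReduction(65000, 24000);
```

The same function works for any before-and-after pair you measure in your own runs.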

This wasn't a one-off. We've been running comparisons for months since we launched Apollo Skills, and a few patterns keep showing up.

Skill quality matters more than quantity

Bad Skills can do more harm than good. A poorly written Skill introduces noise into the agent's context window (wrong patterns, outdated syntax, misleading instructions) and the agent follows them dutifully. You end up worse off than if you'd given it nothing.

This matters even more when you consider what models already "know." The Apollo Client knowledge baked into current models is often outdated or incorrect. Models are confidently wrong without correction. A well-maintained Skill is an authoritative, up-to-date source of truth that takes precedence over whatever the model learned during pre-training.

The download numbers on skills.sh tell this story clearly. Our rust-best-practices Skill has 5,000+ downloads, the most-downloaded Rust Skill on the platform, despite being published after several alternatives. The earlier ones were low quality. Engineers tried them, got worse results, and moved on. The same pattern holds across our GraphQL Skills: apollo-client at 1.7K, graphql-schema at 1.1K, graphql-operations at 963, and apollo-mcp-server at 895. Quality wins.

The implication: treat Skills like production code. Review them. Test them. Hold them to a standard.

Skills need a maintenance cycle

A Skill that helps today might hurt six months from now. Models train on increasingly recent data and reason better with each generation. The gap a Skill was designed to bridge can shrink or disappear.

Not all Skills age the same way, though. Some compensate for things a model gets wrong today, like outdated API patterns or incorrect library usage. These have the shortest lifespan because a future model may handle them natively.

Others encode team conventions (coding standards, naming patterns, response formatting) that no model will learn from public training data. These last longer but still need review as your own standards evolve.

The point is to treat Skills as something you maintain, not something you write once and forget:

  • Test with and without. When a new model drops, re-run your tasks with the Skill disabled. If the model produces the same quality at comparable cost and speed, the Skill has done its job. Consider retiring it.
  • Update when the product changes. If your API ships a new auth flow or deprecates an endpoint, the Skill needs to reflect that. Stale Skills produce stale code.
  • Keep evals running. Even after retiring a Skill, keep validating that the model handles the underlying task correctly. This catches regressions before they reach production.
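
The "test with and without" check can be reduced to a small comparison over recorded runs. The sketch below is a hypothetical illustration, not part of Apollo Skills: `AgentRun` and `canConsiderRetiring` are invented names, and the 10% tolerance is an arbitrary starting point you would tune for your own workloads.

```typescript
// One recorded agent run for a given task.
interface AgentRun {
  passed: boolean; // did the output meet your quality bar?
  tokens: number;
  seconds: number;
}

function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((sum, v) => sum + v, 0) / values.length;
}

function passRate(runs: AgentRun[]): number {
  return runs.length === 0 ? 0 : runs.filter((r) => r.passed).length / runs.length;
}

// A Skill is a retirement candidate when runs without it match its quality
// and stay within `tolerance` of its token and time cost.
function canConsiderRetiring(
  withSkill: AgentRun[],
  withoutSkill: AgentRun[],
  tolerance: number = 0.1
): boolean {
  if (withSkill.length === 0 || withoutSkill.length === 0) return false;
  const qualityHolds = passRate(withoutSkill) >= passRate(withSkill);
  const tokensComparable =
    mean(withoutSkill.map((r) => r.tokens)) <= mean(withSkill.map((r) => r.tokens)) * (1 + tolerance);
  const timeComparable =
    mean(withoutSkill.map((r) => r.seconds)) <= mean(withSkill.map((r) => r.seconds)) * (1 + tolerance);
  return qualityHolds && tokensComparable && timeComparable;
}
```

Plugging in numbers like those from our experiment (about 24,000 tokens with the Skill versus about 65,000 without) returns false, meaning the Skill is still earning its place.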

We're building this into how Apollo Skills work. Each Skill targets the latest stable version of its product, and the team that owns the product owns the accuracy of the Skill.

Those teams don't manually watch for drift, though. AI-powered sync pipelines detect when product changes affect a Skill's content. An LLM triages the diff, determines whether the existing guidance is now stale, and if so, generates an update as a pull request. Product teams review and approve rather than writing updates themselves.

Skills stay in sync with the products they describe without becoming a maintenance burden. The guidance agents receive reflects the actual state of the tools, not a snapshot from months ago.

Skills shine in constrained environments

Not everyone runs the latest frontier model with unlimited tokens. Many teams use local LLMs, self-hosted models, or lower-tier API plans for cost, privacy, or compliance reasons. CI/CD pipelines add another constraint: automated agents running in CI often operate with smaller models and stricter token budgets to keep build costs predictable. These environments need more guidance, not less.

Skills close the gap. They give a constrained agent access to the same curated knowledge that a larger model might have partially internalized through training. A smaller model with the right Skill can match a larger model without one, producing correct results faster and with fewer tokens.

The CI case is worth calling out specifically. When an agent runs in a pipeline (generating code, reviewing pull requests, updating docs) every minute it spends researching is a minute a developer waits for feedback. If the agent takes five minutes instead of two, that delay compounds across every PR, every build, every developer on the team. Skills cut that wait time. Faster agents mean faster feedback loops.

Whether it's a smaller model, a token-limited CI job, or self-hosted infrastructure, Skills can be the difference between an agent that blocks your workflow and one that speeds it up.

What this means for your team

Both runs in our experiment produced correct results. The agent is capable without Skills. But are you willing to pay the cost of letting it figure things out from scratch every time?

For a single ad-hoc task, maybe the difference doesn't matter. For teams running agents across dozens of workflows daily, fewer tokens and less time per task adds up. The exact savings vary by Skill and model, but the direction is consistent: Skills cut tokens and time. Multiply that across an engineering org and the return becomes hard to ignore.

There's also the repetition problem. Without Skills, developers paste the same instructions into every new agent session: "use v4 of this API, not v3," "follow this naming convention," "don't use the deprecated auth flow." Every session starts from zero, and the developer becomes responsible for catching the agent's mistakes.

Skills encode that guidance once. The agent picks it up automatically, session after session, without the developer repeating themselves.

The less capable the model or the tighter the token budget, the more Skills matter. They put the right knowledge in context so the agent doesn't have to go find it.

Apollo Skills are now marked as Official Skills on skills.sh, meaning they're published and maintained by the team that builds the technology. To get started:

npx skills add apollographql/skills

More on how Apollo Skills work and what Apollo contributes: Apollo Skills: Teaching AI agents how to use Apollo and GraphQL.
