Datadog dd-trace-py

Releases: 48, Avg: 16/mo, Versions: v4.5.2 to v4.10.0
v4.10.1
Bug Fixes
  • internal: Fixed an issue that could have caused some timers, like the one responsible for Symbol Database uploads, to fire repeatedly after the first execution.
  • internal: This fix resolves a memory leak where reference cycles through PeriodicThread callbacks were invisible to Python's cyclic garbage collector and could accumulate when threads used bound methods as targets.
  • profiling: Fixes a memory leak in native frame tracking caused by unbounded native call-site metadata growth.
  • SCA: This fix resolves an issue where unresolved runtime reachability targets could accumulate across Software Composition Analysis updates, causing resident memory usage to grow over time.
v4.10.0
Bug Fixes
  • AAP: This fix resolves an issue where the AppSec body-parsing hook consumed the websocket.connect ASGI message, causing ASGI/FastAPI WebSocket connections to fail with HTTP 500 when AppSec was enabled.
  • LLM Observability: Fixes an issue where reasoning_content was missing from streamed chat completions in the OpenAI and LiteLLM integrations when an OpenAI-compatible reasoning provider (e.g. DeepSeek, Qwen) emitted delta.reasoning_content chunks. The aggregated message now captures reasoning text in the output message, matching non-streaming behavior.
  • dynamic instrumentation: Fixes an issue where the Symbol Database uploader sent empty payloads on a recurring timer.
Other Changes
  • LLM Observability: When LLMObs is enabled in agentless mode (Datadog Agent not reachable or with DD_LLMOBS_AGENTLESS_ENABLED=1), APM traces are now exported agentlessly to Datadog's intake. This should not change user-facing behavior: both APM and LLMObs spans remain visible in the UI; LLMObs spans are simply no longer shipped separately for agentless users. Note that setting DD_APM_TRACING_ENABLED=false takes precedence and results in LLMObs span events shipping separately, as in the existing behavior.
v4.9.0
Upgrade Notes
  • AI Guard: ddtrace.appsec.ai_guard.AIGuardAbortError now derives from ddtrace.internal._exceptions.DDBlockException (a BaseException subclass) instead of Exception. This brings AI Guard block decisions in line with how ASM blocks are surfaced and prevents a generic except Exception: in user code from silently swallowing a block.
  • settings: Legacy environment variable names registered as aliases in the configuration registry now also work when set via local or fleet stable config files, not just shell environment variables. #17958
  • ray: Adds DD_TRACE_RAY_IGNORED_ACTORS configuration to exclude specific Ray actor methods from instrumentation. Set DD_TRACE_RAY_IGNORED_ACTORS='{"ActorA": ["method1"], "ActorB": "*"}' to leave matching methods or actors uninstrumented while continuing to trace other Ray actor methods. Matching is based on actor class name only.
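
The AI Guard upgrade note above relies on a Python detail: a generic except Exception: handler does not catch classes that derive directly from BaseException. The following is a minimal sketch of that behavior using stand-in classes. BlockSignal and AbortError are hypothetical names used for illustration only; they are not the ddtrace classes.

```python
class BlockSignal(BaseException):
    """Hypothetical stand-in for a blocking base class derived from BaseException."""


class AbortError(BlockSignal):
    """Hypothetical stand-in for an AI Guard abort error."""


def call_with_generic_handler(func):
    """Run func under a broad `except Exception`, as application code often does."""
    try:
        return func()
    except Exception:
        # Only Exception subclasses land here; BaseException subclasses pass through.
        return "swallowed"


def blocked_call():
    raise AbortError("blocked by policy")


def ordinary_failure():
    raise ValueError("ordinary failure")


def run_demo():
    """Show which errors the generic handler swallows and which propagate."""
    results = {"ordinary": call_with_generic_handler(ordinary_failure)}
    try:
        call_with_generic_handler(blocked_call)
        results["blocked"] = "swallowed"
    except BlockSignal:
        results["blocked"] = "propagated"
    return results
```

Under this model, code that needs to react to a block decision would have to catch the block class explicitly rather than rely on a broad Exception handler.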
Deprecation Notes
  • Tracing: DD_TRACE_INFERRED_SPANS_ENABLED is deprecated and will be removed in 5.0.0. Use DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED instead. The old environment variable continues to work but emits a DDTraceDeprecationWarning when set.
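
The deprecation above follows a common alias pattern: the old name keeps working but triggers a warning when set. A rough sketch of that pattern is shown here. The read_flag helper is hypothetical, and three details are assumptions rather than documented ddtrace behavior: the new name winning when both are set, the default of false, and the use of the built-in DeprecationWarning in place of DDTraceDeprecationWarning.

```python
import os
import warnings

NEW_NAME = "DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED"
OLD_NAME = "DD_TRACE_INFERRED_SPANS_ENABLED"


def read_flag(environ=None):
    """Hypothetical helper: read the flag, warning if the deprecated name is set."""
    environ = os.environ if environ is None else environ
    if OLD_NAME in environ:
        # The note says a warning is emitted whenever the old variable is set.
        warnings.warn(
            f"{OLD_NAME} is deprecated; use {NEW_NAME} instead",
            DeprecationWarning,
            stacklevel=2,
        )
    # Assumption: the new name takes precedence and the default is disabled.
    raw = environ.get(NEW_NAME, environ.get(OLD_NAME, "false"))
    return raw.strip().lower() in ("1", "true")
```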
New Features
  • aws_durable_execution_sdk_python: Add tracing support for the aws-durable-execution-sdk-python library. Instruments @durable_execution workflows and DurableContext operations (step, invoke, wait, wait_for_condition, wait_for_callback, create_callback, map, parallel, run_in_child_context) to generate spans.
  • LLM Observability: Adds step spans to the Claude Agent SDK integration. Each inference cycle is now represented by a step container span with an llm child span for the model call and tool child spans for any tool invocations.
  • tracing: Adds a centralized supported-configurations.json registry of all supported DD_* and OTEL_* environment variables, following the same schema used by other Datadog tracing libraries. Accesses to unregistered environment variables now produce a debug log to help identify typos or unsupported configuration options.
  • AI Guard: Copies anomaly-detection attributes from the local root (service-entry) span onto every ai_guard span: ai_guard.http.useragent, ai_guard.http.client_ip, ai_guard.network.client.ip, ai_guard.usr.id and ai_guard.usr.session_id.
  • AI Guard: When DD_AI_GUARD_ENABLED=true is set and an ai_guard span is created during a request, the tracer now populates http.client_ip and network.client.ip on the service-entry (local root) span, mirroring the behavior used for Application Security. If AI Guard does not run during the request, no client IP tags are added. DD_TRACE_CLIENT_IP_ENABLED is ignored once AI Guard reports, and DD_TRACE_CLIENT_IP_HEADER continues to override header resolution.
  • ai_guard: add AI Guard evaluation support to the OpenAI SDK chat completions instrumentation. Both non-streaming and streaming requests and non-streaming responses are evaluated through the configured AI Guard client, and evaluation is automatically skipped when a framework integration (LangChain, Strands Agents) is already evaluating the same call.
  • code origin for spans: The code origin for spans feature has been enabled by default.
  • code origin: attach code origin information to the first span generated by a function wrapped with tracer.wrap.
  • openfeature: This introduces a configurable initialization timeout for DataDogProvider. The timeout controls how long initialize() waits for configuration before returning, and defaults to 10 seconds. Set it via the DD_EXPERIMENTAL_FLAGGING_PROVIDER_INITIALIZATION_TIMEOUT_MS environment variable or the init_timeout constructor parameter.
  • CI Visibility: This introduces Jenkins custom parent ID propagation, which enables Datadog to correlate tests run from Jenkins with their Jenkins jobs and pipelines.
  • LLM Observability: Adds an optional cost_tags argument to LLMObs.annotate() and LLMObs.annotation_context(). Pass a list of tag keys (already set via tags or annotated previously on the same span) to have them attached to the cost and token metrics generated from LLM and embedding spans, which can help break down spend by team, project, org, or any custom dimension.
  • LLM Observability: Adds support for an optional version (string) field on each tool definition dictionary passed to LLMObs.annotate() via the tool_definitions parameter.
  • profiling: Add DD_PROFILING_LOCK_EXCLUDE_MODULES config to skip lock profiling for framework-internal locks. Excluded locks remain native with zero profiling overhead. Set it to a comma-separated list of module prefixes (e.g., django.db,sqlalchemy.pool,urllib3).
  • LLM Observability: Bedrock Agent orchestration step events (model invocations, tool/action group calls, knowledge base lookups, guardrails, rationales) are now emitted as APM child spans of the Bedrock Agent <agent_id> span when LLM Observability is enabled, with the same LLMObs payload shape as before.
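
The DD_PROFILING_LOCK_EXCLUDE_MODULES entry above takes a comma-separated list of module prefixes. As an illustration of how such a setting could be interpreted, here is a small sketch. parse_prefixes and is_excluded are hypothetical helpers, and matching on dotted module boundaries is an assumption; the profiler's actual matching rule may differ.

```python
def parse_prefixes(raw):
    """Split a comma-separated setting into a tuple of non-empty module prefixes."""
    return tuple(part.strip() for part in raw.split(",") if part.strip())


def is_excluded(module_name, prefixes):
    """Return True when module_name equals a prefix or is a submodule of it.

    Assumption: prefixes match on dotted boundaries, so "django.db" covers
    "django.db.backends" but not "django.dbx".
    """
    return any(
        module_name == prefix or module_name.startswith(prefix + ".")
        for prefix in prefixes
    )


# Example value taken from the release note above.
PREFIXES = parse_prefixes("django.db,sqlalchemy.pool,urllib3")
```

With this interpretation, a lock created in django.db.backends.base would be skipped, while a lock created in threading would still be profiled.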
Bug Fixes
  • tracing: Exclude wrapt==2.2.0 from the supported dependency range to avoid a regression that breaks wrapped C descriptors.

  • ai_guard: This fix resolves a conflict between ddtrace.auto and strands when imported in the same file, which left Strands hooks silently disabled. The Strands integration now loads lazily on first attribute access so its event class identities match those the agent dispatches.

  • appsec: Adds telemetry metrics instrum.user_auth.missing_user_login and instrum.user_auth.missing_user_id when Django auth events cannot resolve the expected identity fields, enabling detection of misconfigured user model field mappings.
  • AAP: This fix resolves an issue where the usr.session_id tag was missing from the entry span of authenticated follow-up Django requests when automatic user instrumentation was enabled. They now also carry usr.session_id, matching other authenticated user-tagging paths.
  • azure_cosmos: This change removes the http.status_code tags from Azure CosmosDB spans and replaces them with the use of the db.response.status_code metric. For customers using ddtrace v4.8.0 and relying on the http.status_code tag of cosmosdb.query spans, this is a breaking change.
  • CI Visibility: Fixes an issue in the pytest plugin where a malformed log call emitted a --- Logging error --- traceback to stderr during Attempt to Fix retries, polluting pytest output and contributing to spurious test failures.
  • CI Visibility: Fixes an IndexError in retry bookkeeping that occurred when a test's teardown phase failed. The error produced --- Logging error --- tracebacks in stderr, which could pollute test output and cause spurious test failures during retries. #17863
  • CI Visibility: Fixes a regression where setting DD_TEST_MANAGEMENT_ENABLED=0 was not honored by the new pytest plugin, causing Test Management features such as quarantining, disabling tests, and Attempt to Fix to remain enabled.
  • CI Visibility: Fixes code coverage instrumentation on Python 3.13, 3.14, and 3.15. Resolves lost per-test line data caused by: sys.monitoring callbacks running in a snapshot context where ContextVar changes are not visible (Python 3.14+); empty modules emitting no LINE events (Python 3.13+); and ProcessPoolExecutor child coverage not being propagated to the parent context. Also fixes a stale-data bug where child process executable lines could inflate coverage denominators after stop_coverage() was called before join().
  • datastreams: Demotes the retry limit exceeded submitting pathway stats log from ERROR to WARNING and removes the multi-line traceback from the record. This message fires when the processor cannot reach the agent within its 1-second timeout; the dropped 10 seconds of DSM data is auto-recovered on the next flush.
  • LLM Observability: Fixes a concurrency bug in the Bedrock Agent integration where concurrent invoke_agent calls could orphan or cross-attribute spans due to shared class-level state. Per-invocation state is now used.
  • LLM Observability: This fix resolves an issue where text wrapped in Bedrock Converse guardContent content blocks was rendered as [Unsupported content type: guardContent] in traces, dropping the user's input.
  • Fixed an issue that could have caused some instrumented code to fail to execute correctly when the original function had keyword arguments passed in as a cell variable.
  • CI Visibility: Fixes an issue where tests marked as attempt-to-fix could have failures hidden when they were also quarantined or disabled.
  • django: Stop tagging async view and middleware spans as errored on routine ASGI cancellations (e.g. client disconnects on streaming responses), a regression introduced in 4.8.0rc4. Cancellation still propagates; the span just finishes without error.type='asyncio.exceptions.CancelledError'.
  • django: Fixes DD_DJANGO_DATABASE_SERVICE and DD_DJANGO_DATABASE_SERVICE_NAME, which were previously generated as DD_DJANGO-DATABASE_SERVICE and DD_DJANGO-DATABASE_SERVICE_NAME. The hyphenated names were invalid POSIX identifiers and unusable from most shells. Hyphens in integration names are now normalized to underscores when building env var names. The old hyphenated names are preserved as aliases for backward compatibility. #17952
  • django: API endpoint discovery now supports Django sub-applications mounted with django.urls.include(...). Endpoints are reported with their full URL path, including the parent prefix. For example, a view served at /api/users/ is now reported as /api/users/ instead of losing the /api/ prefix.
  • django: API endpoint discovery now reports the correct HTTP methods for views decorated with @require_http_methods combined with another decorator such as @csrf_exempt; the declared methods are reported instead of a generic wildcard entry.
  • telemetry: tolerate malformed installed distribution metadata so a single bad dist-info entry no longer floods stderr with repeated tracebacks.
  • langchain, botocore: This fix resolves an issue where auto-instrumented langchain_aws.ChatBedrockConverse spans reported an opaque inference-profile ARN identifier as the model name when an inference profile was used. base_model_id, which represents the underlying foundation model, is now checked first when extracting model names, and the botocore Bedrock integration reads the resolved base model from a shared in-process cache populated by langchain so the same resolution applies to the underlying bedrock-runtime span.
  • LLM Observability: This fix resolves an issue where running an experiment with a dataset whose records had null metadata caused the summary evaluator to crash with a TypeError while preparing evaluator inputs.
  • LLM Observability: Changes the default model_name and model_provider of LLM and embedding spans from custom to unknown if not provided or empty. This applies to both auto-instrumented spans and manual instrumentation via LLMObs.llm() / LLMObs.embedding() and the @llm / @embedding decorators.
  • profiling: Fixes an issue where the lock profiler silently stopped capturing lock events when running under ddtrace-run with gevent installed.
  • LLM Observability: The OpenAI integration now preserves assistant message content when tool_calls are present on the same message. #17760
  • openfeature: This fix resolves an issue where DataDogProvider.initialize() returned before configuration was received, causing the OpenFeature SDK to mark the provider as ready to serve evaluations too early and flag evaluations to silently return default values. The provider now waits for configuration before returning.
  • openfeature: Fixes targeting key handling in the OpenFeature provider. None targeting key is now correctly passed to the native evaluator instead of being coerced to empty string. Flags that don't require a targeting key (static, rule-based) now evaluate successfully without one, matching the Datadog provider spec. Additionally, the Rust binding now correctly maps TargetingKeyMissing errors from libdatadog instead of returning a generic error code.
  • tracing: Fixes an issue where the svc.auto process tag produced garbled values such as python_-m_unittest when a process was launched with the full command as a single sys.argv[0] string (e.g. from a Docker ENTRYPOINT, a process manager, or a subprocess call with an unsplit command). The correct module or script name is now extracted in these cases. #17764
  • profiling: A crash that could happen in child processes after fork has been fixed.
  • profiling: A rare crash caused by the memory allocation profiler has been fixed.
  • RemoteConfig: Fixed an issue where deleted remote configurations were not applied, causing stale settings to persist.
  • tracing: This fix resolves a memory leak where reference cycles through a span's properties were invisible to Python's cyclic garbage collector and accumulated proportionally to traced call volume.
  • starlette: This fix resolves an issue where passing middleware=None caused application startup to fail when Starlette tracing was enabled.
  • wsgi: This fix resolves an issue where the http.url tag on inbound request spans contained the WSGI mount prefix twice (for example /admin/admin/users instead of /admin/users) when the application was served behind werkzeug.middleware.dispatcher.DispatcherMiddleware or any other in-process mount that preserves the original request URI in RAW_URI / REQUEST_URI while also setting SCRIPT_NAME.
  • tracing: A crash that occurred when exiting a gevent application with DD_TRACE_DEBUG=1 has been fixed.
  • langchain: Strips interface identifiers (e.g. chat, llm) and path prefixes (e.g. models/) when extracting the model_provider and model_name, so reported values identify the actual provider and model name rather than the LangChain interface or API resource path.
  • llmobs: fixes child spans created within an experiment task not inheriting the dataset_id tag. Previously only dataset_name was propagated via baggage to child spans; dataset_id is now propagated as well, making dataset, project, and experiment context (name and ID) consistent across all spans in an experiment trace.
  • Profiling: This fixes a bug where uploaded profiles would not have a linked span after a fork.
  • profiling: A rare crash happening on systems with small stack sizes when profiling asyncio code has been fixed.
  • propagation: Limits parsing of the W3C tracestate header during tracecontext extraction to 32 list-members and 512 UTF-8 bytes, consistent with the W3C Trace Context specification (https://www.w3.org/TR/trace-context/). Extra list-members and trailing whole entries that would exceed the byte budget are ignored, so unusually large headers no longer expand unbounded work during extraction. The Datadog dd= list-member is preferred: it is kept when present (including when it appears late in the header or alone exceeds the byte cap), and other vendors are dropped first. List-members longer than DD_TRACE_TRACESTATE_ITEM_MAX_CHARS (128) characters are removed first when trimming by list-member count or byte budget, so shorter vendor entries are kept when possible.
  • Fixed a startup deadlock when using snowflake-connector-python >= 4.4.0 with DD_TRACE_SNOWFLAKE_ENABLED=true.
  • tracing: This change fixes an issue in which svc_src is set to m in cases where service matches the _default_service of an active integration config. In such cases, the intended behavior is that svc_src is equal to service. #17712
  • tracing: Parsing incoming baggage HTTP headers now respects DD_TRACE_BAGGAGE_MAX_ITEMS [default 64] and DD_TRACE_BAGGAGE_MAX_BYTES [default 8192], consistent with baggage injection. Previously, extraction could retain every comma-separated entry regardless of those limits. The tracer drops excess pairs and records truncation telemetry when limits apply.
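
The baggage limits described above can be pictured with a simplified extraction loop. The sketch below is not the tracer's implementation: extract_baggage is a hypothetical function, and stopping at the first pair that would exceed either limit, as well as counting only the bytes of each key=value entry, are assumptions made to keep the example short. The default values mirror the ones quoted in the note.

```python
def extract_baggage(header, max_items=64, max_bytes=8192):
    """Parse a baggage header, keeping pairs until an item or byte limit is hit.

    Returns a tuple (items, truncated), where truncated reports whether
    any pair was dropped because of a limit.
    """
    items = {}
    used_bytes = 0
    truncated = False
    for entry in header.split(","):
        entry = entry.strip()
        if not entry or "=" not in entry:
            continue  # skip empty or malformed entries
        size = len(entry.encode("utf-8"))
        if len(items) >= max_items or used_bytes + size > max_bytes:
            # Assumption: everything from the first over-limit pair on is dropped.
            truncated = True
            break
        key, _, value = entry.partition("=")
        items[key.strip()] = value.strip()
        used_bytes += size
    return items, truncated
```

The truncated flag stands in for the truncation telemetry mentioned in the note; a caller could use it to count how often incoming headers exceed the configured limits.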
Other Changes
  • profiling: The ECHION_ALT_VM_READ_FORCE configuration flag has been removed and support for the associated feature has been dropped.
v4.8.7
Bug Fixes
  • tracing: Exclude wrapt==2.2.0 from the supported dependency range to avoid a regression that breaks wrapped C descriptors.
  • LLM Observability: Agent connection error logs are no longer logged when using agentless and not running a Datadog agent.
v4.8.6

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Bug Fixes
  • LLM Observability: Users auto-instrumenting LangChain with Bedrock inference profiles would have their spans associated with the inference profile's ARN instead of the underlying LLM model. We now resolve the correct underlying model if users pass base_model_id to their ChatBedrockConverse instantiation. https://github.com/DataDog/dd-trace-py/pull/18151

Full Changelog: https://github.com/DataDog/dd-trace-py/compare/v4.8.5...v4.8.6

v4.8.5

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Deprecation Notes
  • Tracing: DD_TRACE_INFERRED_SPANS_ENABLED is deprecated and will be removed in 5.0.0. Use DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED instead. The old environment variable continues to work but emits a DDTraceDeprecationWarning when set.
Bug Fixes
  • LLM Observability: This fix resolves an issue where text wrapped in Bedrock Converse guardContent content blocks was rendered as [Unsupported content type: guardContent] in traces, dropping the user's input.
  • LLM Observability: The OpenAI integration now preserves assistant message content when tool_calls are present on the same message. #17760
  • starlette: This fix resolves an issue where passing middleware=None caused application startup to fail when Starlette tracing was enabled.
  • LLM Observability: Resolves an issue in the Claude Agent SDK integration where unnecessary LLM spans were being created and affecting the trace structure. The handler now only opens a new LLM span after a UserMessage that actually contained tool results, so messages without tool results no longer overwrite the in-flight LLM span.
v4.8.3

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Bug Fixes
  • azure_cosmos: This change removes the http.status_code tags from Azure CosmosDB spans and replaces them with the use of the db.response.status_code metric. For customers using ddtrace v4.8.0 and relying on the http.status_code tag of cosmosdb.query spans, this is a breaking change.

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Bug Fixes
  • CI Visibility: This fix resolves an issue where a failure response from the /search_commits endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when search_commits fails, matching the behavior when the /packfile upload itself fails.
  • ai_guard: This fix resolves a conflict between ddtrace.auto and strands when imported in the same file, which left Strands hooks silently disabled. The Strands integration now loads lazily on first attribute access so its event class identities match those the agent dispatches.
  • AAP: This fix resolves an issue where Application and API Protection (AAP) was incorrectly reported as an enabled product in internal telemetry for all services by default. Previously, registering remote configuration listeners caused AAP to be reported as activated even when it was not actually enabled. This had no impact on customers as it only affected internal telemetry data. AAP is now only reported as activated when it is explicitly enabled or enabled through remote configuration.
  • iast: A crash has been fixed.
  • profiling: A crash that could happen in child processes after fork has been fixed.
  • tracing: This fix resolves a memory leak where reference cycles through a span's properties were invisible to Python's cyclic garbage collector and accumulated proportionally to traced call volume.
  • internal: A crash has been fixed.
  • LLM Observability: Fixes incorrect span hierarchy in LLMObs traces when using the ddtrace SDK alongside OTel-based instrumentation (e.g. Strands Agents). OTel gen_ai spans (e.g. invoke_agent) were incorrectly appearing as siblings of their SDK parent span (e.g. call_agent) rather than being nested under it.
  • profiling: A rare crash occurring when profiling asyncio code with many tasks or deep call stacks has been fixed.
  • profiling: A bug in Lock Profiling that could cause crashes when trying to access attributes of custom Lock subclasses (e.g. in Ray) has been fixed.
  • profiling: A race condition which could make asyncio code raise exceptions at exit has been fixed.
  • serverless: AWS Lambda functions now appear under their function name as the service when DD_SERVICE is not explicitly configured. Service remapping rules configured in Datadog will now apply correctly to Lambda spans.

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Bug Fixes
  • ai_guard
    • This fix resolves a conflict between ddtrace.auto and strands when imported in the same file, which left Strands hooks silently disabled. The Strands integration now loads lazily on first attribute access so its event class identities match those the agent dispatches.
  • tracing
    • Limits parsing of the W3C tracestate header during tracecontext extraction to 32 list-members and 512 UTF-8 bytes, consistent with the W3C Trace Context specification (https://www.w3.org/TR/trace-context/). Extra list-members and trailing whole entries that would exceed the byte budget are ignored, so unusually large headers no longer expand unbounded work during extraction. The Datadog dd= list-member is preferred: it is kept when present (including when it appears late in the header or alone exceeds the byte cap), and other vendors are dropped first. List-members longer than DD_TRACE_TRACESTATE_ITEM_MAX_CHARS (128) characters are removed first when trimming by list-member count or byte budget, so shorter vendor entries are kept when possible.
    • Parsing incoming baggage HTTP headers now respects DD_TRACE_BAGGAGE_MAX_ITEMS [default 64] and DD_TRACE_BAGGAGE_MAX_BYTES [default 8192], consistent with baggage injection. Previously, extraction could retain every comma-separated entry regardless of those limits. The tracer drops excess pairs and records truncation telemetry when limits apply.
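
The tracestate trimming rules listed above (32 list-members, 512 UTF-8 bytes, preference for the dd= member, oversized members dropped first) can be sketched as follows. This is a simplified illustration under stated assumptions, not the tracer's code: trim_tracestate is a hypothetical function, the dd= member is emitted first in the result, and the byte budget is measured on the comma-joined output.

```python
def trim_tracestate(header, max_members=32, max_bytes=512, max_item_chars=128):
    """Trim a tracestate header to the member and byte limits, preferring dd=."""
    members = [m.strip() for m in header.split(",") if m.strip()]

    # Separate the first Datadog member from the other vendors.
    dd = None
    others = []
    for member in members:
        if dd is None and member.startswith("dd="):
            dd = member
        else:
            others.append(member)

    def fits(selection):
        total = ([dd] if dd else []) + selection
        size = len(",".join(total).encode("utf-8"))
        return len(total) <= max_members and size <= max_bytes

    if not fits(others):
        # Trimming is needed: members longer than max_item_chars go first.
        others = [m for m in others if len(m) <= max_item_chars]

    kept = []
    for member in others:
        if not fits(kept + [member]):
            break  # this member and all trailing members are ignored
        kept.append(member)

    # The dd= member is always kept, even if it alone exceeds the byte cap.
    return ",".join(([dd] if dd else []) + kept)
```

Headers that already fit within both limits pass through unchanged apart from whitespace normalization and the assumed placement of the dd= member at the front.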

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Bug Fixes
  • django: Stop tagging async view and middleware spans as errored on routine ASGI cancellations (e.g. client disconnects on streaming responses), a regression introduced in 4.8.0rc4. Cancellation still propagates; the span just finishes without error.type='asyncio.exceptions.CancelledError'.
  • tracing: This fix resolves a memory leak where reference cycles through a span's properties were invisible to Python's cyclic garbage collector and accumulated proportionally to traced call volume.

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Upgrade Notes
  • claude_agent_sdk: Tool span resource names have changed from the tool name (e.g. Read, Bash) to claude_agent_sdk.tool. The specific tool name is still available in the span name (e.g. claude_agent_sdk.tool.Read). Users relying on tool resource names should update them accordingly.

  • ray: Adds DD_TRACE_RAY_SUBMISSION_SPANS_ENABLED (default: False) configuration to control Ray submission tracing. Set DD_TRACE_RAY_SUBMISSION_SPANS_ENABLED=true to trace task.submit and actor_method.submit spans. Leave it unset to trace only execution spans. See Ray integration documentation for more details.

  • ray: ray.job.submit spans are removed. Ray job submission outcome is now reported on the existing ray.job span through ray.job.submit_status.

Deprecation Notes
  • Tracing: DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED is deprecated and will be removed in 5.0.0. Use DD_TRACE_INFERRED_SPANS_ENABLED instead. The old environment variable continues to work but emits a DDTraceDeprecationWarning when set.

  • tracing: The pin parameter in ddtrace.contrib.dbapi.TracedConnection, ddtrace.contrib.dbapi.TracedCursor, and ddtrace.contrib.dbapi_async.TracedAsyncConnection is deprecated and will be removed in version 5.0.0. To manage configuration of DB tracing please use integration configuration and environment variables.

  • LLM Observability: Removes support for the RAGAS integration. As an alternative, if you have RAGAS evaluations, you can manually submit these evaluation results. See LLM Observability external evaluation documentation for more information.

New Features
  • AI Guard: Add DD_AI_GUARD_BLOCK environment variable. Defaults to True, which means the blocking behavior configured in the Datadog AI Guard UI (in-app) will be honored. Set to False to force monitor-only mode locally: evaluations are still performed but AIGuardAbortError is never raised, regardless of the in-app blocking setting.

  • AI Guard response objects now include a dict field tag_probs with the probabilities for each tag.

  • CI Visibility: Adds Bazel offline execution support with two modes: manifest mode (DD_TEST_OPTIMIZATION_MANIFEST_FILE), which reads settings and test data from pre-fetched cache files without network access; and payload-files mode (DD_TEST_OPTIMIZATION_PAYLOADS_IN_FILES), which writes test event, coverage, and telemetry payloads as JSON files instead of sending HTTP requests. Both modes can be used independently or together.

  • LLM Observability: Captures individual LLM spans for each Claude model turn within a Claude Agent SDK session. Each LLM span captures the input messages, output messages, model name, and token usage metrics (for claude_agent_sdk >= 0.1.49).

  • AAP: This adds Application Security support for FastAPI and Starlette applications using mounted sub-applications (via app.mount()). WAF evaluation, path parameter extraction, API endpoint discovery, and http.route reporting now correctly account for mount prefixes in sub-application routing.

  • google_cloud_pubsub: This adds tracing for Google Cloud Pub/Sub admin operations on topic, subscription, snapshot, and schema management methods.

  • google_cloud_pubsub: Adds support for Google Cloud Pub/Sub push subscriptions. When a push subscription delivers a message via HTTP, the integration now creates an inferred gcp.pubsub.receive span that captures subscription and message metadata. Use DD_GOOGLE_CLOUD_PUBSUB_PROPAGATION_AS_SPAN_LINKS to control whether the inferred span becomes a child of the producer trace or starts a new trace with the producer context attached as a span link (default: False).

  • LLM Observability: Add ExperimentRun.as_dataframe() to convert experiment run results into a pandas.DataFrame with a two-level MultiIndex on columns. Each top-level group (input, output, expected_output, evaluations, metadata, error, span_id, trace_id) maps to the first index level. Dict-valued fields are flattened one level deep; scalar fields use an empty string as the sub-column name. Each evaluator gets its own column containing the full evaluation result dict. Requires pandas to be installed (pip install pandas).

  • LLM Observability: Adds an eval_scope parameter to LLMObs.submit_evaluation() (one of "span" (default) or "trace"). Use eval_scope="trace" to associate an evaluation with an entire trace by passing the root span context.

  • LLM Observability: Adds LLMObs.get_spans() to retrieve span events from the Datadog platform API (GET /api/v2/llm-obs/v1/spans/events). Supports filtering by trace ID, span ID, span kind, span name, ML app, tags, and time range. Results are auto-paginated. Requires DD_API_KEY and DD_APP_KEY.

  • profiling: Profiles generated from fork-based servers now include a process_type tag with the value main or worker.

  • tracing: Support for making the default span name for @tracer.wrap include the class name has been added. For now, this is opt-in and can be enabled by setting DD_TRACE_WRAP_SPAN_NAME_INCLUDE_CLASS=true. The new naming will become the default in the next major release.

  • llmobs: Adds support for enabling and disabling LLMObs via Remote Configuration.

  • mysql: This introduces tracing support for mysql.connector.aio.connect in the MySQL integration.

  • profiling: Thread sub-sampling is now supported. This allows setting a maximum number of threads to capture stacks for at each sampling interval, which can be used to reduce the CPU overhead of the Stack Profiler.

  • llama_index: Adds APM tracing and LLM Observability support for llama-index-core>=0.11.0. Traces LLM calls, query engines, retrievers, embeddings, and agents. See the llama_index documentation for more information.

  • ASM: Adds a LiteLLM proxy guardrail integration for Datadog AI Guard. The ddtrace.appsec.ai_guard.integrations.litellm.DatadogAIGuardGuardrail class can be registered as a custom guardrail in the LiteLLM proxy to evaluate requests and responses against AI Guard security policies. Requires the LiteLLM proxy guardrails API v2 available since litellm>=1.46.1.

  • azure_cosmos: Add tracing support for Azure CosmosDB. This integration traces CRUD operations on CosmosDB databases, containers, and items.

  • LLM Observability: Introduces a decorator tag to LLM Observability spans that are traced by a function decorator.

  • CI Visibility: adds automatic log correlation and submission so that test logs appear alongside their corresponding test run in Datadog. Set DD_AGENTLESS_LOG_SUBMISSION_ENABLED=true for agentless setups, or DD_LOGS_INJECTION=true when using the Datadog Agent.

  • tracing: Adds support for exporting traces in OTLP HTTP/JSON format via libdatadog. Set OTEL_TRACES_EXPORTER=otlp to send spans to an OTLP endpoint instead of the Datadog Agent.

  • LLM Observability: Experiments accept a pydantic_evals ReportEvaluator as a summary evaluator when its evaluate return annotation is exactly ScalarResult. The scalar value is recorded as the summary evaluation. Report evaluators that declare a broader analysis return type (for example the full ReportAnalysis union) are not accepted as summary evaluators; use a class-based or function summary evaluator instead. Examples and further details can be found in the documentation [here](https://docs.datadoghq.com/llm_observability/guide/evaluation_developer_guide).

    Example:

    from pydantic_evals.evaluators import EqualsExpected
    from pydantic_evals.evaluators import ReportEvaluator
    from pydantic_evals.evaluators import ReportEvaluatorContext
    from pydantic_evals.reporting.analyses import ScalarResult
    
    from ddtrace.llmobs import LLMObs
    
    dataset = LLMObs.create_dataset(
        dataset_name="<DATASET_NAME>",
        description="<DATASET_DESCRIPTION>",
        records=[RECORD_1, RECORD_2, RECORD_3, ...]
    )
    
    class TotalCasesEvaluator(ReportEvaluator):
        def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
            return ScalarResult(
                title='Total Cases',
                value=len(ctx.report.cases),
                unit='cases',
            )
    
    def my_task(input_data, config):
        return input_data["output"]
    
    equals_expected = EqualsExpected()
    summary_evaluator = TotalCasesEvaluator()
    
    experiment = LLMObs.experiment(
        name="<EXPERIMENT_NAME>",
        task=my_task, 
        dataset=dataset,
        evaluators=[equals_expected],
        summary_evaluators=[summary_evaluator],
        description="<EXPERIMENT_DESCRIPTION>."
    )
    
    result = experiment.run()
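
The column layout described for ExperimentRun.as_dataframe() above can be pictured with a small standard-library sketch. The helper name flatten_record and the sample record are hypothetical and are not part of ddtrace; the sketch only mirrors the stated rules (dict-valued fields flattened one level deep, scalar fields keyed with an empty sub-column name, one column per evaluator holding the full result dict).

```python
from typing import Any, Dict, Tuple


def flatten_record(record: Dict[str, Any]) -> Dict[Tuple[str, str], Any]:
    """Map one result record to (group, sub_column) keys (illustrative only)."""
    flat: Dict[Tuple[str, str], Any] = {}
    for group, value in record.items():
        if isinstance(value, dict):
            # Dict-valued fields are flattened exactly one level deep.
            for sub_key, sub_value in value.items():
                flat[(group, sub_key)] = sub_value
        else:
            # Scalar fields use an empty string as the sub-column name.
            flat[(group, "")] = value
    return flat


# Hypothetical record shaped like the groups listed in the release note.
record = {
    "input": {"question": "2+2?"},
    "output": "4",
    "evaluations": {"exact_match": {"value": True, "error": None}},
    "span_id": "123",
}
columns = flatten_record(record)
```

Tuple keys of this shape are what pandas.MultiIndex.from_tuples expects, which is one way a two-level column index could be assembled.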
Bug Fixes
  • CI visibility: This fix resolves issues where CI provider metadata could omit pull request base branch and head commit details or report incorrect pull request values for some providers.

  • AAP: This fix resolves an issue where Application and API Protection (AAP) was incorrectly reported as an enabled product in internal telemetry for all services by default. Previously, registering remote configuration listeners caused AAP to be reported as activated even when it was not actually enabled. This had no impact on customers as it only affected internal telemetry data. AAP is now only reported as activated when it is explicitly enabled or enabled through remote configuration.

  • asgi: Fixed an issue that caused the network.client.ip and http.client_ip span tags to be missing for FastAPI.

  • iast: A crash has been fixed.

  • lambda: Fixes a spurious Unable to create shared memory warning on every AWS Lambda cold start.

  • LLM Observability: Fixes an issue where an APM_TRACING remote configuration payload that did not include an llmobs section would disable LLM Observability on services where it had been enabled programmatically via LLMObs.enable(). Services that enabled LLM Observability via the DD_LLMOBS_ENABLED environment variable were unaffected. The handler now only changes LLM Observability state when the remote configuration payload explicitly carries an llmobs.enabled directive.

  • LLM Observability: Fixes a circular import in ddtrace.llmobs._writer when anthropic, openai, and botocore are installed.

  • Prevent potential crashes when the client library fails to restart a worker thread due to hitting a system resource limit.

  • internal: This fix resolves an issue where reading unknown attributes from ddtrace.internal.process_tags caused a KeyError instead of raising an AttributeError.

  • rq: Fixes compatibility with RQ 2.0. Replaces the removed Job.get_id() with the job.id property, and handles Job.get_status() now raising InvalidJobOperation for expired jobs (e.g. result_ttl=0) instead of returning None. #16682

  • tornado: Fixes an issue where routes inside a nested Tornado application were matched in reverse declaration order, causing a catch-all pattern to win over a more-specific route defined before it. This resulted in incorrect http.route tags on spans.

  • tornado: The http.route tag is now populated for routes whose regex cannot be reversed by Tornado (e.g. patterns containing non-capturing groups such as (?:a|b)). Capturing groups are still rendered as %s, consistent with Tornado's own route format, while non-capturing constructs are kept verbatim.

  • telemetry: This fix resolves an issue where unhandled exceptions raised by importlib.metadata during interpreter shutdown (for example, when Gunicorn workers exit uncleanly after a failed startup) caused update_imported_dependencies to surface errors through sys.excepthook. Failures while discovering dependencies for the app-dependencies-loaded telemetry payload are now logged at debug level and swallowed so they no longer propagate out of the dependency-reporting path.

  • profiling: Fixes noise caused by the profiler attempting to load its native module even when profiling was disabled.

  • profiling: A race condition which could make asyncio code raise exceptions at exit has been fixed.

  • remote_config: This fix resolves an issue where brief Datadog Agent connection errors could drop Remote Configuration polls, causing products such as Dynamic Instrumentation to temporarily appear disabled.

  • LLM Observability: Change the default model_provider and model_name to "unknown" from "custom" when a model did not match any known provider prefix in the Google GenAI, VertexAI, and Google ADK integrations.

  • LLM Observability: This fix resolves tracing issues for pydantic-ai >= 1.63.0 where tool spans and agent instructions were not being properly captured. This fix adds tracing to the ToolManager.execute_tool_call method for newer versions of the library to resolve this issue.

  • celery: remove unnecessary warning log about missing span when using Task.replace().

  • django: Fixes RuntimeError: coroutine ignored GeneratorExit that occurred under ASGI with async views and async middleware hooks on Python 3.13+. Async view methods and middleware hooks are now correctly detected and awaited instead of being wrapped with sync bytecode wrappers.

  • Code Security (IAST): Fixes a missing return in the IAST taint tracking add_aspect native function that caused redundant work when only the right operand of a string concatenation was tainted.

  • openai: Fixes async streaming spans never being finished when using AsyncAPIResponse (e.g. responses.create(stream=True)). The sync handle_request hook called resp.parse() without awaiting the coroutine, preventing the stream from being wrapped in TracedAsyncStream. This caused disconnected LLM Observability traces for streamed sub-agent calls via the OpenAI Agents SDK.

  • Fixed a race condition with internal periodic threads that could have caused a rare crash when forking.

  • ray: This fix resolves an issue where Ray integration spans could use an incorrect service name when the Ray job name was set after instrumentation initialization.

  • tracing: Fixes the svc.auto process tag attribution logic. The tag now correctly reflects the auto-detected service name derived from the script or module entrypoint, matching the service name the tracer would assign to spans.

  • Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.

  • tracing: This fix resolves an issue where applications started with python -m <module> could report entrypoint.name as -m in process tags.

  • apm: Fixed an issue where the network.client.ip and http.client_ip span tags were missing when client IP collection was enabled and the request had no headers.

  • litellm: Fix missing LLMObs spans when routing requests through a litellm proxy. Proxy requests were incorrectly suppressed and resulted in empty or missing LLMObs spans. Proxy requests for OpenAI models are now always handled by the litellm integration.

  • profiling: A rare crash occurring when profiling asyncio code with many tasks or deep call stacks has been fixed.

  • serverless: AWS Lambda functions now appear under their function name as the service when DD_SERVICE is not explicitly configured. Service remapping rules configured in Datadog will now apply correctly to Lambda spans.

  • LLM Observability: Fixes an issue where deeply nested tool schemas in Anthropic and OpenAI integrations were not yet supported. The Anthropic and OpenAI integrations now check each tool's schema depth at extraction time. If a tool's schema exceeds the maximum allowed depth, the schema is truncated.

  • Code Security (IAST): This fix resolves a thread-safety issue in the IAST taint tracking context that could cause vulnerability detection to silently stop working under high concurrency in multi-threaded applications.

  • internal: A crash has been fixed.

  • CI Visibility: This fix resolves an issue where a failure response from the /search_commits endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when search_commits fails, matching the behavior when the /packfile upload itself fails.

  • LLM Observability: Fixes multimodal OpenAI chat completion inputs being rendered as raw iterable objects in LLM Observability traces. Multimodal content parts (text, image, audio) are now properly materialized and formatted as readable text.

  • profiling: A rare crash that could occur post-fork in fork-based applications has been fixed.

  • profiling: A bug in Lock Profiling that could cause crashes when trying to access attributes of custom Lock subclasses (e.g. in Ray) has been fixed.

  • CI Visibility: This fix resolves an issue where pytest-xdist worker crashes (os._exit, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1.

  • profiling: Fixes lock profiling samples not appearing in the Thread Timeline view for events collected on macOS.

  • internal: Fix a potential internal thread leak in fork-heavy applications.

  • internal: This fix resolves an issue where a ModuleNotFoundError could be raised at startup in Python environments without the _ctypes extension module.

  • internal: A crash that could occur post-fork in fork-heavy applications has been fixed.

  • LLM Observability: Fixes incorrect span hierarchy in LLMObs traces when using the ddtrace SDK alongside OTel-based instrumentation (e.g. Strands Agents). OTel gen_ai spans (e.g. invoke_agent) were incorrectly appearing as siblings of their SDK parent span (e.g. call_agent) rather than being nested under it.

  • LLM Observability: Fixes model_name and model_provider reported on AWS Bedrock LLM spans as the model_id full model identifier value (e.g., "amazon.nova-lite-v1:0") and "amazon_bedrock" respectively. Bedrock spans' model_name and model_provider now correctly match backend pricing data, which enables features including cost tracking.

  • LLM Observability: Fixes an issue where deferred tools (defer_loading=True) in Anthropic and OpenAI integrations caused LLMObs span payloads to include full tool descriptions and JSON schemas for every tool in a large catalog. Deferred tool definitions now have their description and schema stripped from span metadata, with only the tool name preserved.
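
The ddtrace.internal.process_tags fix listed above (unknown attributes raising AttributeError rather than KeyError) matters because hasattr() and getattr() with a default only swallow AttributeError. A minimal sketch of the pattern follows, using a hypothetical stand-in class rather than the real module:

```python
class ProcessTagsSketch:
    """Hypothetical stand-in for an object that serves attributes from a dict."""

    def __init__(self, tags):
        self._tags = dict(tags)

    def __getattr__(self, name):
        # __getattr__ runs only after normal attribute lookup has failed.
        tags = self.__dict__.get("_tags", {})
        try:
            return tags[name]
        except KeyError:
            # Translate the lookup failure so hasattr()/getattr() behave normally.
            raise AttributeError(
                f"{type(self).__name__} has no attribute {name!r}"
            ) from None


tags = ProcessTagsSketch({"entrypoint_name": "app"})
fallback = getattr(tags, "missing", "fallback")
```

If the KeyError were allowed to escape, hasattr(tags, "missing") would raise instead of returning False.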

Other Changes
  • remote config: Removes noisy warning log that was being emitted when an unsupported agent config payload was received.

  • ASM: Update default security rules to 1.18.0. Notably, this adds business logic event coverage for Stripe auto-instrumentation and expands WAF rule coverage (ZipSlip detection, file upload with double extension, broader header scanning, and expanded XXE detection).

Estimated end-of-life date, accurate to within three months: 10-2027. See the support level definitions for more information.

Upgrade Notes
  • ray: Adds DD_TRACE_RAY_IGNORED_ACTORS configuration to exclude specific Ray actor methods from instrumentation. Set DD_TRACE_RAY_IGNORED_ACTORS='{"ActorA": ["method1"], "ActorB": "*"}' to leave matching methods or actors uninstrumented while continuing to trace other Ray actor methods. Matching is based on actor class name only.
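
The DD_TRACE_RAY_IGNORED_ACTORS value above is a JSON object keyed by actor class name, where each value is either a list of method names or "*". The following sketch shows one way such a mapping could be interpreted; the helper names parse_ignored_actors and is_ignored are hypothetical and this is not the integration's actual code.

```python
import json


def parse_ignored_actors(raw):
    """Parse the JSON mapping of actor class name to a method list or "*"."""
    return json.loads(raw) if raw else {}


def is_ignored(ignored, actor_class, method):
    """Return True when the actor method should be left uninstrumented."""
    rule = ignored.get(actor_class)
    if rule is None:
        return False  # Actor class not listed: keep tracing it.
    if rule == "*":
        return True  # Wildcard: ignore every method of this actor.
    return isinstance(rule, list) and method in rule


ignored = parse_ignored_actors('{"ActorA": ["method1"], "ActorB": "*"}')
```

Because matching uses only the class name, two actor classes with the same name in different modules would be treated alike under this reading.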
New Features
  • AI Guard: When DD_AI_GUARD_ENABLED=true is set and an ai_guard span is created during a request, the tracer now populates http.client_ip and network.client.ip on the service-entry (local root) span, mirroring the behavior used for Application Security. If AI Guard does not run during the request, no client IP tags are added. DD_TRACE_CLIENT_IP_ENABLED is ignored once AI Guard reports, and DD_TRACE_CLIENT_IP_HEADER continues to override header resolution.
Bug Fixes
  • CI Visibility: Fixes code coverage instrumentation on Python 3.13, 3.14, and 3.15. Resolves lost per-test line data caused by: sys.monitoring callbacks running in a snapshot context where ContextVar changes are not visible (Python 3.14+); empty modules emitting no LINE events (Python 3.13+); and ProcessPoolExecutor child coverage not being propagated to the parent context. Also fixes a stale-data bug where child process executable lines could inflate coverage denominators after stop_coverage() was called before join().
  • django: API endpoint discovery now supports Django sub-applications mounted with django.urls.include(...). Endpoints are reported with their full URL path including the parent prefix — for example, a view served at /api/users/ is now reported as /api/users/ instead of losing the /api/ prefix.
  • django: API endpoint discovery now reports the correct HTTP methods for views decorated with @require_http_methods combined with another decorator such as @csrf_exempt; the declared methods are reported instead of a generic wildcard entry.
  • LLM Observability: This fix resolves an issue where running an experiment with a dataset whose records had null metadata caused the summary evaluator to crash with a TypeError while preparing evaluator inputs.
  • LLM Observability: Changes the default model_name and model_provider of LLM and embedding spans from custom to unknown if not provided or empty. This applies to both auto-instrumented spans and manual instrumentation via LLMObs.llm() / LLMObs.embedding() and the @llm / @embedding decorators.
  • profiling: A crash that could happen in child processes after fork has been fixed.
  • profiling: A rare crash caused by the memory allocation profiler has been fixed.
  • RemoteConfig: Fixed an issue where deleted remote configurations were not applied, causing stale settings to persist.
  • wsgi: This fix resolves an issue where the http.url tag on inbound request spans contained the WSGI mount prefix twice (for example /admin/admin/users instead of /admin/users) when the application was served behind werkzeug.middleware.dispatcher.DispatcherMiddleware or any other in-process mount that preserves the original request URI in RAW_URI / REQUEST_URI while also setting SCRIPT_NAME.
  • profiling: A rare crash happening on systems with small stack sizes when profiling asyncio code has been fixed.
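
The wsgi fix above concerns a mount prefix being counted twice when SCRIPT_NAME is set and RAW_URI / REQUEST_URI still holds the original request URI. The sketch below reproduces the pitfall with a plain WSGI environ dict and shows one possible guard; the function names are hypothetical and this is not the tracer's implementation.

```python
def naive_path(environ):
    """Prepends SCRIPT_NAME unconditionally, which duplicates the mount prefix."""
    return environ.get("SCRIPT_NAME", "") + environ["RAW_URI"]


def request_path(environ):
    """Build the request path without repeating the mount prefix (sketch)."""
    script_name = environ.get("SCRIPT_NAME", "")
    raw_uri = environ.get("RAW_URI") or environ.get("REQUEST_URI")
    if raw_uri is None:
        # No raw URI available: fall back to the standard WSGI pieces.
        return script_name + environ.get("PATH_INFO", "")
    path = raw_uri.split("?", 1)[0]  # Drop any query string.
    if script_name and (path == script_name or path.startswith(script_name + "/")):
        return path  # The raw URI already includes the mount prefix.
    return script_name + path


# Environ as an in-process mount at /admin would leave it.
mounted = {"SCRIPT_NAME": "/admin", "PATH_INFO": "/users", "RAW_URI": "/admin/users"}
```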

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Upgrade Notes
  • claude_agent_sdk: Tool span resource names have changed from the tool name (e.g. Read, Bash) to claude_agent_sdk.tool. The specific tool name is still available in the span name (e.g. claude_agent_sdk.tool.Read). Users relying on tool resource names should update them accordingly.

  • ray: Adds DD_TRACE_RAY_SUBMISSION_SPANS_ENABLED (default: False) configuration to control Ray submission tracing. Set DD_TRACE_RAY_SUBMISSION_SPANS_ENABLED=true to trace task.submit and actor_method.submit spans. Leave it unset to trace only execution spans. See Ray integration documentation for more details.

  • ray: ray.job.submit spans are removed. Ray job submission outcome is now reported on the existing ray.job span through ray.job.submit_status.

Deprecation Notes
  • Tracing: DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED is deprecated and will be removed in 5.0.0. Use DD_TRACE_INFERRED_SPANS_ENABLED instead. The old environment variable continues to work but emits a DDTraceDeprecationWarning when set.

  • tracing: The pin parameter in ddtrace.contrib.dbapi.TracedConnection, ddtrace.contrib.dbapi.TracedCursor, and ddtrace.contrib.dbapi_async.TracedAsyncConnection is deprecated and will be removed in version 5.0.0. To manage configuration of DB tracing please use integration configuration and environment variables.

  • LLM Observability: Removes support for the RAGAS integration. As an alternative, if you have RAGAS evaluations, you can manually submit these evaluation results. See LLM Observability external evaluation documentation for more information.
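
The DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED deprecation above follows a common pattern: the old variable keeps working but warns when set. A minimal sketch of that pattern is shown here. It uses the built-in DeprecationWarning in place of DDTraceDeprecationWarning, and the precedence of the new variable, the False default, and the accepted truthy strings are all assumptions rather than documented behavior.

```python
import os
import warnings

NEW_VAR = "DD_TRACE_INFERRED_SPANS_ENABLED"
OLD_VAR = "DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED"


def inferred_spans_enabled(env=None):
    """Resolve the setting, warning whenever the deprecated variable is set."""
    env = os.environ if env is None else env
    old = env.get(OLD_VAR)
    if old is not None:
        warnings.warn(
            f"{OLD_VAR} is deprecated; use {NEW_VAR} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    # Assumption: the new variable wins when both are present.
    value = env.get(NEW_VAR, old)
    if value is None:
        return False  # Assumed default for this sketch.
    return value.strip().lower() in ("1", "true")
```

Passing a plain dict as env keeps the function easy to exercise without touching the real process environment.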

New Features
  • AI Guard: Add DD_AI_GUARD_BLOCK environment variable. Defaults to True, which means the blocking behavior configured in the Datadog AI Guard UI (in-app) will be honored. Set to False to force monitor-only mode locally: evaluations are still performed but AIGuardAbortError is never raised, regardless of the in-app blocking setting.

  • AI Guard response objects now include a dict field tag_probs with the probabilities for each tag.

  • CI Visibility: Adds Bazel offline execution support with two modes: manifest mode (DD_TEST_OPTIMIZATION_MANIFEST_FILE), which reads settings and test data from pre-fetched cache files without network access; and payload-files mode (DD_TEST_OPTIMIZATION_PAYLOADS_IN_FILES), which writes test event, coverage, and telemetry payloads as JSON files instead of sending HTTP requests. Both modes can be used independently or together.

  • LLM Observability: Captures individual LLM spans for each Claude model turn within a Claude Agent SDK session. Each LLM span captures the input messages, output messages, model name, and token usage metrics (for claude_agent_sdk >= 0.1.49).

  • AAP: This adds Application Security support for FastAPI and Starlette applications using mounted sub-applications (via app.mount()). WAF evaluation, path parameter extraction, API endpoint discovery, and http.route reporting now correctly account for mount prefixes in sub-application routing.

  • google_cloud_pubsub: This adds tracing for Google Cloud Pub/Sub admin operations on topic, subscription, snapshot, and schema management methods.

  • google_cloud_pubsub: Adds support for Google Cloud Pub/Sub push subscriptions. When a push subscription delivers a message via HTTP, the integration now creates an inferred gcp.pubsub.receive span that captures subscription and message metadata. Use DD_GOOGLE_CLOUD_PUBSUB_PROPAGATION_AS_SPAN_LINKS to control whether the inferred span becomes a child of the producer trace or starts a new trace with the producer context attached as a span link (default: False).

  • LLM Observability: Add ExperimentRun.as_dataframe() to convert experiment run results into a pandas.DataFrame with a two-level MultiIndex on columns. Each top-level group (input, output, expected_output, evaluations, metadata, error, span_id, trace_id) maps to the first index level. Dict-valued fields are flattened one level deep; scalar fields use an empty string as the sub-column name. Each evaluator gets its own column containing the full evaluation result dict. Requires pandas to be installed (pip install pandas).

  • LLM Observability: Adds an eval_scope parameter to LLMObs.submit_evaluation() (one of "span" (default) or "trace"). Use eval_scope="trace" to associate an evaluation with an entire trace by passing the root span context.

  • LLM Observability: Adds LLMObs.get_spans() to retrieve span events from the Datadog platform API (GET /api/v2/llm-obs/v1/spans/events). Supports filtering by trace ID, span ID, span kind, span name, ML app, tags, and time range. Results are auto-paginated. Requires DD_API_KEY and DD_APP_KEY.

  • profiling: Profiles generated from fork-based servers now include a process_type tag with the value main or worker.

  • tracing: Support for making the default span name for @tracer.wrap include the class name has been added. For now, this is opt-in and can be enabled by setting DD_TRACE_WRAP_SPAN_NAME_INCLUDE_CLASS=true. The new naming will become the default in the next major release.

  • llmobs: Adds support for enabling and disabling LLMObs via Remote Configuration.

  • mysql: This introduces tracing support for mysql.connector.aio.connect in the MySQL integration.

  • profiling: Thread sub-sampling is now supported. This allows setting a maximum number of threads whose stacks are captured at each sampling interval, which can be used to reduce the CPU overhead of the Stack Profiler.

  • llama_index: Adds APM tracing and LLM Observability support for llama-index-core>=0.11.0. Traces LLM calls, query engines, retrievers, embeddings, and agents. See the llama_index documentation for more information.

  • ASM: Adds a LiteLLM proxy guardrail integration for Datadog AI Guard. The ddtrace.appsec.ai_guard.integrations.litellm.DatadogAIGuardGuardrail class can be registered as a custom guardrail in the LiteLLM proxy to evaluate requests and responses against AI Guard security policies. Requires the LiteLLM proxy guardrails API v2 available since litellm>=1.46.1.

  • azure_cosmos: Add tracing support for Azure CosmosDB. This integration traces CRUD operations on CosmosDB databases, containers, and items.

  • LLM Observability: Introduces a decorator tag to LLM Observability spans that are traced by a function decorator.

  • CI Visibility: adds automatic log correlation and submission so that test logs appear alongside their corresponding test run in Datadog. Set DD_AGENTLESS_LOG_SUBMISSION_ENABLED=true for agentless setups, or DD_LOGS_INJECTION=true when using the Datadog Agent.

  • tracing: Adds support for exporting traces in OTLP HTTP/JSON format via libdatadog. Set OTEL_TRACES_EXPORTER=otlp to send spans to an OTLP endpoint instead of the Datadog Agent.

  • LLM Observability: Experiments accept a pydantic_evals ReportEvaluator as a summary evaluator when its evaluate return annotation is exactly ScalarResult. The scalar value is recorded as the summary evaluation. Report evaluators that declare a broader analysis return type (for example the full ReportAnalysis union) are not accepted as summary evaluators; use a class-based or function summary evaluator instead. Examples and further details can be found in the documentation [here](https://docs.datadoghq.com/llm_observability/guide/evaluation_developer_guide).

    Example:

    from pydantic_evals.evaluators import EqualsExpected
    from pydantic_evals.evaluators import ReportEvaluator
    from pydantic_evals.evaluators import ReportEvaluatorContext
    from pydantic_evals.reporting.analyses import ScalarResult
    
    from ddtrace.llmobs import LLMObs
    
    dataset = LLMObs.create_dataset(
        dataset_name="<DATASET_NAME>",
        description="<DATASET_DESCRIPTION>",
        records=[RECORD_1, RECORD_2, RECORD_3, ...]
    )
    
    class TotalCasesEvaluator(ReportEvaluator):
        def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
            return ScalarResult(
                title='Total Cases',
                value=len(ctx.report.cases),
                unit='cases',
            )
    
    def my_task(input_data, config):
        return input_data["output"]
    
    equals_expected = EqualsExpected()
    summary_evaluator = TotalCasesEvaluator()
    
    experiment = LLMObs.experiment(
        name="<EXPERIMENT_NAME>",
        task=my_task, 
        dataset=dataset,
        evaluators=[equals_expected],
        summary_evaluators=[summary_evaluator],
        description="<EXPERIMENT_DESCRIPTION>."
    )
    
    result = experiment.run()
Bug Fixes
  • CI visibility: This fix resolves issues where CI provider metadata could omit pull request base branch and head commit details or report incorrect pull request values for some providers.

  • AAP: This fix resolves an issue where Application and API Protection (AAP) was incorrectly reported as an enabled product in internal telemetry for all services by default. Previously, registering remote configuration listeners caused AAP to be reported as activated even when it was not actually enabled. This had no impact on customers as it only affected internal telemetry data. AAP is now only reported as activated when it is explicitly enabled or enabled through remote configuration.

  • asgi: Fixed an issue that caused the network.client.ip and http.client_ip span tags to be missing for FastAPI.

  • iast: A crash has been fixed.

  • lambda: Fixes a spurious Unable to create shared memory warning on every AWS Lambda cold start.

  • LLM Observability: Fixes an issue where an APM_TRACING remote configuration payload that did not include an llmobs section would disable LLM Observability on services where it had been enabled programmatically via LLMObs.enable(). Services that enabled LLM Observability via the DD_LLMOBS_ENABLED environment variable were unaffected. The handler now only changes LLM Observability state when the remote configuration payload explicitly carries an llmobs.enabled directive.

  • LLM Observability: Fixes a circular import in ddtrace.llmobs._writer when anthropic, openai, and botocore are installed.

  • Prevent potential crashes when the client library fails to restart a worker thread due to hitting a system resource limit.

  • internal: This fix resolves an issue where reading unknown attributes from ddtrace.internal.process_tags caused a KeyError instead of raising an AttributeError.

  • rq: Fixes compatibility with RQ 2.0. Replaces the removed Job.get_id() with the job.id property, and handles Job.get_status() now raising InvalidJobOperation for expired jobs (e.g. result_ttl=0) instead of returning None. #16682

  • tornado: Fixes an issue where routes inside a nested Tornado application were matched in reverse declaration order, causing a catch-all pattern to win over a more-specific route defined before it. This resulted in incorrect http.route tags on spans.

  • tornado: The http.route tag is now populated for routes whose regex cannot be reversed by Tornado (e.g. patterns containing non-capturing groups such as (?:a|b)). Capturing groups are still rendered as %s, consistent with Tornado's own route format, while non-capturing constructs are kept verbatim.

  • telemetry: This fix resolves an issue where unhandled exceptions raised by importlib.metadata during interpreter shutdown (for example, when Gunicorn workers exit uncleanly after a failed startup) caused update_imported_dependencies to surface errors through sys.excepthook. Failures while discovering dependencies for the app-dependencies-loaded telemetry payload are now logged at debug level and swallowed so they no longer propagate out of the dependency-reporting path.

  • profiling: Fixes noise caused by the profiler attempting to load its native module even when profiling was disabled.

  • profiling: A race condition which could make asyncio code raise exceptions at exit has been fixed.

  • remote_config: This fix resolves an issue where brief Datadog Agent connection errors could drop Remote Configuration polls, causing products such as Dynamic Instrumentation to temporarily appear disabled.

  • LLM Observability: Change the default model_provider and model_name to "unknown" from "custom" when a model did not match any known provider prefix in the Google GenAI, VertexAI, and Google ADK integrations.

  • LLM Observability: This fix resolves tracing issues for pydantic-ai >= 1.63.0 where tool spans and agent instructions were not being properly captured. This fix adds tracing to the ToolManager.execute_tool_call method for newer versions of the library to resolve this issue.

  • celery: remove unnecessary warning log about missing span when using Task.replace().

  • django: Fixes RuntimeError: coroutine ignored GeneratorExit that occurred under ASGI with async views and async middleware hooks on Python 3.13+. Async view methods and middleware hooks are now correctly detected and awaited instead of being wrapped with sync bytecode wrappers.

  • Code Security (IAST): Fixes a missing return in the IAST taint tracking add_aspect native function that caused redundant work when only the right operand of a string concatenation was tainted.

  • openai: Fixes async streaming spans never being finished when using AsyncAPIResponse (e.g. responses.create(stream=True)). The sync handle_request hook called resp.parse() without awaiting the coroutine, preventing the stream from being wrapped in TracedAsyncStream. This caused disconnected LLM Observability traces for streamed sub-agent calls via the OpenAI Agents SDK.

  • Fixed a race condition with internal periodic threads that could have caused a rare crash when forking.

  • ray: This fix resolves an issue where Ray integration spans could use an incorrect service name when the Ray job name was set after instrumentation initialization.

  • tracing: Fixes the svc.auto process tag attribution logic. The tag now correctly reflects the auto-detected service name derived from the script or module entrypoint, matching the service name the tracer would assign to spans.

  • Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.

  • tracing: This fix resolves an issue where applications started with python -m <module> could report entrypoint.name as -m in process tags.

  • apm: Fixed an issue where the network.client.ip and http.client_ip span tags were missing when client IP collection was enabled and the request had no headers.

  • litellm: Fix missing LLMObs spans when routing requests through a litellm proxy. Proxy requests were incorrectly suppressed and resulted in empty or missing LLMObs spans. Proxy requests for OpenAI models are now always handled by the litellm integration.

  • profiling: A rare crash occurring when profiling asyncio code with many tasks or deep call stacks has been fixed.

  • serverless: AWS Lambda functions now appear under their function name as the service when DD_SERVICE is not explicitly configured. Service remapping rules configured in Datadog will now apply correctly to Lambda spans.

  • LLM Observability: Fixes an issue where deeply nested tool schemas in Anthropic and OpenAI integrations were not yet supported. The Anthropic and OpenAI integrations now check each tool's schema depth at extraction time. If a tool's schema exceeds the maximum allowed depth, the schema is truncated.

  • Code Security (IAST): This fix resolves a thread-safety issue in the IAST taint tracking context that could cause vulnerability detection to silently stop working under high concurrency in multi-threaded applications.

  • internal: A crash has been fixed.

  • CI Visibility: This fix resolves an issue where a failure response from the /search_commits endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when search_commits fails, matching the behavior when the /packfile upload itself fails.

  • LLM Observability: Fixes multimodal OpenAI chat completion inputs being rendered as raw iterable objects in LLM Observability traces. Multimodal content parts (text, image, audio) are now properly materialized and formatted as readable text.

  • profiling: A rare crash that could occur post-fork in fork-based applications has been fixed.

  • profiling: A bug in Lock Profiling that could cause crashes when trying to access attributes of custom Lock subclasses (e.g. in Ray) has been fixed.

  • CI Visibility: This fix resolves an issue where pytest-xdist worker crashes (os._exit, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1.

  • profiling: Fixes lock profiling samples not appearing in the Thread Timeline view for events collected on macOS.

  • internal: Fix a potential internal thread leak in fork-heavy applications.

  • internal: This fix resolves an issue where a ModuleNotFoundError could be raised at startup in Python environments without the _ctypes extension module.

  • internal: A crash that could occur post-fork in fork-heavy applications has been fixed.

  • LLM Observability: Fixes incorrect span hierarchy in LLMObs traces when using the ddtrace SDK alongside OTel-based instrumentation (e.g. Strands Agents). OTel gen_ai spans (e.g. invoke_agent) were incorrectly appearing as siblings of their SDK parent span (e.g. call_agent) rather than being nested under it.

  • LLM Observability: Fixes model_name and model_provider reported on AWS Bedrock LLM spans as the model_id full model identifier value (e.g., "amazon.nova-lite-v1:0") and "amazon_bedrock" respectively. Bedrock spans' model_name and model_provider now correctly match backend pricing data, which enables features including cost tracking.

  • LLM Observability: Fixes an issue where deferred tools (defer_loading=True) in Anthropic and OpenAI integrations caused LLMObs span payloads to include full tool descriptions and JSON schemas for every tool in a large catalog. Deferred tool definitions now have their description and schema stripped from span metadata, with only the tool name preserved.
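The depth check described in the tool-schema fix above can be pictured with a minimal sketch. This is not the integrations' actual code: the helper names `schema_depth` and `truncate_schema`, the `MAX_DEPTH` value, and the `"[truncated]"` placeholder are all assumptions made for illustration.

```python
from typing import Any

MAX_DEPTH = 3  # hypothetical limit; the real maximum is not stated in these notes


def schema_depth(node: Any) -> int:
    """Return how many levels of dicts/lists are nested in a JSON-like value."""
    if isinstance(node, dict):
        return 1 + max((schema_depth(v) for v in node.values()), default=0)
    if isinstance(node, list):
        return 1 + max((schema_depth(v) for v in node), default=0)
    return 0


def truncate_schema(node: Any, max_depth: int = MAX_DEPTH) -> Any:
    """Copy the schema, replacing containers nested deeper than max_depth."""
    if not isinstance(node, (dict, list)):
        return node
    if max_depth == 0:
        return "[truncated]"  # hypothetical placeholder for a cut-off branch
    if isinstance(node, dict):
        return {k: truncate_schema(v, max_depth - 1) for k, v in node.items()}
    return [truncate_schema(v, max_depth - 1) for v in node]


deep = {"type": "object", "properties": {"a": {"properties": {"b": {"type": "string"}}}}}
shallow = truncate_schema(deep)
```

Checking depth once at extraction time keeps the span payload bounded regardless of how deeply a tool author nests a schema.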

Other Changes
  • remote config: Removes noisy warning log that was being emitted when an unsupported agent config payload was received.

  • ASM: Update default security rules to 1.18.0. Notably, this adds business logic event coverage for Stripe auto-instrumentation and expands WAF rule coverage (ZipSlip detection, file upload with double extension, broader header scanning, and expanded XXE detection).

Estimated end-of-life date, accurate to within three months: 06-2027. See the support level definitions for more information.

Bug Fixes
  • Fixes a race condition with internal periodic threads that could have caused a rare crash when forking.
  • Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.
  • CI Visibility: This fix resolves an issue where pytest-xdist worker crashes (os._exit, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1.

Estimated end-of-life date, accurate to within three months: 06-2027. See the support level definitions for more information.

Bug Fixes
  • Fixed a race condition with internal periodic threads that could have caused a rare crash when forking.
  • Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.

Estimated end-of-life date, accurate to within three months: 06-2027. See the support level definitions for more information.

Bug Fixes
  • Fixed a race condition with internal periodic threads that could have caused a rare crash when forking.

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Upgrade Notes
  • ray
    • ray.job.submit spans are removed. Ray job submission outcome is now reported on the existing ray.job span through ray.job.submit_status.
Deprecation Notes
  • LLM Observability
    • Removes support for the RAGAS integration. As an alternative, if you have RAGAS evaluations, you can manually submit these evaluation results. See LLM Observability external evaluation documentation for more information.
  • tracing
    • The pin parameter in ddtrace.contrib.dbapi.TracedConnection, ddtrace.contrib.dbapi.TracedCursor, and ddtrace.contrib.dbapi_async.TracedAsyncConnection is deprecated and will be removed in version 5.0.0. To manage configuration of DB tracing please use integration configuration and environment variables.
    • DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED is deprecated and will be removed in 5.0.0. Use DD_TRACE_INFERRED_SPANS_ENABLED instead. The old environment variable continues to work but emits a DDTraceDeprecationWarning when set.
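The environment-variable deprecation above follows a common fallback pattern, sketched below. The helper `inferred_spans_enabled`, the precedence of the new variable over the old one, the accepted truthy strings, and the `False` default are all assumptions for illustration; ddtrace itself emits `DDTraceDeprecationWarning`, for which the standard `DeprecationWarning` stands in here.

```python
import os
import warnings
from typing import Mapping, Optional

NEW_VAR = "DD_TRACE_INFERRED_SPANS_ENABLED"
OLD_VAR = "DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED"
_TRUTHY = ("1", "true")  # assumed set of accepted values


def inferred_spans_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: prefer the new variable, fall back to the old one."""
    env = os.environ if env is None else env
    if NEW_VAR in env:
        return env[NEW_VAR].strip().lower() in _TRUTHY
    if OLD_VAR in env:
        # The old name still works but warns so users can migrate.
        warnings.warn(
            f"{OLD_VAR} is deprecated; use {NEW_VAR} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return env[OLD_VAR].strip().lower() in _TRUTHY
    return False
```

Passing the environment as a mapping keeps the helper easy to exercise without mutating `os.environ`.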
New Features
  • profiling

    • Thread sub-sampling is now supported. This allows setting a maximum number of threads whose stacks are captured at each sampling interval, which can reduce the CPU overhead of the Stack Profiler.
  • ASM

    • Adds a LiteLLM proxy guardrail integration for Datadog AI Guard. The ddtrace.appsec.ai_guard.integrations.litellm.DatadogAIGuardGuardrail class can be registered as a custom guardrail in the LiteLLM proxy to evaluate requests and responses against AI Guard security policies. Requires the LiteLLM proxy guardrails API v2 available since litellm>=1.46.1.
  • azure_cosmos

    • Add tracing support for Azure CosmosDB. This integration traces CRUD operations on CosmosDB databases, containers, and items.
  • CI Visibility

    • adds automatic log correlation and submission so that test logs appear alongside their corresponding test run in Datadog. Set DD_AGENTLESS_LOG_SUBMISSION_ENABLED=true for agentless setups, or DD_LOGS_INJECTION=true when using the Datadog Agent.
  • llama_index

    • Adds APM tracing and LLM Observability support for llama-index-core>=0.11.0. Traces LLM calls, query engines, retrievers, embeddings, and agents. See the llama_index documentation for more information.
  • tracing

    • Adds support for exporting traces in OTLP HTTP/JSON format via libdatadog. Set OTEL_TRACES_EXPORTER=otlp to send spans to an OTLP endpoint instead of the Datadog Agent.
  • mysql

    • This introduces tracing support for mysql.connector.aio.connect in the MySQL integration.
  • LLM Observability

    • Adds support for enabling and disabling LLMObs via Remote Configuration.
    • Introduces a decorator tag to LLM Observability spans that are traced by a function decorator.
    • Experiments accept a pydantic_evals ReportEvaluator as a summary evaluator when its evaluate return annotation is exactly ScalarResult. The scalar value is recorded as the summary evaluation. Report evaluators that declare a broader analysis return type (for example the full ReportAnalysis union) are not accepted as summary evaluators; use a class-based or function summary evaluator instead. Examples and further documentation can be found in our documentation [here](https://docs.datadoghq.com/llm_observability/guide/evaluation_developer_guide).

    Example:

    from pydantic_evals.evaluators import EqualsExpected
    from pydantic_evals.evaluators import ReportEvaluator
    from pydantic_evals.evaluators import ReportEvaluatorContext
    from pydantic_evals.reporting.analyses import ScalarResult
    
    from ddtrace.llmobs import LLMObs
    
    dataset = LLMObs.create_dataset(
        dataset_name="<DATASET_NAME>",
        description="<DATASET_DESCRIPTION>",
        records=[RECORD_1, RECORD_2, RECORD_3, ...]
    )
    
    class TotalCasesEvaluator(ReportEvaluator):
        def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
            return ScalarResult(
                title='Total Cases',
                value=len(ctx.report.cases),
                unit='cases',
            )
    
    def my_task(input_data, config):
        return input_data["output"]
    
    equals_expected = EqualsExpected()
    summary_evaluator = TotalCasesEvaluator()
    
    experiment = LLMObs.experiment(
        name="<EXPERIMENT_NAME>",
        task=my_task,
        dataset=dataset,
        evaluators=[equals_expected],
        summary_evaluators=[summary_evaluator],
        description="<EXPERIMENT_DESCRIPTION>."
    )
    
    result = experiment.run()
Bug Fixes
  • profiling
    • Fixes lock profiling samples not appearing in the Thread Timeline view for events collected on macOS.
    • A rare crash that could occur post-fork in fork-based applications has been fixed.
    • A bug in Lock Profiling that could cause crashes when trying to access attributes of custom Lock subclasses (e.g. in Ray) has been fixed.
  • internal
    • Fix a potential internal thread leak in fork-heavy applications.
    • This fix resolves an issue where a ModuleNotFoundError could be raised at startup in Python environments without the _ctypes extension module.
    • A crash that could occur post-fork in fork-heavy applications has been fixed.
    • A crash has been fixed.
  • LLM Observability
    • Fixes incorrect span hierarchy in LLMObs traces when using the ddtrace SDK alongside OTel-based instrumentation (e.g. Strands Agents). OTel gen_ai spans (e.g. invoke_agent) were incorrectly appearing as siblings of their SDK parent span (e.g. call_agent) rather than being nested under it.
    • Fixes multimodal OpenAI chat completion inputs being rendered as raw iterable objects in LLM Observability traces. Multimodal content parts (text, image, audio) are now properly materialized and formatted as readable text.
    • Fixes model_name and model_provider reported on AWS Bedrock LLM spans as the model_id full model identifier value (e.g., "amazon.nova-lite-v1:0") and "amazon_bedrock" respectively. Bedrock spans' model_name and model_provider now correctly match backend pricing data, which enables features including cost tracking.
    • Fixes an issue where deferred tools (defer_loading=True) in Anthropic and OpenAI integrations caused LLMObs span payloads to include full tool descriptions and JSON schemas for every tool in a large catalog. Deferred tool definitions now have their description and schema stripped from span metadata, with only the tool name preserved.
    • Fixes an issue where deeply nested tool schemas in Anthropic and OpenAI integrations were not yet supported. The Anthropic and OpenAI integrations now check each tool's schema depth at extraction time. If a tool's schema exceeds the maximum allowed depth, the schema is truncated.
  • CI Visibility
    • This fix resolves an issue where pytest-xdist worker crashes (os._exit, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1.
    • This fix resolves an issue where a failure response from the /search_commits endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when search_commits fails, matching the behavior when the /packfile upload itself fails.
  • Code Security (IAST)
    • This fix resolves a thread-safety issue in the IAST taint tracking context that could cause vulnerability detection to silently stop working under high concurrency in multi-threaded applications.
    • Fixes a missing return in the IAST taint tracking add_aspect native function that caused redundant work when only the right operand of a string concatenation was tainted.
  • celery:
    • remove unnecessary warning log about missing span when using Task.replace().
  • django:
    • Fixes RuntimeError: coroutine ignored GeneratorExit that occurred under ASGI with async views and async middleware hooks on Python 3.13+. Async view methods and middleware hooks are now correctly detected and awaited instead of being wrapped with sync bytecode wrappers.
  • ray
    • This fix resolves an issue where Ray integration spans could use an incorrect service name when the Ray job name was set after instrumentation initialization.
  • Other
    • Fixed a race condition with internal periodic threads that could have caused a rare crash when forking.
    • Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.
  • tracing:
    • Fixes the svc.auto process tag attribution logic. The tag now correctly reflects the auto-detected service name derived from the script or module entrypoint, matching the service name the tracer would assign to spans.
    • This fix resolves an issue where applications started with python -m <module> could report entrypoint.name as -m in process tags.
    • Fixed an issue where network.client.ip and http.client_ip span tags were missing when client IP collection was enabled and request had no headers.
  • litellm:
    • Fix missing LLMObs spans when routing requests through a litellm proxy. Proxy requests were incorrectly suppressed and resulted in empty or missing LLMObs spans. Proxy requests for OpenAI models are now always handled by the litellm integration.
  • serverless:
    • AWS Lambda functions now appear under their function name as the service when DD_SERVICE is not explicitly configured. Service remapping rules configured in Datadog will now apply correctly to Lambda spans.
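The Django fix above hinges on detecting whether a view or middleware hook is async before wrapping it. A minimal sketch of that detection, using only the standard library, might look as follows. `trace_view` and the `calls` log are hypothetical names, and this does not reproduce the bytecode wrapping that ddtrace applies to sync functions.

```python
import asyncio
import functools
import inspect
from typing import Any, Callable, List, Tuple


def trace_view(func: Callable[..., Any], calls: List[Tuple[str, str]]) -> Callable[..., Any]:
    """Hypothetical wrapper: choose an async or sync wrapper to match func."""
    if inspect.iscoroutinefunction(func):

        @functools.wraps(func)
        async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
            calls.append(("async", func.__name__))
            # Awaiting here lets the coroutine run to completion inside the wrapper.
            return await func(*args, **kwargs)

        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args: Any, **kwargs: Any) -> Any:
        calls.append(("sync", func.__name__))
        return func(*args, **kwargs)

    return sync_wrapper


async def async_view(request: str) -> str:
    return f"async:{request}"


def sync_view(request: str) -> str:
    return f"sync:{request}"


calls: List[Tuple[str, str]] = []
async_result = asyncio.run(trace_view(async_view, calls)("req"))
sync_result = trace_view(sync_view, calls)("req")
```

Wrapping an async function with a sync wrapper would hand back an un-awaited coroutine, which is the kind of mismatch the fix avoids.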

Estimated end-of-life date, accurate to within three months: 05-2027. See the support level definitions for more information.

Upgrade Notes
  • profiling
    • This compiles the lock profiler's hot path to C via Cython, reducing per-operation overhead. At the default 1% capture rate, lock operations are ~49% faster for both contended and uncontended workloads. At 100% capture, gains are ~15-19%. No configuration changes are required.
  • openfeature
    • The minimum required version of openfeature-sdk is now 0.8.0 (previously 0.6.0). This is required for the finally_after hook to receive evaluation details for metrics tracking.
API Changes
  • openfeature
    • Flag evaluations for non-existent flags now return Reason.ERROR with ErrorCode.FLAG_NOT_FOUND instead of Reason.DEFAULT when configuration is available but the flag is not found. The previous behavior (Reason.DEFAULT) is preserved when no configuration is loaded. This aligns Python with other Datadog SDK implementations.
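The three outcomes described in the openfeature API change can be summarized in a small decision function. This is a sketch only: the `Reason` and `ErrorCode` enums below are local stand-ins for the OpenFeature SDK types, `resolve_flag` is a hypothetical name, and returning `Reason.STATIC` for a found flag is an assumption.

```python
from enum import Enum
from typing import Any, Mapping, Optional, Tuple


class Reason(Enum):  # local stand-in for the OpenFeature SDK's Reason
    DEFAULT = "DEFAULT"
    ERROR = "ERROR"
    STATIC = "STATIC"


class ErrorCode(Enum):  # local stand-in for the OpenFeature SDK's ErrorCode
    FLAG_NOT_FOUND = "FLAG_NOT_FOUND"


def resolve_flag(
    config: Optional[Mapping[str, Any]], key: str, default: Any
) -> Tuple[Any, Reason, Optional[ErrorCode]]:
    """Hypothetical resolver mirroring the behavior described above."""
    if config is None:
        # No configuration loaded: previous behavior is preserved.
        return default, Reason.DEFAULT, None
    if key not in config:
        # Configuration is available but the flag does not exist.
        return default, Reason.ERROR, ErrorCode.FLAG_NOT_FOUND
    return config[key], Reason.STATIC, None
```

In every case the caller still receives the default value; only the reported reason and error code differ.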
New Features
  • mlflow

    • Adds a request header provider (auth plugin) for MLFlow. If the environment variables DD_API_KEY, DD_APP_KEY and DD_MODEL_LAB_ENABLED are set, HTTP requests to the MLFlow tracking server will include the DD-API-KEY and DD-APPLICATION-KEY headers. #16685
  • ai_guard

    • Calls to evaluate now block if blocking was enabled for the service in the AI Guard UI. The block parameter now defaults to True; pass block=False to disable this behavior.
    • This updates the AI Guard API client to return Sensitive Data Scanner (SDS) results in the SDK response.
    • This introduces AI Guard support for Strands Agents. The Plugin API requires strands-agents>=1.29.0; the HookProvider works with any version that exposes the hooks system.
  • azure_durable_functions

    • Add tracing support for Azure Durable Functions. This integration traces durable activity and entity functions.
  • profiling

    • This adds process tags to profiler payloads. To deactivate this feature, set DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false.
  • runtime metrics

    • This adds process tags to runtime metrics tags. To deactivate this feature, set DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false.
  • remote configuration

    • This adds process tags to remote configuration payloads. To deactivate this feature, set DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false.
  • dynamic instrumentation

    • This adds process tags to debugger payloads. To deactivate this feature, set DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false.
  • crashtracking

    • This adds process tags to crash tracking payloads. To deactivate this feature, set DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false.
  • data streams monitoring

    • This adds process tags to Data Streams Monitoring payloads. To deactivate this feature, set DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false.
  • database monitoring

    • This adds process tags to Database Monitoring SQL service hash propagation. To deactivate this feature, set DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false.
  • Stats computation

    • This adds process tags to stats computation payloads. To deactivate this feature, set DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false.
  • LLM Observability

    • Adds support for capturing stop_reason and structured_output from the Claude Agent SDK integration.

    • Adds support for user-defined dataset record IDs. Users can now supply an optional id field when creating dataset records via Dataset.append(), Dataset.extend(), create_dataset(), or create_dataset_from_csv() (via the new id_column parameter). If no id is provided, the SDK generates one automatically.

    • Experiment tasks can now optionally receive dataset record metadata as a third metadata parameter. Tasks with the existing (input_data, config) signature continue to work unchanged.

    • This introduces RemoteEvaluator which allows users to reference LLM-as-Judge evaluations configured in the Datadog UI by name when running local experiments. For more information, see the documentation: https://docs.datadoghq.com/llm_observability/guide/evaluation_developer_guide/#using-managed-evaluators

    • This adds cache creation breakdown metrics for the Anthropic integration. When making Anthropic calls with prompt caching, ephemeral_5m_input_tokens and ephemeral_1h_input_tokens metrics are now reported, distinguishing between 5 minute and 1 hour prompt caches.

    • Adds support for reasoning and extended thinking content in Anthropic, LiteLLM, and OpenAI-compatible integrations. Anthropic thinking blocks (type: "thinking") are now captured as role: "reasoning" messages in both streaming and non-streaming responses, as well as in input messages for tool use continuations. LiteLLM now extracts reasoning_output_tokens from completion_tokens_details and captures reasoning_content in output messages for OpenAI-compatible providers.

    • LLMJudge now forwards any extra client_options to the underlying provider client constructor. This allows passing provider-specific options such as base_url, timeout, organization, or max_retries directly through client_options.

    • Dataset records' tags can now be operated on with 3 new Dataset methods: dataset.add_tags, dataset.remove_tags, and dataset.replace_tags. All 3 new methods accept an int indicating the zero-based index of the record to operate on, and a list of strings in key:value format representing the tags. For example, if the tag "env:prod" exists on the 1st record of the dataset ds, calling ds.remove_tags(0, ["env:prod"]) will update the local state of the dataset record to have the "env:prod" tag removed.

    • Change experiment execution to run evaluators immediately after each record's task completes instead of batching all tasks first. Experiment spans and evaluation metrics are now posted incrementally as records complete rather than waiting until the end. This improves progress visibility and preserves partial results if a run fails midway.

    • Adds support for Pydantic AI evaluations in LLM Observability Experiments by allowing users to pass a pydantic evaluation (which inherits from Evaluator) in an LLM Obs Experiment.

      Example:

      from pydantic_evals.evaluators import EqualsExpected

      from ddtrace.llmobs import LLMObs

      dataset = LLMObs.create_dataset(
          dataset_name="<DATASET_NAME>",
          description="<DATASET_DESCRIPTION>",
          records=[RECORD_1, RECORD_2, RECORD_3, ...]
      )

      def my_task(input_data, config):
          return input_data["output"]

      def my_summary_evaluator(inputs, outputs, expected_outputs, evaluators_results):
          return evaluators_results["Correctness"].count(True)

      equals_expected = EqualsExpected()

      experiment = LLMObs.experiment(
          name="<EXPERIMENT_NAME>",
          task=my_task,
          dataset=dataset,
          evaluators=[equals_expected],
          summary_evaluators=[my_summary_evaluator],  # optional, used to summarize the experiment results
          description="<EXPERIMENT_DESCRIPTION>."
      )

      result = experiment.run()

  • tracer

    • This introduces API endpoint discovery support for Tornado applications. HTTP endpoints are now automatically collected at application startup and reported via telemetry, bringing Tornado in line with Flask, FastAPI, and Django.
    • This adds process tags to trace payloads. To deactivate this feature, set DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false.
    • Adds instrumentation support for mlflow>=2.11.0. See the mlflow documentation (https://ddtrace.readthedocs.io/en/stable/integrations.html#mlflow) for more information.
    • Adds process tags to the client-side stats payload.
  • aiohttp

    • Fixed an issue where spans captured an incomplete URL (e.g. /status/200) when aiohttp.ClientSession was initialized with a base_url. The span now records the fully-resolved URL (e.g. http://host:port/status/200), matching aiohttp's internal behaviour.
  • pymongo

    • Adds a new configuration option, DD_TRACE_MONGODB_OBFUSCATION, to control whether mongodb.query is obfuscated. Resource names always remain normalized regardless of the value. To preserve raw mongodb.query values, pair with DD_APM_OBFUSCATION_MONGODB_ENABLED=false on the Datadog Agent. See the Datadog trace obfuscation documentation for details.
  • google_cloud_pubsub

    • Add tracing support for the google-cloud-pubsub library. Instruments PublisherClient.publish() and SubscriberClient.subscribe() to generate spans for message publishing and consuming, with optional distributed trace context propagation via message attributes. Use DD_GOOGLE_CLOUD_PUBSUB_PROPAGATION_ENABLED to control context propagation (default: True) and DD_GOOGLE_CLOUD_PUBSUB_PROPAGATION_AS_SPAN_LINKS to attach propagated context as span links instead of re-parenting subscriber spans under the producer trace (default: False).
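The LLM Observability note above says experiment tasks may optionally take a third metadata parameter while two-parameter tasks keep working. One way such backward-compatible dispatch could be done is by inspecting the task's signature, sketched below. `call_task` is a hypothetical helper; the notes do not say how the SDK itself detects the extra parameter.

```python
import inspect
from typing import Any, Callable, Dict


def call_task(
    task: Callable[..., Any],
    input_data: Dict[str, Any],
    config: Dict[str, Any],
    metadata: Dict[str, Any],
) -> Any:
    """Hypothetical dispatcher: pass metadata only to tasks that accept it."""
    params = inspect.signature(task).parameters
    if len(params) >= 3:
        return task(input_data, config, metadata)
    return task(input_data, config)


def legacy_task(input_data, config):
    # Existing (input_data, config) signature keeps working unchanged.
    return input_data["question"].upper()


def metadata_task(input_data, config, metadata):
    # New signature receives the dataset record's metadata as well.
    return f'{input_data["question"]} [{metadata["source"]}]'


record = {"question": "ping"}
legacy_output = call_task(legacy_task, record, {}, {"source": "csv"})
metadata_output = call_task(metadata_task, record, {}, {"source": "csv"})
```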
Bug Fixes
  • AAP
    • Fix multipart request body parsing to preserve all values when the same field name appears multiple times. Previously, only the last value was kept for duplicate keys in multipart/form-data bodies, which could allow an attacker to bypass WAF inspection by hiding a malicious value among safe ones.
    • Fixes a minor issue where the ASGI middleware used the framework-resolved scope["path"] instead of scope["raw_path"] for WAF URI evaluation. In rare cases where the URI contained path traversal sequences, these could be resolved before reaching the WAF, potentially affecting a small number of URI-based detection rules on ASGI-based frameworks like FastAPI and Starlette.
    • This fix resolves an issue where RASP exploit prevention stack traces incorrectly included ddtrace internal frames at the top. Stack traces now correctly show only user and library frames.
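The multipart fix above concerns repeated field names. The difference between keeping only the last value and keeping all values can be seen in a minimal sketch on plain (name, value) pairs. The function names are hypothetical and this is not ddtrace's parser, only an illustration of why last-value-wins can hide a payload from inspection.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Field = Tuple[str, str]


def last_value_wins(fields: Iterable[Field]) -> Dict[str, str]:
    """Collapse into a plain dict: a later duplicate overwrites earlier values."""
    return dict(fields)


def keep_all_values(fields: Iterable[Field]) -> Dict[str, List[str]]:
    """Group values per field name so every duplicate stays visible."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name, value in fields:
        grouped[name].append(value)
    return dict(grouped)


fields = [("comment", "<script>alert(1)</script>"), ("comment", "hello")]
collapsed = last_value_wins(fields)
preserved = keep_all_values(fields)
```

With the collapsed form, only "hello" would reach an inspection step; the grouped form exposes both values.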
  • IAST
    • This fix resolves an issue where duplicate UNVALIDATED_REDIRECT vulnerabilities could be reported for a single redirect() call.
  • Fixed an issue with internal periodic threads that could have caused a crash during shutdown if a fork occurred.
  • telemetry
    • fix extended heartbeat payload key from "configurations" to "configuration" to match the telemetry v2 API spec.
  • langgraph
    • Fixed an issue where GraphInterrupt exceptions were incorrectly marked as errors in APM traces. GraphInterrupt is a control-flow exception used in LangGraph's human-in-the-loop workflows and should not be treated as an error condition.
  • mcp
    • Fixes anyio.ClosedResourceError raised during MCP server session teardown when the ddtrace MCP integration is enabled.
  • CI Visibility
    • Fixes an issue where pytest plugins pytest-rerunfailures and flaky were silently overridden by the ddtrace plugin. With this change, external rerun plugins will now drive retries as expected when Auto Test Retries and Early Flake Detection features are both disabled, otherwise our retry mechanism takes precedence and a warning is emitted.
    • Fixed an issue where a test marked as both quarantined and attempt-to-fix was incorrectly treated as plain quarantined, preventing attempt-to-fix flow from running.
    • Fix an unhandled RuntimeError that occurred when the git binary was not available. Git metadata upload is now skipped gracefully with a warning instead of aborting pytest startup.
    • pytest:
      • Fixed missing ITR tags in the new pytest plugin that caused time saved by Test Impact Analysis to not appear in dashboards.
    • Fixes an issue where HTTP 429 (Too Many Requests) responses from the Datadog backend were treated as non-retriable errors, causing CI visibility data to be dropped when the backend applied rate limiting. The backend connector now retries on 429 responses and respects the X-RateLimit-Reset header when present to determine the retry delay.
  • tracing
    • Resolves an issue where a RuntimeError could be raised when iterating over the context._meta dictionary while creating spans or generating distributed traces.
    • fixes an issue where telemetry debug mode was incorrectly enabled by DD_TRACE_DEBUG instead of its own dedicated environment variable DD_INTERNAL_TELEMETRY_DEBUG_ENABLED. Setting DD_TRACE_DEBUG=true no longer enables telemetry debug mode. To enable telemetry debug mode, set DD_INTERNAL_TELEMETRY_DEBUG_ENABLED=true.
    • Fix _dd.p.ksr span tag formatting for very small sampling rates. Previously, rates below 0.001 could be output in scientific notation (e.g. 1e-06). Now always uses decimal notation with up to 6 decimal digits.
    • Sampling rules no longer exit early if a single rule is missing service and name.
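The _dd.p.ksr formatting fix above can be illustrated in plain Python, where the default string conversion of a small float switches to scientific notation. The helper `format_sampling_rate` is hypothetical, and trimming trailing zeros is an assumption about the desired output; only the "decimal notation with up to 6 decimal digits" rule comes from the note.

```python
def format_sampling_rate(rate: float) -> str:
    """Format a rate in decimal notation with at most 6 decimal digits."""
    fixed = f"{rate:.6f}"  # fixed-point formatting never uses an exponent
    return fixed.rstrip("0").rstrip(".")  # assumed: drop trailing zeros


default_repr = str(1e-06)  # Python's default conversion uses an exponent here
tiny = format_sampling_rate(1e-06)
half = format_sampling_rate(0.5)
```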
  • LLM Observability
    • This fix avoids potential JSONDecodeError when parsing tool call arguments from streamed Anthropic response message chunks.
    • Fixes a FileNotFoundError in prompt optimization where the system prompt template was stored as a .md file that was excluded from release wheels. The template is now embedded in a Python module to ensure it is always available at runtime.
    • Corrected the DROPPED_VALUE_TEXT warning message to reference the actual 5MB size limit. The size limit itself has not changed; only the message text was updated from an incorrect 1MB reference to the correct 5MB value.
    • This fix resolves an issue where cache_creation_input_tokens and cache_read_input_tokens were not captured when using the LiteLLM integration with providers that support prompt caching (e.g., Anthropic, OpenAI, Deepseek).
    • Fixes an issue where the @llm decorator raised a LLMObsAnnotateSpanError exception when a decorated function returned a value that could not be parsed as LLM messages. Note that manual annotation still overrides this automatic annotation.
    • Fixes an issue where the @llm decorator did not automatically annotate the return value as output in traces. The decorator now captures the return value and annotates it as output, consistent with @workflow and @task decorators. Manual annotations via LLMObs.annotate() still take precedence.
    • Fixes an issue where Pydantic model outputs nested inside lists, tuples, or dicts were serialized as unreadable repr() strings instead of JSON. Pydantic v1 and v2 models are now properly serialized using model_dump() or .dict() respectively.
    • Fixes an issue where SDK-side LLMObs spans (e.g. LLMObs.workflow()) and OTel-bridged spans (e.g. from Strands Agents with DD_TRACE_OTEL_ENABLED=1) produced separate LLMObs traces instead of a single unified trace.
    • Fixes an issue where the payload size limit and event size limit were hardcoded and could not be configured. These are now configurable via the DD_LLMOBS_PAYLOAD_SIZE_BYTES and DD_LLMOBS_EVENT_SIZE_BYTES environment variables respectively. These default to 5242880 (5 MiB) and 5000000 (5 MB), matching the previous hardcoded values.
    • Fixes an issue where Anthropic LLM spans were dropped when streaming responses from Anthropic beta API features with tool use, such as tool_search_tool_regex.
    • Fixes an issue where streamed Anthropic responses with generic content block types were not captured in output messages.
    • Fixes an issue where streamed Anthropic responses reported input_tokens from the initial message_start chunk instead of the final message_delta chunk, which contains the accurate cumulative input token count.
    • Fixes an issue where tool_result message content in Anthropic responses was not captured.
    • Fixes an issue where tool calls and function call outputs passed as OpenAI SDK objects (e.g. ResponseFunctionToolCall) in the input list of the OpenAI Responses API were silently dropped from LLM Observability traces. Previously, the input parser used dict-only access patterns that failed for SDK objects; it now uses attribute-safe access that handles both plain dicts and SDK objects.
    • Defaults model_provider to "unknown" when a custom base URL is configured that does not match a recognized provider in the OpenAI, Anthropic, and LiteLLM integrations.
  • profiling
    • Fixes an issue where enabling the profiler with gevent workers caused gunicorn to skip graceful shutdown, killing in-flight requests immediately on SIGTERM instead of honoring --graceful-timeout. #16424
    • Fix potential reentrant crashes in the memory profiler by avoiding object allocations and frees during stack unwinding inside the allocator hook. #16661
    • the Profiler now correctly flushes profiles at most once per upload interval.
    • Fixes an AttributeError crash that occurred when the lock profiler or stack profiler encountered _DummyThread instances. _DummyThread lacks the _native_id attribute, so accessing native_id raises AttributeError. The profiler now falls back to using the thread identifier when native_id is unavailable.
    • Lock acquire samples are now only recorded if the acquire call was successful.
    • A rare crash which could happen on Python 3.11+ was fixed.
    • the memory profiler now uses the weighted allocation size in heap live size, fixing a bug where the reported heap live size was much lower than it really was.
    • A crash that could happen on Python < 3.11 when profiling asynchronous code was fixed.
    • A rare crash that could happen when profiling asyncio code has been fixed.
    • Fixes two bugs in gevent task attribution. gevent.wait called with the objects keyword argument (e.g. gevent.wait(objects=[g1, g2])) now correctly links the greenlets to their parent task. Additionally, greenlets joined via gevent.joinall or gevent.wait from a user-level greenlet are now attributed to that greenlet instead of always being attributed to the Hub.
    • Fixes an issue where setting an unlimited stack size (ulimit -s unlimited) on Linux caused the stack profiler sampling thread to fail to start, resulting in empty CPU and wall-time profiles. #17132
    • A KeyError that could occur when using gevent.Timeout has been fixed.
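The _DummyThread fix above describes falling back to the thread identifier when native_id is unavailable. A minimal sketch of that fallback follows. `thread_native_id` and `FakeDummyThread` are hypothetical names; the fake class only mimics an object whose native_id access raises AttributeError.

```python
import threading
from typing import Any, Optional


def thread_native_id(thread: Any) -> Optional[int]:
    """Return native_id when available, else fall back to the thread identifier."""
    try:
        return thread.native_id
    except AttributeError:
        return thread.ident


class FakeDummyThread:
    """Test double: native_id reads an attribute that was never set."""

    ident = 1234

    @property
    def native_id(self) -> int:
        return self._native_id  # raises AttributeError, as described in the note


fallback_id = thread_native_id(FakeDummyThread())
real_id = thread_native_id(threading.current_thread())
```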
  • Fix for potential crashes at process shutdown due to incorrect detection of the VM finalization state when stopping periodic worker threads.
  • dynamic instrumentation
    • fixed an issue that prevented Dynamic Instrumentation from being re-enabled once disabled via the UI while being originally enabled via environment variable.
  • flask
    • The Flask integration now properly captures the template parameter value for all Flask versions.
  • internal
    • A bug preventing certain periodic threads of ddtrace (like the profile uploader) from triggering in fork-heavy applications has been fixed.

Estimated end-of-life date, accurate to within three months: 06-2027. See the support level definitions for more information.

Bug Fixes
  • Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.
  • CI Visibility: This fix resolves an issue where pytest-xdist worker crashes (os._exit, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1.