-
Added support for all public registries to the K8s SSI gradual rollout feature.
- The default list of Datadog registries is now:
- gcr.io/datadoghq
- docker.io/datadog
- public.ecr.aws/datadog
- datadoghq.azurecr.io
- us-docker.pkg.dev/datadoghq/gcr.io
- europe-docker.pkg.dev/datadoghq/eu.gcr.io
- asia-docker.pkg.dev/datadoghq/asia.gcr.io
- registry.datad0g.com
- registry.datadoghq.com
-
Sends status updates for kubernetes actions through the EVP pipeline.
-
Add datadog-apm-library-nginx to the fleet installer so it is installed alongside the other APM libraries when APM instrumentation is enabled.
-
The cluster agent readiness probe now includes the admission controller webhook server. Newly started cluster agents will not be marked as ready until the webhook can serve requests, preventing missed pod mutations during rollouts.
-
Added new additional_metric_tags field to APM metrics payload to allow tracers to send customer configured span derived primary tags.
-
APM: Fetch Org Propagation Marker on startup to Org Propagation Guard. The trace-agent now fetches /api/v2/validate at startup to derive an Org Propagation Marker (OPM) and exposes it in the /info endpoint.
-
Agents are now built with Go 1.25.10.
-
Agents are now built with Go 1.25.9.
-
Bump rshell to v0.0.10 for the Private Action Runner. Shell commands now follow symlinks that cross between allowed roots and resolve host-mounted paths correctly in containerized deployments.
-
Bump rshell to v0.0.14.
-
Added internal telemetry counters to measure the impact of enabling auto_multi_line_detection by default. The counters track how many log lines would be combined and how many would risk truncation, without changing any log processing behavior.
-
system-probe: The discovery module (discovery.enabled) and system-probe-lite (discovery.use_system_probe_lite) are now enabled by default on Linux. When discovery is the only enabled system-probe module, system-probe-lite is automatically used to minimize resource usage. To disable discovery, set discovery.enabled: false in system-probe.yaml.
-
Add ECS Fargate task ARN to X-Datadog-Additional-Tags header on data-streams-message HTTP requests.
-
Dynamic Instrumentation: Add support for conditional probes via the when clause. Probes can now include equality conditions that compare captured variables against literal values (integers, floats, booleans, strings, and null). When a condition evaluates to false, the probe event is suppressed, reducing overhead for high-traffic instrumentation points.
-
Dynamic Instrumentation: Add support for probing Go generic functions. Snapshots and log probes now display concrete types for generic parameters.
-
Enables network monitoring for devices with infrastructure_mode: end_user_device.
-
When using RDS Aurora Autodiscovery, tags present on the cluster are now inherited by the instances. For example, if a cluster has the tag datadoghq.com/dbm: true<span class="title-ref">, all instances in that cluster will have </span><span class="title-ref">extra_dbm_enabled: true</span>`. Tags on the instances will override tags on the cluster.
-
Add SandboxId field to the workloadmeta structure. Update collectors (crio and containerd) accordingly.
-
The kubelet core check now reports container kubernetes.containers.cpu.requests, kubernetes.containers.cpu.limits, kubernetes.containers.memory.requests, and kubernetes.containers.memory.limits metrics using the live values from pod.status.containerStatuses[].resources when available, so the metrics reflect the effective runtime values after an in-place vertical resize. Resources declared only in the pod spec (for example GPUs or custom resources) are preserved, and clusters where the kubelet does not yet populate status.resources continue to report the spec values as before.
-
The logs agent now retries log payloads on HTTP 403 (Forbidden) responses instead of dropping them, when the endpoint's API key was resolved from a secrets backend. On 403, the agent triggers an asynchronous secrets refresh and retries the payload. This applies to the core logs agent, CWS security reporter, compliance reporter, and the event platform forwarder. Endpoints whose API key is not managed by the secrets backend retain the original drop behavior.
-
Hide DMG mount in MacOS agent installation process.
-
Send device metadata for devices monitored by Network Configuration Management.
-
NPM connection payloads now include a process_name:<name> tag identifying the process executable that owns each connection. The tag is populated from the process agent's process list and requires process_config.process_collection.enabled to be set to true.
-
Switch config implementation to an improved version by default. Can be disabled with the env var DD_CONF_NODETREEMODEL=viper, or the config setting conf_nodetreemodel: viper in datadog.yaml.
-
The OTel Agent now supports a standalone mode (DD_OTEL_STANDALONE=true) that runs without a co-resident core Datadog Agent. In standalone mode a new dogtelextension OpenTelemetry Collector extension provides Datadog Agent functionality directly.
-
OTLP ingest configuration keys now register explicit default values matching the upstream OpenTelemetry Collector defaults. Previously these keys were bound without defaults, which caused agent config and similar introspection commands to omit them. Runtime behavior is unchanged: only user-configured values are forwarded to the OTel Collector pipeline, so unconfigured settings continue to use the Collector's own built-in defaults.
Notable default changes in pkg/config/config_template.yaml:
-
Added Translate, TranslateK8sObjects, and NewManifestCache to otlp/logs so exporters can share log translation and manifest deduplication logic without duplicating code.
-
Add private_action_runner.api_key_only_enrollment configuration flag to explicitly control Private Action Runner enrollment mode. When set to true, enrollment uses the API key only (no app key required, no auto-connections created). When false (default), the app key is required and connections are auto-created during enrollment.
-
The private action runner binary now has the CAP_NET_RAW capability.
-
The Private Action Runner default enabled actions now include runNetworkPath and runCommand.
-
The Private Action Runner now includes default enabled actions that are automatically allowed. To opt out, set private_action_runner.default_actions_enabled to false in datadog.yaml. This still requires explicit opt-in into the Private Action Runner feature.
-
Make app key optional during installation to prepare for app-key-less PAR enrollment.
-
Add private_action_runner.skip_connection_creation configuration flag to control auto-connection creation during Private Action Runner enrollment. When set to true, the runner skips creating connections during app-key enrollment. Defaults to false, which preserves the existing behavior of auto-creating connections.
-
Retry transactions on API key errors (HTTP 403 responses) when API key refresh is enabled via secrets management in the Agent configuration.
-
Bumped the Security Agent policies to v0.79.0
-
NDM: SNMP default scan is now enabled by default. Discovered SNMP devices will be automatically scanned to collect OID data. To disable, set network_devices.default_scan.enabled to false.
-
Upgrade OpenTelemetry Collector dependencies from v0.147.0 to v0.150.0 (core v1.53.0 to v1.56.0).
Notable upstream changes:
- The
exporter.datadogexporter.DisableAllMetricRemapping feature gate has been promoted to beta (enabled by default). Metric remappings are now handled by the Datadog backend. If you experience issues, disable the gate with --feature-gates=-exporter.datadogexporter.DisableAllMetricRemapping and contact Datadog support.
- Semantic conventions updated from v1.38.0 to v1.40.0.
- The
datadogextension now supports gateway_service and gateway_destination config fields for Fleet Automation gateway topology view.
- Fix for use-after-free bug in quantile sketches when exporting ExponentialHistogram metrics with multiple attribute sets.
- OTTL context setters (used by
transform, filter, and tailsampling processors) now validate value types and return errors on type mismatches instead of silently ignoring them. Users with error_mode: propagate (the default for the transform processor) may see new errors if their OTTL statements had pre-existing type mismatches. Switch to error_mode: ignore to preserve the previous behavior while fixing the statements.
See the full upstream changelogs: collector-contrib v0.150.0, collector core v0.150.0.
-
Add environment variable overrides to selectively keep infrastructure checks enabled in Windows containers. By default, the disk, network, winproc, file_handle, and io checks are still removed at startup for backward compatibility. Set DD_WINDOWS_HOST_METRICS=true to keep all infra checks, or use per-check variables (e.g. DD_WINDOWS_ENABLE_DISK_CHECK=true, DD_WINDOWS_ENABLE_IO_CHECK=true) to enable individual checks.
-
The api_server.request_duration_seconds internal metric now tags requests with the gorilla/mux route template (e.g. /{component}/status) instead of the raw request path. This prevents arbitrary user-provided path values from creating high-cardinality metric tags. Requests that do not match any registered route are tagged with unknown.
-
Adds a new tag 'is_physical_storage' to every 'system.disk.*' metric if 'tag_by_physical_storage' configuration option (defaults to false) is enabled. Emits a new set of metrics: 'system.disk.physical_total','system.disk.physical_used', 'system.disk.physical_free', 'system.disk.physical_utilized', and 'system.disk.physical_in_use' if 'collect_physical_metrics' configuration option (defaults to false) is enabled. Requires the Go disk check v2 (disk_check.use_core_loader: true). Linux only.
-
Fix span stats and priority sampling for Cloud Run job tasks by properly waiting for the trace agent shutdown sequence to complete, ensuring in-flight traces are flushed before the serverless function exits.
-
APM : Fix missing tracer language in stats aggregation key when the V1 stats path is enabled. This issue only affects users with the V1 feature flag enabled or using the 'convert-traces' flag.
-
APM: Fixed unnecessary CPU load on the core Agent in non-containerized environments by skipping container ID resolution (header parsing and cgroup lookups) in the trace API when not running in a container.
-
Dynamic Instrumentation: Fix a bug where evaluationErrors were reported in the wrong location in snapshot payloads, causing them to not appear properly in the UI.
-
Fix AKS cluster name parsing from kubernetes.azure.com/cluster label.
-
Fixes a bug where autodiscovered services were not being deleted if GetAuroraClustersFromTags or GetRdsInstancesFromTags returned no matches.
-
SNMP: Fix bandwidth usage rate metrics (snmp.ifBandwidthInUsage.rate and snmp.ifBandwidthOutUsage.rate) not being emitted when there are intermittent check failures.
-
Fix a concurrent map write crash in the config package when multiple goroutines call config getters with unknown keys simultaneously. This could cause the agent to crash with fatal error: concurrent map writes when Docker log collection with container_collect_all is enabled.
-
Fix a deadlock that could make the Agent become unresponsive after a remote configuration value was cleared.
-
Fixes a caching bug in dbm rds instance and aurora cluster autodiscovery. When service metatadata changed (DbName for example) the service check would not be updated with the new metadata if the service was already in the cache. Now the cached service is deleted and the updated service is added as a new check.
-
Fix a regression introduced in Agent 7.76 where anchored log_processing_rules (using ^ and $) stopped matching log lines. This was caused by the new default auto-multiline detection tagging path not trimming trailing whitespace from log content before forwarding it to processing rules.
-
Fixed a panic in the system-probe container store caused by gopsutil parsing malformed /proc/[pid]/stat files during process termination race conditions.
-
Fix agent status failing when the HA Agent feature is enabled. The status templates attempted to iterate over a struct with range, which is not supported by Go templates. The HA Agent Metadata section now renders correctly.
-
Fix IPv6 address formatting when constructing the Cluster Agent endpoint URL from Kubernetes service environment variables. IPv6 addresses are now properly wrapped in brackets (e.g. https://[fd38:552b:2959::4f4a]:5005 instead of https://fd38:552b:2959::4f4a:5005), which previously caused the remote tagger and other gRPC clients to fail with "too many colons in address" errors on IPv6-only clusters.
-
Fixed Oracle Data Guard metrics query that caused ORA-01873 (interval precision overflow).
-
Fix spurious warn log on otel-agent startup about conflicting dd_url and logs_no_ssl settings.
-
DD_PROXY_HTTP, DD_PROXY_HTTPS, HTTP_PROXY, HTTPS_PROXY, DD_PROXY_NO_PROXY, and NO_PROXY environment variables are now respected by the standalone OTel agent without requiring --core-config.
-
NTP: renames ntp.offset with the tag source:intake to ntp.intake_offset and removes the source:ntp tag from ntp.offset, restoring it to its pre-7.77.0 single-series behavior. This fixes false alerts on existing monitors querying ntp.offset without a tag filter.
-
OTel logs exported via the Datadog Exporter (otel_source:datadog_exporter) now correctly populate otel.event_name from the OTLP event_name field, and fall back to observed_time_unix_nano for the timestamp when time_unix_nano is unset (per the OTLP spec). Previously, both fields were missing for this ingestion path, causing OTel RUM events to be dropped or timestamped at the Unix epoch.
-
Fixed a bug (only present when deduplication is enabled) where SNMP devices loaded from the cache on agent restart were not registered immediately, causing them to be temporarily unavailable until the next discovery cycle completed. Cached devices are now registered right away and tracked for deduplication so that subsequent scans for the same physical device are correctly deduplicated.
-
Fixed an issue in SNMP autodiscovery where the IP processing counter was not reset immediately after processing, potentially delaying or preventing device registration when deduplication was enabled.
-
Windows: Fixed a remote update failure in datadog-installer when validating Agent domain accounts.
When querying some domain account names, NetQueryServiceAccount can return NTSTATUS 0xC0000106 (STATUS_NAME_TOO_LONG) during gMSA detection. This status is now treated like STATUS_INVALID_ACCOUNT_NAME so the account is handled as a regular domain account instead of incorrectly failing the update.