Released on: 2026-04-15
APM OTLP: Changed attribute precedence behavior when looking up OpenTelemetry semantic convention attributes that have multiple equivalent keys (e.g., http.status_code vs http.response.status_code, deployment.environment vs deployment.environment.name).
Previous behavior: When both old and new semantic convention keys existed, the lookup would check ALL keys in span attributes before checking ANY key in resource attributes. So whichever key appeared in span attributes would win, regardless of which key was in resource attributes.
New behavior: The lookup now uses a per-concept precedence order. For each semantic concept, the registry defines an ordered list of attribute keys; the first key that has a value is returned. The precedence order (which key takes priority) depends on the concept and may prefer either the newer or the older convention key. Span vs resource precedence (which map is checked first) is unchanged and still depends on the function.
Who is affected: This change only affects users who have the same concept represented by different convention-version keys in span vs resource attributes. The returned value may now come from a different key than before, according to the concept's precedence order.
This is an uncommon configuration since most instrumentation libraries use consistent semantic convention versions across span and resource attributes.
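The difference between the two lookups can be sketched as follows. This is a minimal illustration of the behavior as described in this note, not the Agent's code: the registry contents, the key order, and the function names are all hypothetical, and the way key order and map order combine is one reading of the description.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical registry entry: the key order here is illustrative only;
# the real precedence depends on the concept.
DEPLOYMENT_ENV_KEYS = ["deployment.environment.name", "deployment.environment"]


def _has_value(attrs: Mapping[str, str], key: str) -> bool:
    value = attrs.get(key)
    return value is not None and value != ""


def lookup_previous(keys: Sequence[str], maps: Sequence[Mapping[str, str]]) -> Optional[str]:
    """Previous behavior: try every key in one map before moving to the next map."""
    for attrs in maps:
        for key in keys:
            if _has_value(attrs, key):
                return attrs[key]
    return None


def lookup_new(keys: Sequence[str], maps: Sequence[Mapping[str, str]]) -> Optional[str]:
    """New behavior (as read from this note): walk keys in precedence order,
    checking the maps in the caller's usual order for each key."""
    for key in keys:
        for attrs in maps:
            if _has_value(attrs, key):
                return attrs[key]
    return None


# Same concept, different convention versions in span vs resource attributes.
span_attrs = {"deployment.environment": "staging"}
resource_attrs = {"deployment.environment.name": "prod"}
maps = [span_attrs, resource_attrs]  # span checked first in this example

previous_value = lookup_previous(DEPLOYMENT_ENV_KEYS, maps)
new_value = lookup_new(DEPLOYMENT_ENV_KEYS, maps)
```

With this hypothetical key order, the previous lookup returns the span value and the new lookup returns the resource value, which is the only situation in which the result changes.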
Allows the Agent to get an API key in exchange for an AWS cloud authorization proof. This allows you to use your AWS credentials against Datadog and removes the need for you to manage an API key. More details can be found here: https://docs.datadoghq.com/account_management/cloud_provider_authentication/
The autoscaling vertical controller now supports in-place vertical pod resizing.
Add a new configuration provider, which schedules new instances of KSM checks to generate metrics from CustomResourceDefinitions.
This new provider works with the kube_crd listener which listens for CustomResourceDefinitions created on the cluster and triggers a new autodiscovery-service for each one.
This new configuration provider must use the standard Kubernetes GroupVersionKind format in its AdvancedADIdentifier section to apply to a matching CustomResourceDefinition.
The rest of the configuration is a standard KSM configuration instance.
CNM - Add 7 per-connection TCP congestion signals: rto_count (RTO loss events), recovery_count (fast recovery events), reord_seen (send-side reordering), rcv_ooopack (receive-side out-of-order packets), delivered_ce (ECN CE-marked segments), ecn_negotiated (ECN negotiation status), and probe0_count (zero-window probes). Collected via eBPF on CO-RE and runtime-compiled tracers, Linux only.
dd-procmgrd can now read process definitions and manage child process lifecycles with graceful shutdown.
dd-procmgrd now supervises managed processes with configurable restart policies, exponential backoff, and burst limiting.
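As a rough illustration of how exponential backoff and burst limiting can combine in a restart policy, here is a minimal sketch. It is not dd-procmgrd's implementation (which is written in Rust); the class name, parameter names, and default values are all hypothetical.

```python
from collections import deque
from typing import Deque, Optional


class RestartTracker:
    """Decides how long to wait before the next restart, or whether to give up."""

    def __init__(self, base_delay: float = 1.0, max_delay: float = 60.0,
                 burst_limit: int = 5, burst_window: float = 300.0) -> None:
        self.base_delay = base_delay      # delay before the first restart (seconds)
        self.max_delay = max_delay        # upper bound for the backoff delay
        self.burst_limit = burst_limit    # max restarts allowed inside the window
        self.burst_window = burst_window  # window length (seconds)
        self._restarts: Deque[float] = deque()
        self._failures = 0

    def next_delay(self, now: float) -> Optional[float]:
        """Return the delay before restarting, or None if the burst limit is hit."""
        # Forget restarts that fell out of the burst window.
        while self._restarts and now - self._restarts[0] > self.burst_window:
            self._restarts.popleft()
        if len(self._restarts) >= self.burst_limit:
            return None
        # Double the delay after each consecutive failure, up to max_delay.
        delay = min(self.base_delay * (2 ** self._failures), self.max_delay)
        self._failures += 1
        self._restarts.append(now)
        return delay

    def mark_healthy(self) -> None:
        """Reset the backoff once the process has run successfully."""
        self._failures = 0


tracker = RestartTracker(base_delay=1, max_delay=8, burst_limit=4, burst_window=100)
delays = [tracker.next_delay(now) for now in range(5)]
```

The timestamps are passed in explicitly so the policy can be exercised without sleeping or reading the clock.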
dd-procmgrd can now manage the DDOT (Datadog Distribution of OpenTelemetry) collector process via a dual-mode mechanism. When a processes.d/datadog-agent-ddot.yaml config is present, dd-procmgrd takes over DDOT lifecycle management; otherwise the existing systemd unit manages it directly.
Automatic SBOM generation for running containers via system-probe
Runtime usage tracking - identifies which files and packages are actively accessed by running processes
Security enrichment - flags SUID binaries and processes running as root
gRPC streaming from system-probe to core agent for efficient SBOM forwarding
Automatic CWS policy generation based on running container SBOMs.
On Windows, the APM SSI installer now automatically enables system-probe to report injection telemetry from the ddinjector driver.
Kubernetes pod check annotations: Invalid JSON in pod check annotations (ad.datadoghq.com/<container>.checks) now produces a clear error message in the "Configuration Errors" section of agent status. A new CLI command agent validate-pod-annotation validates annotation JSON from a file or stdin and exits with an error on invalid syntax, so you can catch mistakes before applying annotations to pods.
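For a quick local pre-check outside the Agent, a syntax-only validation can be sketched in a few lines. This is an independent illustration, not the implementation of agent validate-pod-annotation; the function name and the sample annotation values are hypothetical.

```python
import json
from typing import Optional


def validate_checks_annotation(raw: str) -> Optional[str]:
    """Return None when raw parses as JSON, otherwise a readable error message.

    Only JSON syntax is checked; the check schema itself is not validated.
    """
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        return f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# Illustrative annotation values (hypothetical check configuration).
good = '{"http_check": {"instances": [{"url": "http://%%host%%"}]}}'
bad = '{"http_check": {"instances": [{"url": }]}}'

good_result = validate_checks_annotation(good)
bad_result = validate_checks_annotation(bad)
```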
source and provider fields to rtloader API and add integration_security configuration properties.
X-Vault-AWS-IAM-Server-ID header for HashiCorp Vault AWS authentication method. Helps to prevent different types of replay attacks.
azure_session block (azure_tenant_id, azure_client_id, azure_client_secret or azure_client_certificate_path).
1.25.8.
after/before config fields with topological sort and reverse shutdown order.
lsblk when blkid fails or returns no labels for disk label tagging. This ensures label and device_label tags are present on disk metrics even when the agent runs as a non-root user, since lsblk reads from sysfs and does not require elevated privileges.
X-Datadog-Additional-Tags header with hostname and agent version to data-streams-message HTTP requests.
kafka_actions check now automatically inherits Schema Registry configuration (URL, credentials, TLS, OAuth) from the kafka_consumer integration, enabling schema registry support without additional configuration.
deployment_type on the Datadog extension to daemonset by default, or gateway when Gateway mode is enabled.
podman_db_path configuration option now accepts a comma-separated list of paths to support monitoring containers from multiple users simultaneously (e.g. root and rootless users). Example: podman_db_path: "/var/lib/containers/storage/db.sql,/home/myuser/.local/share/containers/storage/db.sql". When podman_db_path is not set, the Agent automatically discovers Podman databases for the root user and for all users under /home/. Log collection (logs_config.use_podman_logs) is also updated to work correctly with both explicit multi-path configuration and auto-discovery.
ddot-collector and agent -full images are now published.
system-probe-lite) now wraps system-probe, acting as a loader for it. system-probe-lite will automatically fall back to system-probe when one of the following is true:
discovery.useSystemProbeLite is set to false (the default).
system-probe is enabled.
APM: Fix an issue where SQL stats group resources longer than 5000 characters were truncated before obfuscation, causing the trace-agent to fail to parse mid-token fragments and log an error instead of correctly obfuscating the query.
Use atomic file replacement (write to temp file then rename) when writing APM workload selection policy files, preventing concurrent readers from seeing partially-written data.
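The write-then-rename technique can be sketched as below. This is a generic illustration of the pattern, not the Agent's code; the function name and file names are hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to path so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem) so that
    # the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk before renaming
        os.replace(tmp_path, path)  # atomically swaps the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A reader that opens the target path at any moment gets a complete file, because the partially written data only ever exists under the temporary name.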
Fixed a race condition in the logs auditor where Flush() could write a stale registry to disk during a transport restart. The auditor now drains all pending payloads from its input channel before flushing, ensuring file offsets are up to date and reducing duplicate log processing after a TCP-to-HTTP transport switch.
[DBM] Bump go-sqllexer to v0.2.1 to fix the following bugs:
SELECT * FROM t1, t2).
The diagnose command now returns an error if an API key is not configured.
Fixes a panic when advanced dispatching is disabled and KSM Core is run as a cluster check.
Fix support of Kafka actions for configurations where kafka_connect_str is a list.
Fixed a bug in the disk Go check (diskv2) where partition enumeration could hang indefinitely on Windows when an orphaned or offline volume is present on the system. The check now applies the configured timeout (default 5s) to partition discovery and guards against spawning duplicate goroutines on subsequent check runs, preventing permanent worker starvation, goroutine buildup, and high CPU utilization.
The process check now reports the correct container host type on ECS Managed Instances when the agent runs as a daemon.
Fixed kafka actions failing to match the local kafka_consumer integration when the bootstrap_servers tag exceeds the 200-character backend tag limit. Long broker lists (e.g. 3+ MSK brokers) are now truncated to match the backend's tag normalization.
APM: Fix base_service tag being missed on a subset of APM stats matching span.kind=server.
Fix kube_distribution tag value detection logic by analyzing node system info first.
Fixed a memory leak in the kubernetes_state_core check caused by orphaned reflector goroutines in the KSM store during rebuilds. This led to unbounded memory growth and potential OOM kills.
The Go network v2 check now correctly monitors the host network namespace when running in a container, similar to the Python version's behavior.
Fixes system.net.* metrics when the Agent runs in Docker with the host's procfs mounted (for example /host/proc with host PID namespace). The Go network check (network v2) now reads /proc/1/net/dev under that mount so interface stats match the host; previously /proc/net/dev could resolve in the container network namespace and report wrong or missing traffic (regression in Agent 7.73+).
Fixed a race condition in the workloadmeta process collector where a containerized process could be permanently stuck with an empty container ID if it was collected before the container runtime reported the PID-to-container mapping.
Fixed a bug in the kubeapiserver check where the eventText length was reported as 0 when it did not fit in the event bundle.
The API server now logs errors from srv.Serve that were previously silently discarded.
When a multiline log processing rule has a pattern that never matches, the logs agent now sends lines individually instead of joining all lines into a single oversized message. Normal multiline aggregation begins once the pattern matches for the first time.
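The described behavior can be sketched as a small generator. This is an illustration of the rule as stated in this note, not the logs agent's implementation; the function name and the sample pattern are hypothetical.

```python
import re
from typing import Iterable, Iterator, List, Pattern


def aggregate_lines(lines: Iterable[str], start_pattern: Pattern[str]) -> Iterator[str]:
    """Group lines into messages that begin where start_pattern matches.

    Until the pattern has matched at least once, every line is emitted
    on its own instead of being accumulated into one large message.
    """
    buffer: List[str] = []
    matched_once = False
    for line in lines:
        if start_pattern.search(line):
            if buffer:
                yield "\n".join(buffer)  # flush the previous message
            buffer = [line]
            matched_once = True
        elif matched_once:
            buffer.append(line)          # continuation of the current message
        else:
            yield line                   # pattern never matched so far
    if buffer:
        yield "\n".join(buffer)


# Hypothetical rule: a new message starts with an ISO date.
pattern = re.compile(r"^\d{4}-\d{2}-\d{2}")
never_matches = list(aggregate_lines(["a", "b", "c"], pattern))
mixed = list(aggregate_lines(
    ["a", "2026-01-01 error", "  at frame 1", "2026-01-02 ok"], pattern))
```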
Fixed the network check (v2) ignoring the combine_connection_states configuration option. When set to false, the check now emits granular per-state TCP metrics (e.g. system.net.tcp4.close_wait, system.net.tcp4.syn_sent) instead of only the combined ones (e.g. system.net.tcp4.closing, system.net.tcp4.opening), restoring parity with the previous Python-based network check.
Fixes a bug in the Network Configuration Management (NCM) module where the SSH Timeout settings were parsed as nanoseconds instead of seconds. This issue caused SSH sessions to time out prematurely, leading to errors like:
Error running check: failed to connect to 192.168.0.1:22: dial tcp 192.168.0.1:22: i/o timeout
Fixed the Datadog Agent installer on Windows: when DD_PRIVATE_ACTION_RUNNER_ENABLED=true is set without an explicit DD_PRIVATE_ACTION_RUNNER_ACTIONS_ALLOWLIST, the Private Action Runner now defaults to com.datadoghq.script.runPredefinedPowershellScript on Windows and com.datadoghq.script.runPredefinedScript on Linux/macOS.
Preserve odbc.ini and odbcinst.ini across Fleet Automation upgrades on Linux.
Add missing node name to the manifests for Kubernetes resources in the OTEL logs agent exporter.
With systemd, the system-probe service now checks environment variables for configuration even if system-probe.yaml does not exist.
Fixed an issue on Windows where Cloud Network Monitoring reported TCP failure rates greater than 100%. The Windows kernel driver can report a TCP failure (reset, timeout, or refused connection) without also setting the flow-closed flag. The agent now correctly marks any connection with a TCP failure as closed.
Fixed discovery of Windows processes to identify reused PIDs between process snapshots and correctly track these processes.
agent status output and process-agent endpoint list now display only the last 4 characters of the API key (previously 5), aligning with the Datadog UI.
Released on: 2026-04-15 Pinned to datadog-agent v7.78.0: CHANGELOG.
agent status output under the Admission Controller section. The probe is disabled by default and can be enabled by setting admission_controller.probe.enabled to true. The probe uses dry-run ConfigMap creation requests in the cluster agent's namespace.
datadog-cluster-agent status output and flares. This displays whether RC is enabled for the organization, whether the API key is authorized for Remote Configuration, and any last errors, matching the node agent's existing behavior.
Released on: 2026-04-08
Released on: 2026-04-08 Pinned to datadog-agent v7.77.3: CHANGELOG.
Released on: 2026-04-01
Released on: 2026-04-01 Pinned to datadog-agent v7.77.2: CHANGELOG.
Released on: 2026-03-24
1.25.8.
Released on: 2026-03-24 Pinned to datadog-agent v7.77.1: CHANGELOG.
Released on: 2026-03-18
APM OTLP: The datadog.* namespaced span attributes are no longer used to construct Datadog span fields. Previously, attributes like datadog.service, datadog.env, and datadog.container_id were used to directly set corresponding Datadog span fields. This functionality has been removed and the Agent now relies solely on standard OpenTelemetry semantic conventions.
Exceptions:
datadog.host.name attribute continues to be respected for hostname resolution as documented at https://docs.datadoghq.com/opentelemetry/mapping/hostname/.
datadog.container.tag.* attributes continue to be supported for custom container tags.
The configuration option otlp_config.traces.ignore_missing_datadog_fields (and corresponding environment variable DD_OTLP_CONFIG_IGNORE_MISSING_DATADOG_FIELDS) is deprecated and no longer has any effect. The Agent now always uses standard OTel semantic conventions.
Migration: If you were using datadog.* attributes, switch to the standard OpenTelemetry semantic conventions:
datadog.service → service.name
datadog.env → deployment.environment.name (OTel 1.27+) or deployment.environment
datadog.version → service.version
datadog.container_id → container.id
Who is affected: Users who explicitly set datadog.* attributes (other than datadog.host.name and datadog.container.tag.*) in their OpenTelemetry instrumentation to override default field mappings. Users relying solely on standard OpenTelemetry semantic conventions are not affected.
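If the datadog.* attributes are set in one place in your instrumentation, the migration mapping above can be applied mechanically. The following is a minimal sketch under that assumption: the helper name is hypothetical, it targets deployment.environment.name (the OTel 1.27+ key), and it lets an already-present standard key win over the renamed one.

```python
from typing import Dict

# Mapping taken from the migration list above.
RENAMES = {
    "datadog.service": "service.name",
    "datadog.env": "deployment.environment.name",
    "datadog.version": "service.version",
    "datadog.container_id": "container.id",
}


def migrate_attributes(attrs: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of attrs with deprecated datadog.* keys renamed."""
    # First keep everything that is not being renamed. This includes the
    # exceptions datadog.host.name and datadog.container.tag.*.
    migrated = {k: v for k, v in attrs.items() if k not in RENAMES}
    # Then add renamed keys, without overwriting a standard key that is
    # already present (a design choice of this sketch).
    for old_key, new_key in RENAMES.items():
        if old_key in attrs and new_key not in migrated:
            migrated[new_key] = attrs[old_key]
    return migrated


result = migrate_attributes({
    "datadog.service": "web",
    "datadog.env": "prod",
    "datadog.version": "9.9",
    "service.version": "1.2",
    "datadog.host.name": "h1",
})
```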
dd-procmgrd, a minimal Rust daemon for the Datadog process manager. The daemon starts, logs, and waits for a shutdown signal. It does not provide user-facing functionality.
logs_config.pipeline_failover.enabled: true (default: false). When all pipelines are blocked, backpressure is applied to prevent data loss.
collect_memory_pressure: true in the memory check configuration. New metrics: system.mem.allocstall (with zone tag), system.mem.pgscan_direct, system.mem.pgsteal_direct, system.mem.pgscan_kswapd, system.mem.pgsteal_kswapd.
apm_config.span_derived_primary_tags that will be extracted from span tags and used as additional aggregation dimensions for APM statistics.
ad.datadoghq.com/tags and ad.datadoghq.com/<container>.tags Kubernetes pod annotations. Template variables are resolved at runtime, enabling dynamic tagging based on pod and container metadata. This allows centralized tag configuration that applies to all checks, logs, and traces without hardcoding pod-specific values.
private_action_runner.enabled is set in datadog.yaml.
datadog-agent-action Windows service. The service is installed as demand-start with a dependency on the main Agent service, and its credentials and ACLs are managed alongside the other Agent services during install, upgrade, and repair.
runPredefinedPowershellScript action to the Private Action Runner on Windows. This action allows running predefined PowerShell scripts (inline or file-based) with optional parameter templating, JSON schema parameter validation, environment variable allowlisting, configurable timeouts, and a 10 MB output limit.
The Agent's embedded Python has been upgraded from 3.13.11 to 3.13.12.
Add ntp.offset metric with source:intake tag to monitor clock drift using Datadog intake server timestamps. Original ntp.offset metric calculated from an NTP server is now tagged source:ntp.
As of Kubernetes version 1.33, the Endpoint API object has been deprecated in favor of EndpointSlice. Autodiscovery now supports the use of an EndpointSlice listener and provider to collect endpoint checks. To enable this feature, set kubernetes_use_endpoint_slices to true in your Datadog Agent configuration.
Add bucket label to image_resolution_attempts telemetry to track gradual rollout progress.
Added a private action runner bundle that exposes the Network Path traceroute functionality through the getNetworkPath action.
Sends telemetry for synthetics tests run on the agent, including checks received, checks processed, and error counts for test configuration, traceroute, and event platform result submission.
Added support for two new configurations for tag-based gradual rollout in Kubernetes SSI deployments. The gradual rollout can be configured using the following parameters:
DD_ADMISSION_CONTROLLER_AUTO_INSTRUMENTATION_GRADUAL_ROLLOUT_ENABLED: Whether to enable gradual rollout (default: true)
DD_ADMISSION_CONTROLLER_AUTO_INSTRUMENTATION_GRADUAL_ROLLOUT_CACHE_TTL: The cache TTL duration for the gradual rollout image cache (default: 1h)
Agent metrics now include a connection_type tag with a value of tcp, uds, or pipe for lib-to-agent communications.
Automatically collect the team tag when a Kubernetes resource has a team label or annotation and explicit team tag extraction is not configured.
Enables the agent to support built-in credentials like IRSA for AWS cloud environments.
Bump go-sqllexer to v0.1.13, improving SQL obfuscation performance and fixing incorrect tokenization of multi-byte UTF-8 characters (e.g., CJK characters, full-width punctuation).
Agents are now built with Go 1.25.7.
NDM: Cisco SD-WAN interface metadata now includes the is_physical field to distinguish physical from virtual interfaces (loopback, tunnel). cEdge interfaces also include the type field with the IANA interface type number.
In the Cluster Autoscaling controller, use Kubernetes client update instead of patch.
On ECS Managed Instances, detect hostname from IMDS when the agent runs in daemon mode.
On ECS Managed Instances with daemon scheduling, the agent uses ECS_CONTAINER_METADATA_URI_V4 environment variable as a fallback signal for v4 availability.
Expose a new metric kube_apiserver.api_resource that holds the name, kind, group, and version of all known cluster-wide (non namespaced) resources on the cluster.
Add new DDOT feature gate 'exporter.datadogexporter.DisableAllMetricRemapping' to disable all client-side metric remapping.
Increases the reliability of namespaceLabelsAsTags and namespaceAnnotationsAsTags for new pods by caching the last seen namespace metadata.
Added a new, optional configuration setting for journald logs: default_application_name. If set to a non-empty string, the value replaces "docker" as the default application name for container-based journald logs. If set to an empty string, the application name is determined by the systemd journal fields, as for all non-container journald logs.
Simplified location permission detection on macOS by removing the initial polling-based detection at app startup. Permission detection now happens only at the time of WLAN data collection.
Use the request_location_permission config flag in the WLAN config to gate the location permission request on macOS.
Added the enable_otlp_container_tags_v2 feature flag, which may reduce the Agent's outgoing traffic when ingesting OTLP traces from containerized applications.
However, the flag introduces some breaking changes:
@);
k8s.pod.uid attribute as a fallback container ID is no longer supported;
The datadog.yaml configuration file now includes a commented-out private_action_runner section on all platforms.
The Private Action Runner now supports Datadog's secret management features. It can now resolve secrets using the ENC[...] notation in configuration files, supporting all secret backends via secret_backend_type and secret_backend_config settings.
Private Action Runner now supports running as a Windows service via Service Control Manager (SCM).
Bumped the Security Agent policies to v0.77.0
SNMP interface metadata now includes type (IF-MIB ifType) and is_physical fields. The is_physical field is set to true for physical ethernet interface types (ethernetCsmacd, fastEther, fastEtherFX, gigabitEthernet).
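The classification rule can be sketched as a simple lookup on the IF-MIB ifType value. This is an illustration, not the Agent's code: the function name is hypothetical, and the numeric values are the IANAifType-MIB assignments for the four interface types named above.

```python
# IANAifType-MIB numbers for the interface types listed in this note.
PHYSICAL_IF_TYPES = {
    6: "ethernetCsmacd",
    62: "fastEther",
    69: "fastEtherFX",
    117: "gigabitEthernet",
}


def is_physical(if_type: int) -> bool:
    """Return True when the ifType is one of the physical ethernet types."""
    return if_type in PHYSICAL_IF_TYPES


# softwareLoopback is 24 and tunnel is 131 in the same registry.
examples = {t: is_physical(t) for t in (6, 24, 117, 131)}
```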
Add support for unconnected UDP sockets in the SNMP corecheck. Automatically fallback to unconnected UDP sockets if the connected UDP socket times out.
APM: Added a new health metric, datadog.trace_agent.receiver.payload_timeout, to track incoming trace payload timeouts caused by client connection closures or middleware timeouts.
Upgraded the Datadog Agent Windows installer from WiX 3 to WiX 5.
Reports telemetry from the Windows Injector, enabled by default. Disable this feature by setting injector.enable_telemetry=false in system-probe.yaml when running system-probe.
Add Windows version information to the Private Action Runner executable. The version info is now visible in Windows Explorer file properties.
Added a telemetry metric to track pending events in workloadmeta: "workloadmeta.pending_event_bundles".
Avoid blocking workloadmeta collectors when streaming events to remote agents.
ALTER SESSION SET CONTAINER statements are now properly quoted to prevent SQL injection.
tegrastats_path configuration option to prevent command injection. The path must be absolute and cannot contain shell metacharacters or whitespace.
agent check --flare created the checks directory with 0000 permissions, preventing check output files from being written. The directory is now created with 0750 permissions.
sendMetric when the sender or metric function cannot be resolved.
GetSender call in custom query handling in favor of the existing commit helper.
logs_config.max_message_size_bytes limit (default 900KB) were incorrectly marked as truncated. This caused the ...TRUNCATED... marker to appear in logs that fit within the size limit, and incorrectly marked the subsequent log line as a truncated remainder. Additionally, improved truncation detection by extending the FrameMatcher interface to explicitly signal when content is truncated, ensuring consistent truncation state across the framer and handler components.
private_action_runner.* snake-case names.
C:\ProgramData\Datadog\private-action-runner\powershell-script-config.yaml.
logs_config.max_message_size_bytes limit (default 900KB). The truncation was performed at the byte level without respecting 2-byte UTF-16 character boundaries, which could split a character in half and produce Unicode replacement characters (U+FFFD) after decoding. The framer now aligns the truncation limit to a 2-byte boundary for UTF-16 encodings, ensuring that truncated frames always contain valid UTF-16 data.
Released on: 2026-03-18 Pinned to datadog-agent v7.77.0: CHANGELOG.
eks.amazonaws.com API group) by default.
orchestrator_explorer.terminated_pods_improved.enabled.
Released on: 2026-03-09
Released on: 2026-03-09 Pinned to datadog-agent v7.76.3: CHANGELOG.
Released on: 2026-03-05
infra_mode tag is now correctly added to system.cpu.user on Windows when infrastructure_mode is not set to "full", matching the behavior of the Linux cpu check.
Released on: 2026-03-05 Pinned to datadog-agent v7.76.2: CHANGELOG.
Released on: 2026-02-26
ACL command.
gpu.nvlink.speed metric is emitted in Blackwell or newer devices.
Released on: 2026-02-26 Pinned to datadog-agent v7.76.1: CHANGELOG.
Released on: 2026-02-23
otelcollector.converter.features, you may need to add the datadog feature to enable Fleet Automation, as DDOT Fleet Automation metadata is no longer submitted through the ddflareextension.
Allow users to filter agent check instances using a new --instance-id parameter, which filters by the instance hash found in the agent status.
Add privateactionrunner binary in Agent artifacts to allow running actions using the Agent, and enable running it on Linux. The binary is disabled by default. To enable it, set privateactionrunner.enabled: true in your configuration file.
Integration check failures are now automatically reported to the Agent Health Platform component when enabled via health_platform.enabled: true. This provides structured health issue tracking with:
This feature helps users proactively identify and troubleshoot integration issues across their fleet.
The Agent Profiling check now supports automatic Agent termination after flare generation when memory or CPU thresholds are exceeded. This feature is useful in resource-constrained environments where the Agent needs to be restarted after generating diagnostic information.
Enable this feature by setting terminate_agent_on_threshold: true in the Agent Profiling check configuration. When enabled, the Agent uses its established shutdown mechanism to trigger graceful shutdown after successfully generating a flare, ensuring proper cleanup before exit.
Warning: This feature will cause the Agent to exit. This feature is disabled by default and should be used with caution.
Experimental support for the ConfigSync HTTP endpoints over Unix sockets with agent_ipc.use_socket: true (defaults to false).
Implements the flare command for the otel-agent binary. Now you can run otel-agent flare directly in the otel-agent container to get OTel flares.
Adds system info metadata collection for macOS end-user devices.
Adds system info metadata collection for Windows end-user devices.
Added GPU runtime discovery support for ECS EC2 environments. The Datadog Agent can now detect GPU device UUIDs assigned to containers by extracting the NVIDIA_VISIBLE_DEVICES environment variable from the Docker container configuration. This enables GPU-to-container mapping for GPU metrics without requiring the Kubernetes PodResources API, which is not available in ECS environments.
After falling back to TCP, the Logs Agent periodically retries to establish HTTP and upgrades the connection once HTTP connectivity is available.
Container logs now include a LogSource tag indicating whether each log message originated from stdout or stderr. This applies to logs parsed via Docker and Kubernetes CRI runtimes.
Added paging file metrics to the Windows memory check for pagefile.sys usage.
Add a new global_view_db variable to AWS Autodiscovery templates. By default this is the value of the datadoghq.com/global_view_db tag on the instance or cluster.
Add NotReady endpoint processing to be on par with EndpointSlices processing.
The agentprofiling check now retries flare generation 2 times with exponential backoff (1 minute after first failure, 5 minutes after second failure) when flare creation or sending fails. This improves reliability when encountering transient failures during flare generation.
Adds a kubernetes_kube_service_new_behavior flag (default false) to alter kube_service tag behavior. If the flag is set to true, the kube_service tag is attached unconditionally. Previously, the tag was only attached when the Kubernetes service had the status Ready.
APM: Add custom protobuf encoder for trace writer v1 with string compaction to reduce payload size.
Extended the autodiscovery secret resolver to support refreshing secrets.
Agents are now built with Go 1.25.7.
The datadog-installer setup command now prints human-readable errors instead of mixing JSON and text.
Added GPUDeviceIDs field to the workloadmeta Container entity to store GPU device UUIDs. This field is populated by the Docker collector in ECS environments from the NVIDIA_VISIBLE_DEVICES environment variable (e.g., GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx).
The GPU collector now uses GPUDeviceIDs from workloadmeta as the primary source for GPU-to-container mapping in ECS, with fallback to procfs for regular Docker environments and PodResources API for Kubernetes.
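The extraction step can be sketched as follows, assuming the container environment is available as a list of KEY=VALUE strings (as in the Docker container configuration). This is an illustration, not the collector's code: the function name is hypothetical, and ignoring non-UUID values such as all, none, or device indices is an assumption of this sketch.

```python
from typing import List


def gpu_device_ids(env: List[str]) -> List[str]:
    """Extract GPU UUIDs from a container's NVIDIA_VISIBLE_DEVICES variable."""
    for entry in env:
        name, sep, value = entry.partition("=")
        if sep and name == "NVIDIA_VISIBLE_DEVICES":
            # Keep only UUID-style entries (GPU-...); values such as "all",
            # "none", or numeric indices are skipped in this sketch.
            return [item.strip() for item in value.split(",")
                    if item.strip().startswith("GPU-")]
    return []


ids = gpu_device_ids([
    "PATH=/usr/bin",
    "NVIDIA_VISIBLE_DEVICES=GPU-aaaa, GPU-bbbb",
])
```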
GPU: add new tag gpu_type to the GPU metrics to identify the type of GPU (e.g., a100, h100).
Improve eBPF conntracker support by using alternate probes when the primary probe is unavailable, enabling compatibility with GKE Autopilot and other environments running Google COS.
The logs.dropped metric now tracks dropped logs for both TCP and HTTP log transports. Previously, this metric was only available when using TCP transport. Customers can now monitor dropped logs with a single unified metric regardless of which transport protocol is configured, making it easier to detect and troubleshoot log delivery issues.
The logs agent now supports using start_position: beginning and start_position: forceBeginning with wildcard file paths. Previously, configurations like path: /var/log/*.log with start_position: beginning would fail validation. The agent's fingerprinting system, when enabled, prevents duplicate log reads during file rotation, making this combination safe to use.
Site config URLs are now lowercased for consistent handling.
APM: Add tags databricks_job_id, databricks_job_run_id, databricks_task_run_id, config.spark_app_startTime, config.spark_databricks_job_parentRunId to the default list of tags that are known to not be credit card numbers so they are skipped by the credit card obfuscator.
Add option to switch on/off Infra-Attribute-Processor for traces in the OTLP ingest pipeline.
otlp_config:
  traces:
    infra_attributes:
      enabled: false
These settings can be configured in the Agent config file or by using the environment variables.
The Datadog Agent now collects AWS Spot preemption events (requires IMDS access) as Datadog events.
Added network_config.dns_monitoring_ports, a list of ports on which Cloud Network Monitoring monitors DNS traffic.
Automatically tag, but don't aggregate, multiline logs. Logs are tagged with the number of other logs they could potentially be aggregated with.
Update the histogram helpers API in the pkg/opentelemetry-mapping-go/otlp/metrics package. The API now accepts pointers to the OTLP data points, and returns blank DDSketches when the pointer is nil.
Update image resolution attempt telemetry to include the tag specified in the configuration, and remove the registry and digest_resolution tags.
Windows: Add a new flare artifact agent_loaded_modules.json listing loaded DLLs with metadata (full path, timestamp, size, perms) and version info (CompanyName, ProductName, OriginalFilename, FileVersion, ProductVersion, InternalName). Keeps <flavor>_open_files.txt for compatibility.
agent diagnose show-metadata inventory-otel has been removed. To display DDOT metadata, you can query the datadog extension endpoint: http://localhost:9875/metadata.DD_APM_MODE environment variable was unset.image_tag tag when defining a container spec that uses both an image tag and a digest like nginx:1.23@sha256:xxx.ERROR_INVALID_FUNCTION from DeviceIoControl on Windows Server 2016) caused all disk metrics to be discarded, including successfully collected partition/usage metrics such as system.disk.total, system.disk.used, and system.disk.free. IO counter collection is now best-effort: known errors such as ERROR_INVALID_FUNCTION are logged at debug level, while unexpected errors are logged as warnings. Neither prevent partition metrics from being reported.DD_LOGS_ENABLED environment variable is honored again when running setup scripts, so Windows installs using the new installer flow properly. Sets logs_enabled in datadog.yaml.C:\Windows\SystemTemp\datadog-installer\rollback\InstallOciPackages.json file is present.logs.sent metric for the HTTP log transport to no longer increment when logs are dropped due to non-retryable errors. This ensures more accurate reporting of successfully delivered logs.datadog.agent.check_ready to always include the check_name tag value for Python checks.kubernetes_kube_service_new_behavior to kubernetes_kube_service_ignore_readiness to better reflect the behavior.Released on: 2026-02-23 Pinned to datadog-agent v7.76.0: CHANGELOG.
apm_config.instrumentation.injection_mode configuration option to control APM library injection method. Possible values are auto (default), init_container, and csi. The auto mode automatically selects the best injection mode (currently uses init containers). The init_container mode is the legacy method that copies APM libraries into pods using init containers. The csi mode mounts APM libraries directly into pods using the Datadog CSI driver. It is experimental and requires Cluster Agent 7.76+ and the Datadog CSI driver.internal. The full annotation is now: internal.apm.datadoghq.com/injection-errorinjectTracers function to use a modular, explicit mutation pattern. This improves code readability and maintainability. Edge case behavior may differ slightly, but overall functionality remains unchanged.Released on: 2026-02-17
1.25.7.
HELLO and MIGRATE Redis commands.
AUTH, all arguments passed to these commands will be obfuscated and replaced with ?.
Released on: 2026-02-17 Pinned to datadog-agent v7.75.4: CHANGELOG.
Released on: 2026-02-11
Released on: 2026-02-11 Pinned to datadog-agent v7.75.3: CHANGELOG.
Released on: 2026-02-04
Released on: 2026-02-04 Pinned to datadog-agent v7.75.2: CHANGELOG.
Released on: 2026-01-28
1.25.6.
Released on: 2026-01-28 Pinned to datadog-agent v7.75.1: CHANGELOG.
Released on: 2026-01-21
infrastructure_mode: end_user_device configuration option. When enabled, this mode automatically activates key monitoring features tailored for end-user devices including process collection, software inventory tracking, and notable events monitoring. These settings can still be individually overridden in the configuration file if needed.
Add a new azure_metadata_api_version configuration option to allow customers to specify the Azure Instance Metadata Service (IMDS) API version used by the Agent. The default value is now 2021-02-01. This setting can be configured via azure_metadata_api_version in datadog.yaml or the DD_AZURE_METADATA_API_VERSION environment variable.
The Agent's embedded Python has been upgraded from 3.13.10 to 3.13.11
Fixed a potential race condition in the Cloud Foundry CCCache locking mechanism by replacing custom lock management with singleflight. This change improves handling of concurrent cache misses.
Add the canonical version annotation to the image named internal.apm.datadoghq.com/[lang/injector]-canonical-version. This makes it easier to track the actual version of the image used in the cluster, instead of just a digest or mutable tag.
Dogstatsd named pipe on Windows is now read/writeable for everyone by default. This prevents an Access is denied error when opening a named pipe for dogstatsd server on a Windows Azure App Service Web app. Security descriptor for the named pipe can be customized via dogstatsd_windows_pipe_security_descriptor.
Detect connection issue when using FQDN in agent diagnose
Agents are now built with Go 1.25.5.
The datadog-secret-backend now allows implicit Vault authentication to be set as a config option or an env var.
Added a configurable max_file_read_size config option to file.yaml, file.json, & file.text to prevent OOM reads.
Added Microsoft Store apps to Windows Software Inventory integration.
Added a new boolean environment variable DD_OTELCOLLECTOR_GATEWAY_MODE for precise identification of the DDOT operating mode. The variable is automatically configured via the Helm chart or the Operator, or can be set manually. Acceptable string values are (case insensitive): "true", "false", "1", "0".
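The accepted values listed above can be illustrated with a small parser. This is only a sketch of the stated rule (case-insensitive "true", "false", "1", "0"); the function name parse_gateway_mode and the choice to raise on any other value are assumptions, not the Agent's actual behavior.

```python
def parse_gateway_mode(raw: str) -> bool:
    """Interpret a DD_OTELCOLLECTOR_GATEWAY_MODE-style string (hypothetical helper)."""
    value = raw.lower()  # the release note says matching is case insensitive
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    # Assumption: anything outside the four documented values is rejected.
    raise ValueError(f"unsupported boolean value: {raw!r}")
```

For example, parse_gateway_mode("TRUE") and parse_gateway_mode("1") both yield True, while a value such as "yes" is rejected in this sketch.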
The Discovery module is now enabled by default if system-probe is enabled. It can be disabled by setting discovery.enabled: false in system-probe.yaml, or by setting the DD_DISCOVERY_ENABLED environment variable to false.
The Agent's logger has been rewritten with a more modern library to improve security and performance. No visible change is expected for users. In case of issues, the previous logger can still be used by setting log_use_slog to false in the Agent configuration. This configuration will be removed in a future release.
Enable the orchestrator_explorer.kubelet_config_check.enabled by default.
Bump OpenTelemetry Collector dependencies to v0.141.0/v1.47.0
OTLP spans describing an HTTP error without an explicit error message will now fall back to one with a description, e.g. "500 Internal Server Error" instead of just "500". Users who relied on the error message to extract the status code should use http.response.status_code instead.
Additionally, the error message is no longer sourced from the deprecated http.status_text attribute. This behavior can be overridden by explicitly setting the span's status message.
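The fallback described above can be sketched as follows. This is an illustration of the stated behavior only, not the Agent's implementation; the function name error_message and its parameters are hypothetical, and the reason phrase is taken from Python's standard http.HTTPStatus table.

```python
from http import HTTPStatus
from typing import Optional


def error_message(status_code: int, status_message: Optional[str] = None) -> str:
    """Build an error message for an HTTP error span (hypothetical helper)."""
    # An explicitly set span status message always wins.
    if status_message:
        return status_message
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # Unknown code: fall back to the bare number.
        return str(status_code)
    return f"{status_code} {phrase}"
```

With no status message, error_message(500) gives "500 Internal Server Error"; passing status_message="upstream timed out" returns that string unchanged.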
On Windows, adds process name to live processes via file properties.
Single Step Instrumentation now uses the Python tracer major version 4 by default. Customers instrumenting Python applications through SSI should review the [4.0.0](https://github.com/DataDog/dd-trace-py/releases/tag/v4.0.0) release notes and the [compatibility guide](https://docs.datadoghq.com/tracing/trace_collection/compatibility/python/) to ensure their Python applications are compatible.
Add flare support for workloadfilter component.
docker.cpu.shares metric values on cgroups v2 systems running runc >= 1.3.2 or crun >= 1.23. The new container runtimes use a different formula to convert CPU shares to cgroup v2 weight, which caused the Agent to report wrong values (e.g., 2597 instead of 1024 for default shares). The Agent now auto-detects which conversion formula the runtime uses and applies the correct inverse transformation.
aws-us-gov) and China (aws-cn) regions. Previously, only the standard aws partition was accepted, causing ECS metadata extraction to fail for customers running the Datadog Agent in GovCloud or China regions. This resulted in empty region and account ID values, breaking ECS monitoring for these customers.
ntp.offset metric using the timestamp returned by the NTP server rather than the local system clock. This restores the behavior present in Agent v5 and prevents incorrect metric alignment when host clocks are skewed.
This feature is currently in development and is protected under the feature flag:
cluster_checks.crd_collection
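The docker.cpu.shares entry above quotes 2597 being reported instead of 1024. That figure is what falls out if a cgroup v2 weight of 100 (assumed here to be what the newer runtimes write for the default 1024 shares) is pushed through the inverse of the older linear shares-to-weight mapping. The sketch below only reproduces that arithmetic; the function names are hypothetical, and the linear formula is the legacy conversion as commonly documented for cgroup v2 runtimes, not code taken from the Agent.

```python
def shares_to_weight_linear(shares: int) -> int:
    """Legacy linear mapping from CPU shares [2, 262144] to cgroup v2 weight [1, 10000]."""
    return 1 + ((shares - 2) * 9999) // 262142


def weight_to_shares_linear(weight: int) -> int:
    """Inverse of the legacy linear mapping (integer arithmetic, so not exact)."""
    return 2 + ((weight - 1) * 262142) // 9999


# Assumption: a newer runtime stores weight 100 for the default 1024 shares.
# Applying the legacy inverse to that weight yields the wrong value from the note.
misreported = weight_to_shares_linear(100)
```

Here misreported evaluates to 2597, which is why the inverse has to match the formula the runtime actually used.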
For up-to-date docs, check out the secret-backend changelog, and the Datadog Secrets Management documentation
Refactored Cloud Foundry CCCache and BBSCache to use dependency injection to improve tests reliability and maintainability.
Released on: 2026-01-21 Pinned to datadog-agent v7.75.0: CHANGELOG.
The Datadog Cluster Agent's mutating webhooks (part of the [Admission Controller](https://docs.datadoghq.com/containers/cluster_agent/admission_controller/?tab=datadogoperator)) previously included Single Step Instrumentation (SSI) settings in their default webhook label selectors. These SSI-specific settings, apm_config.instrumentation.enabled and apm_config.instrumentation.enabled_namespaces, have been removed.
For those using Single Step Instrumentation, no action is required and no behavior changes. For those using the config or tagsfromlabels webhooks for manually instrumented applications, behavior remains consistent with the [documented configuration](https://docs.datadoghq.com/containers/cluster_agent/admission_controller/?tab=datadogoperator#apm-and-dogstatsd). Users that were unintentionally relying on the SSI settings without using SSI should add the appropriate pod label or enable mutate_unlabelled to preserve the previous behavior.
Released on: 2026-01-12
Released on: 2026-01-12 Pinned to datadog-agent v7.74.1: CHANGELOG.
Released on: 2026-01-07
Added the agent workloadfilter CLI command, which shows the active workload filter bundles, their load status, and the effective filter configuration.
Adds new Cluster Autoscaling controller in Cluster Agent.
Adds the hpflare extension, which provides flare information for the host-profiler.
Introduce a new Health Platform component that provides a unified way to detect, collect, and report host system health issues. The component runs health checks periodically and exposes telemetry for monitoring detected problems.
The datadog-agent now uses datadog-secret-backend v1.4.0 which supports GCP secrets via Google Secret Manager.
Create an inferred span to represent the entire duration of a Cloud Run Job task.
Checks can be scheduled only once with run_once configuration
Data Streams Kafka actions perform actions on Kafka clusters
gpu: add count metrics for NVIDIA ECC errors
Logs Agent is able to restart its pipeline in place to enable switching between endpoint types (HTTP/TCP) without full Agent restart.
The OTEL logs agent exporter now supports exporting Kubernetes orchestrator data. The exporter consumes Kubernetes resource manifests from the k8sobjectsreceiver and forwards them to Datadog's orchestrator endpoint. This enables Kubernetes cluster visibility through the OTEL agent pipeline.
Use the OrchestratorConfig section to configure cluster name, API key, site, endpoint, and enablement toggle.
The SNMP integration now automatically performs a default device scan for each configured and auto-discovered device.
Adds a new argument, DD_INSTALL_ONLY, to the Windows MSI. Set DD_INSTALL_ONLY=true to install the Agent without starting the services.
The Agent's embedded Python has been upgraded from 3.13.7 to 3.13.10
Provide FIPS-compliant builds for the Datadog distribution of OpenTelemetry (DDOT).
Expose new OTLP -> DD semantic transformation methods in the opentelemetry-mapping-go package.
Adding an 'instance-type' field to the inventoryhost payload.
Add Docker log permissions health check that detects when the Agent cannot access container log files due to restrictive filesystem permissions. The check provides remediation guidance and an optional script to fix permission issues.
Agents are now built with Go 1.24.10.
Agents are now built with Go 1.24.11.
Add host tag to associate host to a NodePool
Add annotation to associate a replica NodePool to its target
Move the chmod operation for the dogstatsd binary from runtime (entrypoint.sh) to build time (Dockerfile).
Expand logs file rotation analytics to include more detailed information using telemetry metrics.
Add fingerprint configuration information to the Logs Agent status page.
Add remote config ID tagging to events generated by kafka_action integration for easy UI filtering
Optimized the Kubernetes State Metrics (KSM) check by replacing fmt.Sprintf() calls with direct string concatenation in the ownerTags() function. This reduces memory allocation churn and saves approximately 20% CPU usage for the KSM check.
The kubelet pod list cache is now disabled by default to reduce staleness. The Agent lists pods from the kubelet every 5s. Users who explicitly set kubelet_cache_pods_duration retain their existing behavior (the Agent lists pods approximately every 5 + cache duration seconds).
[pkg/netflow] Add a new config option network.netflow.aggregator_max_flows_per_flush_interval that controls the maximum number of flows to be sent in a flush interval. Only the top flows, ranked by number of bytes in the period, are sent, up to the value in the config.
Add container metric support for any CRI compliant runtime specified in the cri_socket_path configuration.
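The top-flows selection described above can be sketched as a top-N pick by byte count. This is a minimal illustration under assumptions: the Flow type, its fields, and the function name are hypothetical, and the real aggregator may break ties or order its output differently.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Flow:
    key: str         # hypothetical flow identifier
    byte_count: int  # bytes seen for this flow during the flush interval


def top_flows_by_bytes(flows: Iterable[Flow], max_flows: int) -> List[Flow]:
    """Keep at most max_flows flows, preferring those with the most bytes."""
    return heapq.nlargest(max_flows, flows, key=lambda f: f.byte_count)
```

For instance, with flows of 10, 500, and 200 bytes and max_flows set to 2, only the 500-byte and 200-byte flows would be kept.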
Openmetrics-based checks using send_histograms_buckets now handle histogram resets without emitting a warning.
Optimize auto multiline detection JSON aggregator to improve performance and reduce memory usage for single line JSON messages
Optimize memory allocation in the KSM Core check by preallocating metric slices and skipping empty metrics in the store's Push() method. This should reduce the memory usage of the KSM check by 15% to 20%, improving performance in clusters with large numbers of pods.
The otel-agent can now be told not to contact the core-agent by setting DD_CMD_PORT to 0
Add support for batch settings in the OTLP ingest endpoint (logs & metrics).
These settings can be configured in the Agent config file or by using the environment variables.
Change serverless-init default log level to error.
Skip noisy Kubernetes metadata error logs in serverless-init.
Change startup failure log level from debug to error.
Increase the default EVP proxy maximum payload size from 5 MB to 10 MB in the Trace Agent.
Fixes missing tags at container startup by buffering spans and APM stats until Kubernetes metadata is resolved.
The Agent can now automatically trigger a secret refresh when an API key expires or becomes invalid, detected either through 403 responses or periodic API key validation. The refresh rate is throttled by the secret_refresh_on_api_key_failure_interval configuration option (in minutes).
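The throttling mentioned above can be pictured as a simple minimum-interval gate. This is a sketch under assumptions, not the Agent's code: the class name RefreshThrottle, its method, and the injectable clock are all hypothetical, and only the idea of an interval expressed in minutes comes from the release note.

```python
import time
from typing import Callable, Optional


class RefreshThrottle:
    """Allow at most one refresh per interval (hypothetical helper)."""

    def __init__(self, interval_minutes: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval_s = interval_minutes * 60
        self._clock = clock
        self._last: Optional[float] = None  # time of the last permitted refresh

    def should_refresh(self) -> bool:
        now = self._clock()
        if self._last is not None and now - self._last < self._interval_s:
            return False  # still inside the throttle window
        self._last = now
        return True
```

A caller would check should_refresh() each time an API key failure is observed and only trigger the secret refresh when it returns True.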
Enforces that the DDOT service is stopped by the core Agent service.
Included tags for TLS offered versions and TLS chosen version as part of TCP connections stats on Windows.
health_platform.issues_detected tagged by health_check_id to track the number of detected health issues over time.
Released on: 2026-01-07 Pinned to datadog-agent v7.74.0: CHANGELOG.
kubernetes_state_core check into multiple shards based on resource type groups (pods, nodes, others), enabling parallel execution across multiple Cluster Check Runners.
Released on: 2025-12-31
Fixed device-mapper (LVM) device tagging in the diskv2 check to match Python psutil behavior. Previously, devices were reported as dm-X (e.g., device:dm-0) instead of their friendly /dev/mapper/* names (e.g., device:ocivolume-root). This ensures backward compatibility with the Python disk check and preserves existing dashboards and monitors.
Fix an issue introduced in 7.73.0 that can cause the MSI to overwrite the site option in datadog.yaml with the default value of datadoghq.com.
This issue impacts users who do not provide the SITE option to the MSI when upgrading AND who have an error in their datadog.yaml file that prevents the MSI from reading the existing site option (MSI log contains ReadConfig. User config could not be read).
This issue also impacts users of datadog-installer.exe and Install-Datadog.ps1, introduced in 7.72.0, who do not provide the DD_SITE environment variable when upgrading.
Released on: 2025-12-31 Pinned to datadog-agent v7.73.3: CHANGELOG.
Released on: 2025-12-23
PartitionsWithContext errors gracefully instead of failing entirely. When some partitions fail to load, the check continues collecting metrics for the partitions that succeeded. This aligns the Go implementation with the Python check behavior.
device: tag by stripping backslashes and lowercasing, matching the Python disk check behavior. This ensures customers that migrated from Python to Go disk check see consistent device: tag values (e.g., C:\ becomes c:).
orchestrator_kubelet_config check when the orchestrator_explorer.kubelet_config_check.enabled config is set to false.
Released on: 2025-12-23 Pinned to datadog-agent v7.73.2: CHANGELOG.
Released on: 2025-12-17
1.24.11.
Released on: 2025-12-17 Pinned to datadog-agent v7.73.1: CHANGELOG.
Released on: 2025-12-10
Replace batch processor with exporter helper for OTLP ingest due to the upcoming end-of-life for batch processor. More details: https://github.com/open-telemetry/opentelemetry-collector/issues/8122
Remote Agent Management now creates a new directory that is at the same level as the current Agent configuration directory.
This directory is used during remote configuration updates and is deleted after the update is complete.
Added a new core check to send raw Kubelet configuration manifests to the Kubernetes Orchestrator.
Added comprehensive support for AWS ECS Managed Instances, including automatic deployment mode detection, hostname resolution for sidecar deployments, and validation logic to prevent misconfigured deployments.
Configure filtering for collection of autodiscovered metrics and logs through CEL-based rules using cel_workload_exclude.
Collect container metrics for ECS Managed Instances when running in sidecar mode.
APM: A more efficient trace payload encoding through the /v1.0/traces endpoint has been added.
Update JMXFetch to 0.51.0 to add configuration-level dynamic tags for JMX attribute values via dynamic_tags
Remote Agent Management is now enabled by default for Agents running on Linux and Windows hosts. This feature allows you to remotely upgrade and configure the Agent from the Datadog UI in Fleet Automation.
To disable, set remote_updates to false in the Agent configuration file.
The Datadog Installer now supports installing the datadog-apm-inject package on Windows systems.
Adds kubernetes_state.daemonset.rollout_duration metric to the KSM check.
Implement check filtering in the scheduler and CLI to enforce infrastructure basic mode restrictions. When running in basic mode (infrastructure_mode: "basic"), only core system checks (cpu, disk, memory, network, uptime, load, io, file_handle, ntp, system_core, telemetry) are allowed to execute. Additional checks can be allowlisted via the allowed_additional_integrations configuration option.
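The basic-mode restriction described above amounts to an allowlist test. The sketch below illustrates that rule only; the function name and signature are hypothetical, while the core check names and the allowed_additional_integrations option come from the note itself.

```python
from typing import Iterable

# Core system checks named in the release note for basic mode.
CORE_CHECKS = frozenset({
    "cpu", "disk", "memory", "network", "uptime", "load", "io",
    "file_handle", "ntp", "system_core", "telemetry",
})


def is_check_allowed(check_name: str, infrastructure_mode: str,
                     allowed_additional_integrations: Iterable[str] = ()) -> bool:
    """Decide whether a check may run (hypothetical helper)."""
    if infrastructure_mode != "basic":
        return True  # the restriction only applies in basic mode
    return (check_name in CORE_CHECKS
            or check_name in set(allowed_additional_integrations))
```

In basic mode, is_check_allowed("redisdb", "basic") is False in this sketch unless "redisdb" is listed in allowed_additional_integrations.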
The Agent's embedded Python has been upgraded from 3.13.7 to 3.13.10
Network Path Collector (network traffic paths) now performs traceroutes using domain names instead of IP addresses.
This release refactors the ECS workloadmeta collector architecture to clearly separate ECS launch type (EC2 vs Fargate) from agent deployment mode (daemon vs sidecar). This improves code organization, reduces duplication, and helps future Managed Instances support.
KSM now supports using a wildcard to collect all resource labels/annotations as tags on metrics.
Added CCRID (Canonical Cloud Resource ID) support for Oracle Cloud Infrastructure hosts.
Add helpers for translating OTLP duration histograms to DDSketch in the pkg/opentelemetry-mapping-go/otlp/metrics package.
APM: The Trace Agent now omits infrequently used statistics when their values are zero, reducing overhead.
This can be overridden by setting the new configuration option apm_config.send_all_internal_stats to true.
Agents are now built with Go 1.24.9.
The Cluster Agent now enables both DD_CLUSTER_CHECKS_ADVANCED_DISPATCHING_ENABLED and DD_CLUSTER_CHECKS_REBALANCE_WITH_UTILIZATION by default. These options are now set to true in both the configuration template and the code, improving cluster check dispatching and balancing based on node utilization out-of-the-box. To disable these features, a user must now explicitly set them to false with the following config options:
  - name: DD_CLUSTER_CHECKS_ADVANCED_DISPATCHING_ENABLED
    value: "false"
  - name: DD_CLUSTER_CHECKS_REBALANCE_WITH_UTILIZATION
    value: "false"
Enable the Go disk and network core checks by default for Windows and Linux. These are direct ports of the existing Python disk and network checks and allow the Python runtime to be lazy loaded when other integrations are enabled. To use the Python versions instead, disable the Go checks with use_diskv2_check and use_networkv2_check respectively, and set the loader in your configuration.
Python runtime will now be lazy loaded when there are no Python integrations configured. This can be disabled by setting python_lazy_loading: false in your configuration.
Allow check configurations to be matched to services using CEL selectors in Autodiscovery. This allows for more granular targeting of configurations to services based on their metadata.
Adds the count of total GPU devices to the telemetry metrics emitted to Datadog.
GPU: emit count metrics for NVIDIA Xid errors
Adds DD_INFRASTRUCTURE_MODE install option to the datadog-installer-x86_64.exe installer and the Windows MSI installer. Set DD_INFRASTRUCTURE_MODE to configure the infrastructure_mode configuration option at installation.
The infraattributes processor can now be run when the Datadog Exporter is not configured.
Add --enable and --disable commands to the IIS .NET APM instrumentation management script on Windows
Windows: Adds a PURGE argument to the MSI to remove all OCI packages during uninstallation.
The Workload Protection's activity dump functionality on Linux has been improved to reduce its impact on processes that use very large amounts of memory.
Cache result of TagsToString() in serverless-init to improve CPU performance.
The DDOT service runs as ddagentuser.
Released on: 2025-12-10 Pinned to datadog-agent v7.73.0: CHANGELOG.
This change removes support for v1 of the auto-instrumentation webhook used for Single Step Instrumentation. The v2 implementation, which has been the default since Agent v7.57.0, is a drop-in replacement. This setting was never exposed in Helm or the Datadog Operator. If you previously set the DD_APM_INSTRUMENTATION_VERSION environment variable on the Cluster Agent, it is now ignored.
If you use a private registry, add the apm-inject container to your registry before upgrading. No action is required for other users. For details on using private registries, see Use a private container registry.
Customers using Single Step Instrumentation with target-based workload selection can now use language detection. Language detection greatly reduces startup time when all default libraries are configured for a target.
A target is eligible for language detection if a target has no defined ddTraceVersions or if ddTraceVersions matches the default set of SDKs. Once a language has been determined for a deployment, subsequent deploys only use the SDKs necessary for the detected language.
kube-system and the Datadog Agent's namespace) resources from Admission Controller mutation webhooks. This prevents mutation webhooks from unnecessarily intercepting system namespace resources, reducing misleading warnings or logs, and improving clarity about which resources are actually mutated.