Agent
Prelude
Released on: 2025-08-14
Upgrade Notes
- The Cilium conntracker is now enabled by default in system-probe and expects /sys/fs/bpf to be mounted at /host/sys/fs/bpf in containerized environments. The conntracker, if enabled, fails to load unless this mount is provided, and system-probe's log file contains the line "not loading cilium conntracker since cilium maps are not present". Users who have enabled this feature can either upgrade to the latest Helm chart or add this mount to their container.
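For deployments that do not use the Helm chart, the mount could be added with a pod spec fragment along the following lines. This is a minimal sketch: the volume name bpf-maps and the container name system-probe are assumptions made for illustration, and only the host path /sys/fs/bpf and the mount path /host/sys/fs/bpf come from the note above.

```yaml
# Hypothetical fragment of the Agent pod spec (names are illustrative).
spec:
  containers:
    - name: system-probe            # assumed container name
      volumeMounts:
        - name: bpf-maps            # assumed volume name
          mountPath: /host/sys/fs/bpf
  volumes:
    - name: bpf-maps
      hostPath:
        path: /sys/fs/bpf           # host BPF filesystem
```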
New Features
- Adds additional information and data related to the setsockopt hook.
  - Socket Information:
    - Socket type
    - Socket family
    - Socket protocol
  - Filter Information:
    - Disassembled filter
    - Filter hash
- You can now set the JAVA_TOOL_OPTIONS that JMXFetch uses by setting the jmx_java_tool_options configuration option in the datadog.yaml config file. This allows you to pass additional JVM options to JMXFetch, such as memory settings or system properties.
- Adds a TracerPayloadModifier to the Trace Agent.
- pkg/trace/api: Container tags hash is returned as a response header of the info endpoint.
- Added a new config option, include_ephemeral_containers, to collect Kubernetes ephemeral containers. The option is disabled by default. When enabled, the Agent reports container.* and kubernetes.* metrics for ephemeral containers. It also collects logs and schedules checks for ephemeral containers when configured to do so.
- Data Streams Monitoring: Adds new feature allowing users to retrieve messages from Kafka topics.
- The collect_gpu_tags config flag is now enabled by default. The Agent now collects an additional gpu_host host tag for all hosts that have NVIDIA GPUs.
- Added a new processing rule to omit truncated logs from being sent to ingest.
- GPU: Add GPM collector for Hopper and newer NVIDIA GPUs.
- Adds VPN tunnels and route table data collection to SNMP. This can be enabled or disabled using the collect_vpn config.
- The NTP check on Windows now discovers the primary domain controller (PDC) on domain-joined hosts when use_local_defined_servers is enabled. If the PDC is unavailable, it automatically falls back to registry-defined servers. The check now performs order-insensitive server list comparisons, reduces log noise, and avoids using itself as a time source when running on a domain controller.
- [Preview] The Agent can now connect to the AWS SSM, AWS Secrets, HashiCorp Vault, and Azure Key Vault secret management solutions to resolve secrets without requiring a user-provided binary. For this, two new settings are introduced: secret_backend_type and secret_backend_config. For more information, see: https://docs.datadoghq.com/agent/configuration/secrets-management
- Added support in DDOT for the datadogexporter.proxy_url configuration option. This allows users to specify proxy settings for DDOT with the collector configuration.
- Windows: Add CRL monitoring to the Windows Certificate Store integration.
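As an illustration of two of the new options above, a datadog.yaml fragment might look like the sketch below. The JVM option values are placeholders, and placing include_ephemeral_containers at the top level of the file is an assumption, since the note does not say where the option lives.

```yaml
# datadog.yaml (sketch; values are illustrative)

# Passed to JMXFetch as JAVA_TOOL_OPTIONS. The flags shown are examples only.
jmx_java_tool_options: "-Xmx256m -Dexample.property=value"

# Disabled by default. Top-level placement is an assumption.
include_ephemeral_containers: true
```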
Enhancement Notes
- The serverless-init build uses the new TracerPayloadModifier to add Function Tags to the _dd.tags.function tag of the Tracer Payload to support serverless trace tagging.
- Agents are now built with Go 1.24.5.
- The user is now able to specify which features they want enabled inside of the converter. Previously, the user would have to either enable or disable everything.
- DDOT now uses zstd compression for logs by default.
- If a check has both a Go and a Python version, the Go version now has priority by default. This change should not have any visible impact, but if needed, you can disable this behavior by setting prioritize_go_check_loader to false.
- GPUM: the "status" command now returns the status of the system-probe part of GPU monitoring.
- Added new DogStatsD configuration option "dogstatsd_flush_incomplete_buckets". When enabled, DogStatsD will flush all received metrics during shutdown, regardless of which time-interval based bucket they belong to.
- Agent integration metadata payloads now include the JMX integrations.
- Allow users to configure the HTTP timeout for the Logs Agent.
- The Logs Agent no longer falls back to TCP when logs_config.logs_dd_url is configured with an http(s):// prefix.
- If the Oracle can_connect check is critical, also set the can_query check to critical.
- Display the number of times each log processor has been used in the Logs Agent status endpoint.
- Reduce binary size by removing the Sensitive Data Scanner (SDS) from the logs agent.
- OTLP spans now support the db.namespace semantic attribute and map it to db.name for DBM support.
- Generate a more detailed warning when the Logs Agent tailer limit is reached.
- Improved the granularity of the Logs Agent pipeline monitor to track the capacity of each individual component of the pipeline.
- Remote Agent management operations on Windows now attempt to force stop the Agent services if they do not respond to Service Control Manager requests.
- Remote Agent management on Windows now automatically retries when the MSI returns error 1618 (ERROR_INSTALL_ALREADY_RUNNING).
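Several of the enhancements above are controlled through configuration. The fragment below is a sketch of how they might appear in datadog.yaml. The top-level placement of the first two options is an assumption, and the logs URL is a placeholder rather than a real endpoint.

```yaml
# datadog.yaml (sketch; placement and values are illustrative)

# Restore the previous loader priority if the Go-first default causes issues.
prioritize_go_check_loader: false

# Flush all received metrics at shutdown, regardless of their time bucket.
dogstatsd_flush_incomplete_buckets: true

logs_config:
  # With an http(s):// prefix, the Logs Agent no longer falls back to TCP.
  logs_dd_url: "https://logs.example.com:443"   # placeholder URL
```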
Bug Fixes
- Correctly respect the ecs_collect_resource_tags_ec2 variable when calling the ECS Agent. Tags are now cached to reduce the burden on the ECS Agent, and error responses from the ECS Agent are now logged.
- Fix a panic in Docker streams log parsing when stream messages are truncated on transmission.
- Fix the cgroup reader bug that would prevent the generic container check from sending metrics when the Agent encountered a permission error.
- Fixes invalid logs compression error in DDOT, sets DDOT logs compression to gzip.
- Add support for selecting the endpoint resolution method using advanced AD identifiers in Kubernetes endpoint check configurations defined in files or configmaps. This enables static pod check configurations to correctly resolve the endpoint by setting resolve method to "ip".
- Fixed the serializer exporter for the OSS Collector, which was not setting the correct proxy variables when sending metric data.
- Fixed Windows installer overwriting install_info from setup scripts. When using Fleet Automation setup scripts, the subsequent MSI installation now skips writing install_info via a new SKIP_INSTALL_INFO flag, preserving the original setup script installation method tracking.
- Fix Jetson check to correctly parse the output of tegrastats for Orin boards.
- Fix incorrect container.memory.kernel value when running with Kernel >= 5.19 and cgroupv2.
- Breaking change - Fixes the Oracle service name tag to be service_name instead of service. This corrects the conflict with the APM service tag. This is a breaking change for any users who had been relying on the service tag to be set to the Oracle service name. The service tag can still be set explicitly in the tags configuration if needed.
- Metrics sent from the process check on the core agent now have the host tag.
- GPU: fix a bug where the device assigned to a process could be wrong if the process updates the CUDA_VISIBLE_DEVICES environment variable at runtime.
- GPUM: fix Kubernetes device allocation detection in Google Kubernetes Engine.
- The NTP check will no longer fail to start if the initial discovery of local NTP servers fails at agent startup.
- Limit the HTTP timeout on startup to 5 seconds for the Logs Agent.
- Prevent the process component from running in the cluster worker.
- Removes an extra copy of agent.exe from the Windows container.
- Remote Agent management operations on Windows now attempt to restart the Agent services after failing to stop the services or uninstall the Agent.
- Fix Cgroup namespace not properly detected in Workload Protection, leading to incorrect container ID resolution and misqualified detections.
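For users affected by the Oracle service tag change above, the service tag can be restored through the instance's tags configuration. The fragment below is a sketch only: the tag value is hypothetical, and the instance's existing connection settings are omitted.

```yaml
# Oracle check instance configuration (sketch; connection settings omitted)
instances:
  - tags:
      # Hypothetical value. Sets the service tag explicitly, since the
      # Oracle service name is now reported as service_name.
      - "service:my-oracle-service"
```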
Other Notes
- Add a new metric to the Agent telemetry for the startup and running states. This will help us track the startup and running states of the Agent.
- Transparent Huge Pages (THP) usage is now disabled by default in the System Probe and Security Agent. To re-enable their usage, set the system_probe_config.disable_thp or security_agent.disable_thp configuration options to false.
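Re-enabling THP could look like the sketch below. The nesting follows the dotted option names in the note above. Which configuration file holds each section depends on the deployment and is not stated here.

```yaml
# Sketch: set disable_thp to false to re-enable THP usage.
system_probe_config:
  disable_thp: false

security_agent:
  disable_thp: false
```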
Datadog Cluster Agent
Prelude
Released on: 2025-08-14
Pinned to datadog-agent v7.69.0: CHANGELOG.
Enhancement Notes
- The auto-instrumentation webhook supports labels and annotations as tags configuration. If any of the label or annotation mappings for the incoming pod correspond to Universal Service Tags (service, env, or version), the webhook will also add the corresponding UST environment variable to the pod (DD_SERVICE, DD_ENV, or DD_VERSION).
- The auto-instrumentation webhook will now set a default security context for init containers if the pod is in a namespace with a restricted security context. This can still be overridden by setting the environment variable DD_ADMISSION_CONTROLLER_AUTO_INSTRUMENTATION_INIT_SECURITY_CONTEXT.
- Collect agent version in orchestrator check.
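Overriding the init container security context might be done with an environment variable on the Cluster Agent container, as sketched below. The value format shown (a JSON-encoded Kubernetes security context) and the specific fields are assumptions, not confirmed by the notes above, so check the Cluster Agent documentation before relying on them.

```yaml
# Cluster Agent container env (sketch; value format is an assumption)
env:
  - name: DD_ADMISSION_CONTROLLER_AUTO_INSTRUMENTATION_INIT_SECURITY_CONTEXT
    value: '{"runAsNonRoot": true, "allowPrivilegeEscalation": false}'
```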