Before upgrading your Temporal Cluster to v1.28.0, you must upgrade your core schemas to the following:
Please see our upgrade documentation for the necessary steps to upgrade your schemas.
Deprecating old Versioning APIs: The following APIs related to previous versions of Worker Versioning are deprecated.
The following APIs, related to the December 2024 pre-release of Worker Versioning, have been deprecated and are no longer supported:
The following APIs are now deprecated and will be removed once the latest APIs reach General Availability in the coming months:
Update-With-Start sends a Workflow Update that checks whether an already-running Workflow with that ID exists. If it does, the Update is processed. If not, it starts a new Workflow Execution with the supplied ID. When starting a new Workflow, it immediately processes the Update.
Update-With-Start is great for latency-sensitive use cases.
Multiple callers can now start operations backed by a single workflow. When the handler tries to start a workflow that is already running with the “use existing” conflict policy, the server attaches the caller’s callback to the running workflow. When the workflow completes, the server calls all attached callbacks to notify the callers with the workflow result.
Here’s an example using the Go SDK (available in v1.34.0+):

```go
import (
    "context"

    "github.com/nexus-rpc/sdk-go/nexus"
    enumspb "go.temporal.io/api/enums/v1"
    "go.temporal.io/sdk/client"
    "go.temporal.io/sdk/temporalnexus"
)

sampleOperation := temporalnexus.NewWorkflowRunOperation(
    "sample-operation",
    SampleWorkflow,
    func(
        ctx context.Context,
        input SampleWorkflowInput,
        options nexus.StartOperationOptions,
    ) (client.StartWorkflowOptions, error) {
        return client.StartWorkflowOptions{
            // The Workflow ID is used as the idempotency key.
            ID: "sample-workflow-id",
            // With the USE_EXISTING conflict policy, if a workflow with the
            // same ID is already running, the callback is attached to it;
            // otherwise, a new workflow is started.
            WorkflowIDConflictPolicy: enumspb.WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING,
        }, nil
    },
)
```
The handler workflow returns a RequestIdReferenceLink to the caller. This is an indirect link to the history event that attached the callback in the handler workflow. Links can provide information about the handler workflow to the caller. To resolve the exact history event, a DescribeWorkflowExecutionResponse now includes a RequestIdInfos map in its WorkflowExtendedInfo field. To enable RequestIdReferenceLink, set the dynamic config history.enableRequestIdRefLinks to true (this might become enabled by default in a future release).
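As a sketch (following Temporal's standard dynamic-config YAML layout; adjust for your deployment), the entry is a one-liner:

```yaml
# Sketch: enable RequestIdReferenceLink generation.
history.enableRequestIdRefLinks:
  - value: true
```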
Nexus links were previously stored in a history event that was not directly associated with the callback they arrived with. Now the server stores Nexus links together with the callback. With this direct association, you can, for example, easily determine from the callback which caller triggered the Nexus workflow by following these links. This feature requires the latest version of the Go SDK (v1.35.0+) and Java SDK (v1.30.0+).
The server now supports Nexus operation cancellation types. These are specified when starting an operation and indicate what should happen to Nexus operations when the parent context that started them is cancelled. To use them, you must be using an SDK version that supports them. Available cancellation types are:
- Abandon - Do not request cancellation of the operation.
- TryCancel - Request cancellation of the operation and immediately report cancellation to callers.
- WaitRequested - Request cancellation and wait until the cancellation request has been received by the operation handler.
- WaitCompleted - Request cancellation and wait for the operation to complete. The operation may or may not complete as cancelled. This is the default, and the behavior for server versions <1.28.

For the WaitRequested type to work, you must set the dynamic config component.nexusoperations.recordCancelRequestCompletionEvents to true (default: false).
The following Worker Versioning APIs graduated into Public Preview stage. Production usage is encouraged but note that limited changes might be made to the APIs before General Availability in the coming months.
Using Worker Versioning: Find instructions in https://docs.temporal.io/worker-versioning.
Operator notes:
- frontend.workerVersioningWorkflowAPIs (default: true)
- system.enableDeploymentVersions (default: true)
- matching.maxDeployments controls the maximum number of worker deployments that the server allows to be registered in a single namespace (default: 100; safe to increase to much higher values).
- matching.maxVersionsInDeployment controls the maximum number of versions that the server allows to be registered in a single worker deployment (default: 100; unsafe to increase beyond a few hundred).
- matching.maxTaskQueuesInDeploymentVersion controls the maximum number of task queues that the server allows to be registered in a single worker deployment version (default: 100; unsafe to increase beyond a few hundred).
- matching.wv.VersionDrainageStatusVisibilityGracePeriod - the system waits for this amount of time before checking the drainage status of a version that has just entered the DRAINING state (default: 3 minutes; setting a very low value might cause the status to become DRAINED incorrectly).
- matching.wv.VersionDrainageStatusRefreshInterval - the interval used for checking drainage status (default: 3 minutes; lowering the value will increase load on the Visibility database).

Please see the deprecation warnings regarding earlier versions of Temporal versioning APIs.
If you used Worker Versioning in v1.27.x, you must delete all Worker Deployments (via DeleteWorkerDeployment) before upgrading to v1.28.0, then recreate them after. This is due to breaking changes between v1.27.2 and v1.28.0.
If you already upgraded or need help, ask in #safe-deploys on Community Slack.
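The limits above are ordinary dynamic config entries. As a sketch (the namespace name and the raised value are hypothetical), raising the deployment cap for a single namespace might look like:

```yaml
# Hypothetical example: raise the deployment cap for one namespace only;
# all other namespaces keep the default of 100.
matching.maxDeployments:
  - value: 1000
    constraints:
      namespace: "my-namespace"
```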
Simple priority allows you to control the execution order of workflows, activities, and child workflows based on assigned priority values within a single task queue. You can select a priority level in the integer range [1,5]. A lower value implies higher priority.
Priority can be attached to workflows and activities using the latest versions of most SDKs. In order for priority to take effect on the server, you need to switch to the new implementation of the matching service: set the dynamic config matching.useNewMatcher to true either on specific task queues, namespaces, or globally. After the new matcher has been turned on for a task queue, turning it off will cause tasks with non-default priority to be temporarily lost until it’s turned on again.
When the setting is changed, the implementation will be switched immediately, which may cause a momentary disruption in task dispatch.
Besides enabling priority, the new matcher will have a different profile of persistence operations, and slightly different behavior with task forwarding and some other edge cases. If you see performance regressions or unexpected behavior with the new matcher, please let us know.
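For illustration only (the namespace and task queue names are hypothetical), a gradual per-task-queue rollout of the new matcher via dynamic config might look like:

```yaml
# Hypothetical sketch: enable the new matcher for one task queue first,
# then widen the constraint once it looks healthy.
matching.useNewMatcher:
  - value: true
    constraints:
      namespace: "orders"
      taskQueueName: "order-processing"
```

Remember that once the new matcher has handled tasks for a queue, turning it back off temporarily loses tasks with non-default priority, so prefer widening the constraint over toggling it.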
See more usage details here: Temporal - Task Queue Priority Guide (Pre-Release)
- PAUSED and PAUSE_REQUESTED were added to the activity state. This makes it possible to distinguish the situation where a pause signal has been received but the activity is still running on the worker.
- An ActivityPause/ActivityReset flag was added to the heartbeat response. This notifies workers about the activity state.
- ResetActivity and UpdateActivityOptions commands were added.

The Temporal server now supports resetting workflows even when the resulting new run has pending child workflows. Previously, such reset attempts were rejected if the new run appeared to have one or more pending child workflows.
With this update, handling resets is more flexible and reliable for workflows with children in various states.
Behavior Matrix
| Reset Point | Child Running | Child Completed/Failed/Terminated |
|---|---|---|
| Before child started | New run does not reference the child | No reference to child |
| Between child start and completion | Child is reconnected to the new parent run | Completed/Failed/Terminated event immediately replayed in the new parent run |
| After child completed | N/A | All child events are included in the new run. |
Description: A new dynamic configuration history.sendRawHistoryBetweenInternalServices has been introduced. When enabled, the history service sends raw history event blobs to the frontend and matching services during GetWorkflowExecutionHistory and RecordWorkflowTaskStarted API calls. This reduces CPU usage on the history service by eliminating event decoding. The frontend service handles decoding before responding to clients, ensuring no impact on SDKs or other clients without adding extra CPU load to the frontend.
This configuration should only be enabled after all services have been upgraded to version v1.28, and must be disabled before downgrading to any version earlier than v1.28.
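A sketch of the corresponding dynamic config entry (enable only after every service in the cluster runs v1.28+):

```yaml
# Enable only once all services run v1.28 or later;
# disable again before any downgrade below v1.28.
history.sendRawHistoryBetweenInternalServices:
  - value: true
```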
- The executions table for Cassandra-backed clusters contains additional columns (chasm_node_map).
- A new table, chasm_node_maps, has been added to SQL-backed clusters.
- ListSearchAttributesResponse.StorageSchema is no longer being populated.
- DescribeTaskQueue contains a new Stats field that contains the task queue’s stats.
- statsd.framework: opentelemetry - set this to try it out and report any issues.

Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.28.0)
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.27.2...v1.28.0
This patch release fixes a few minor Worker Deployment and Nexus bugs.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.27.2)
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.27.1...v1.27.2
Before upgrading your Temporal Cluster to v1.27.1, you must upgrade your core schema if you are using MySQL or PostgreSQL, and your visibility schema to the following:
Please see our upgrade documentation for the necessary steps to upgrade your schemas.
NOTE: The upgrade to MySQL and PostgreSQL Visibility schemas may come with temporary performance degradation because it creates a new column, _version, with default values. Consider performing the schema upgrades when load is low. There are protective mechanisms in place to account for timeouts from the VisibilityStore.
Visibility Scan APIs have been deprecated in favor of List Workflow APIs. Visibility Scan APIs will be removed in a future version. Migration to List Workflow APIs will be required in future versions.
Nexus is now GA with a stable server API.
Read more here on how to disable Nexus or how to operate it here.
Notable features and bug fixes since v1.26.2:
- StartWorkflowExecutionRequest.OnConflictOptions with WorkflowIdConflictPolicy of USE_EXISTING. This allows multiple operations to be backed by the same workflow.

The following APIs are added for Worker Versioning. All APIs are experimental and not yet recommended for production usage. You need to set the dynamic configs system.enableDeployments and system.enableDeploymentVersions in order to use them.
The following APIs are now deprecated and replaced by above:
Changes to the Activity Commands — a set of APIs designed to resolve issues related to activity execution. The following APIs were updated:
- UpdateActivityOptionsById was renamed to UpdateActivityOptions. This API can be used by the client to update the options of an activity while the activity is running.
- PauseActivityById was renamed to PauseActivity. With this API, if the Activity is not currently running (e.g. because it previously failed and is waiting for the next retry), it will not be run again until it is unpaused.
  - An activity_type parameter was added. If this parameter is set, all running activities of this type will be paused.
- UnpauseActivityById was renamed to UnpauseActivity. With this API, clients can re-schedule a previously-paused Activity for execution.
  - The no_wait parameter was removed.
  - An activity_type parameter was added. If this parameter is set, all paused activities of this type will be unpaused.
  - A match-all parameter was added. If this parameter is set, all paused activities will be unpaused.
  - A jitter parameter was added. If set, the activity will start at a random time within the specified jitter duration.
- ResetActivityById was renamed to ResetActivity. With this API, clients can reset the execution of an activity, specified by its ID or type.
  - The no_wait parameter was removed.
  - An activity_type parameter was added. If this parameter is provided, all paused activities of this type will be unpaused.
  - A keep_paused parameter was added. If this parameter is provided, all paused activities will stay paused. Otherwise, they will be unpaused.
  - A jitter parameter was added. If set, the activity will start at a random time within the specified jitter duration.
- BatchOperationUnpauseActivities was added.

In this release we have added the functionality to reset a workflow with a pending child.
Prior to this release, resetting to a point between child workflow initiated and child workflow completed was not supported (the reset operation would fail). In the current release the reset operation allows this case, and the parent's behavior after reset is to reconnect to the running child. The new run of the parent will receive the child’s completion event and result (if any).
The feature is gated behind a per-namespace boolean dynamic configuration, AllowResetWithPendingChildren, which is enabled by default for all namespaces.
Note: If you are using the Go SDK and rely on the SDK to generate child workflow IDs, you need to update it to the latest version to use this feature. Other SDKs don’t need an upgrade to use this feature.
Delete workflow executions RPS is now dynamic. Previously, frontend.deleteNamespaceDeleteActivityRPS was read only once when namespace deletion started, and subsequent changes to this dynamic config didn't affect the ongoing deletion. This was inconvenient for large namespaces since the default RPS is only 100. Now the RPS can be adjusted on the fly.
Please note: Since deletion of Workflow Execution is an asynchronous process, this RPS controls the rate at which delete execution tasks are created. Decreasing this value (for example, from 1000 to 10) won't immediately slow down the process, as existing tasks in the transfer queue must be processed first.
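For illustration (the value shown is an example), adjusting the rate on a live cluster is just a dynamic config change:

```yaml
# Example: lower the delete-executions task creation rate from the default
# of 100 to 10; tasks already in the transfer queue still drain first.
frontend.deleteNamespaceDeleteActivityRPS:
  - value: 10
```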
DeleteExecutionsWorkflow now supports a stats query to track its progress. Since this Workflow can run for hours after a namespace is marked as deleted, it was previously difficult to monitor how many Workflow Executions remained. The new query handler provides current statistics about total and remaining executions.
Metrics and logging have been enhanced for better actionability. Key improvements include:
- ReclaimResources workflow failures, using the metrics reclaim_resources_namespace_delete_failure and reclaim_resources_delete_executions_failure.
- DeleteExecutionsWorkflow progress, using the metrics delete_executions_success and delete_executions_failure.

Business-critical namespaces can be protected from deletion. Use dynamic config to list namespaces which are not deletable:
```yaml
worker.protectedNamespaces:
  - value:
      - critical_namespace
      - just_very_important_namespace
```
Sleep duration in ReclaimResourcesWorkflow now supports dynamic changes. If a namespace delete delay was mistakenly set too long, you can now modify it after the Workflow has started. Use this command to update the delay to a new value (10 hours in this example):
```shell
temporal workflow update --namespace temporal-system --name update_namespace_delete_delay --workflow-id temporal-sys-reclaim-namespace-resources-workflow/default-deleted-93f5e --input '"10h"'
```
Or use this command to remove the delay entirely:
```shell
temporal workflow update --namespace temporal-system --name update_namespace_delete_delay --workflow-id temporal-sys-reclaim-namespace-resources-workflow/default-deleted-93f5e --input '"0"'
```
Please note: The new delay starts from when it is set, not from when the original timer was created. For example, if the Workflow has already slept for 2 hours and the timer is updated to 10h, it will sleep for another 10 hours, not 8.
- FutureActionTimes now accounts for a schedule’s update time and RemainingActions.
- ScheduleActionResult now includes a WorkflowExecutionStatus field, providing an eventually-consistent view of a workflow’s status within List results.

All SQL stores used for Visibility had a rare possibility of performing updates to a workflow's visibility state out of order. This could result in Workflows occasionally appearing to have out-of-date state.
This has been fixed by adding a new column _version to all SQL store implementations. Queries to update Visibility data now ensure the _version advances before performing any writes.
Updates to Visibility data are prepared by checking the actual Workflow state. Therefore, when a write is rejected for being out-of-order, we know the VisibilityStore already contains an equal or more up-to-date state, so we drop out-of-order updates silently.
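The guard can be sketched in a few lines of ordinary Go. This is an illustrative in-memory model of the version check, not the server's actual SQL implementation; applyVisibilityUpdate is a hypothetical helper:

```go
package main

import "fmt"

// applyVisibilityUpdate models the out-of-order write guard: a write is
// applied only if its version advances past what the store already holds.
// (Hypothetical stand-in for the SQL _version comparison.)
func applyVisibilityUpdate(stored map[string]int64, workflowID string, version int64) bool {
    if v, ok := stored[workflowID]; ok && version <= v {
        // The store already holds an equal or newer state; drop silently.
        return false
    }
    stored[workflowID] = version
    return true
}

func main() {
    stored := map[string]int64{"wf-1": 3}
    fmt.Println(applyVisibilityUpdate(stored, "wf-1", 2)) // stale write: dropped
    fmt.Println(applyVisibilityUpdate(stored, "wf-1", 5)) // newer write: applied
}
```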
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.27.1)
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.26.2...v1.27.1
[!CAUTION] Please DO NOT use this release if you are using SQL-based (PostgreSQL, MySQL, or SQLite) persistence. Upgrade directly to v1.27.1 instead. This release made changes to the SQL schemas, and there's a bug when upgrading from an older version.
Before upgrading your Temporal Cluster to v1.27.0, you must upgrade your core schema if you are using MySQL or PostgreSQL, and your visibility schema to the following:
Please see our upgrade documentation for the necessary steps to upgrade your schemas.
NOTE: The upgrade to MySQL and PostgreSQL Visibility schemas may come with temporary performance degradation because it creates a new column, _version, with default values. Consider performing the schema upgrades when load is low. There are protective mechanisms in place to account for timeouts from the VisibilityStore.
Visibility Scan APIs have been deprecated in favor of List Workflow APIs. Visibility Scan APIs will be removed in a future version. Migration to List Workflow APIs will be required in future versions.
Nexus is now GA with a stable server API.
Read more here on how to disable Nexus or how to operate it here.
Notable features and bug fixes since v1.26.2:
- StartWorkflowExecutionRequest.OnConflictOptions with WorkflowIdConflictPolicy of USE_EXISTING. This allows multiple operations to be backed by the same workflow.

The following APIs are added for Worker Versioning. All APIs are experimental and not yet recommended for production usage. You need to set the dynamic configs system.enableDeployments and system.enableDeploymentVersions in order to use them.
The following APIs are now deprecated and replaced by above:
Changes to the Activity Commands — a set of APIs designed to resolve issues related to activity execution. The following APIs were updated:
- UpdateActivityOptionsById was renamed to UpdateActivityOptions. This API can be used by the client to update the options of an activity while the activity is running.
- PauseActivityById was renamed to PauseActivity. With this API, if the Activity is not currently running (e.g. because it previously failed and is waiting for the next retry), it will not be run again until it is unpaused.
  - An activity_type parameter was added. If this parameter is set, all running activities of this type will be paused.
- UnpauseActivityById was renamed to UnpauseActivity. With this API, clients can re-schedule a previously-paused Activity for execution.
  - The no_wait parameter was removed.
  - An activity_type parameter was added. If this parameter is set, all paused activities of this type will be unpaused.
  - A match-all parameter was added. If this parameter is set, all paused activities will be unpaused.
  - A jitter parameter was added. If set, the activity will start at a random time within the specified jitter duration.
- ResetActivityById was renamed to ResetActivity. With this API, clients can unpause the execution of a previously paused activity, specified by its ID.
  - The no_wait parameter was removed.
  - An activity_type parameter was added. If this parameter is provided, all paused activities of this type will be unpaused.
  - A keep_paused parameter was added. If this parameter is provided, all paused activities will stay paused. Otherwise, they will be unpaused.
  - A jitter parameter was added. If set, the activity will start at a random time within the specified jitter duration.
- BatchOperationUnpauseActivities was added.

In this release we have added the functionality to reset a workflow with a pending child.
Prior to this release, resetting to a point between child workflow initiated and child workflow completed was not supported (the reset operation would fail). In the current release the reset operation allows this case, and the parent's behavior after reset is to reconnect to the running child. The new run of the parent will receive the child’s completion event and result (if any).
The feature is gated behind a per-namespace boolean dynamic configuration, AllowResetWithPendingChildren, which is enabled by default for all namespaces.
Note: If you are using the Go SDK and rely on the SDK to generate child workflow IDs, you need to update it to the latest version to use this feature. Other SDKs don’t need an upgrade to use this feature.
Delete workflow executions RPS is now dynamic. Previously, frontend.deleteNamespaceDeleteActivityRPS was read only once when namespace deletion started, and subsequent changes to this dynamic config didn't affect the ongoing deletion. This was inconvenient for large namespaces since the default RPS is only 100. Now the RPS can be adjusted on the fly.
Please note: Since deletion of Workflow Execution is an asynchronous process, this RPS controls the rate at which delete execution tasks are created. Decreasing this value (for example, from 1000 to 10) won't immediately slow down the process, as existing tasks in the transfer queue must be processed first.
DeleteExecutionsWorkflow now supports a stats query to track its progress. Since this Workflow can run for hours after a namespace is marked as deleted, it was previously difficult to monitor how many Workflow Executions remained. The new query handler provides current statistics about total and remaining executions.
Metrics and logging have been enhanced for better actionability. Key improvements include:
- ReclaimResources workflow failures, using the metrics reclaim_resources_namespace_delete_failure and reclaim_resources_delete_executions_failure.
- DeleteExecutionsWorkflow progress, using the metrics delete_executions_success and delete_executions_failure.

Business-critical namespaces can be protected from deletion. Use dynamic config to list namespaces which are not deletable:
```yaml
worker.protectedNamespaces:
  - value:
      - critical_namespace
      - just_very_important_namespace
```
Sleep duration in ReclaimResourcesWorkflow now supports dynamic changes. If a namespace delete delay was mistakenly set too long, you can now modify it after the Workflow has started. Use this command to update the delay to a new value (10 hours in this example):
```shell
temporal workflow update --namespace temporal-system --name update_namespace_delete_delay --workflow-id temporal-sys-reclaim-namespace-resources-workflow/default-deleted-93f5e --input '"10h"'
```
Or use this command to remove the delay entirely:
```shell
temporal workflow update --namespace temporal-system --name update_namespace_delete_delay --workflow-id temporal-sys-reclaim-namespace-resources-workflow/default-deleted-93f5e --input '"0"'
```
Please note: The new delay starts from when it is set, not from when the original timer was created. For example, if the Workflow has already slept for 2 hours and the timer is updated to 10h, it will sleep for another 10 hours, not 8.
- FutureActionTimes now accounts for a schedule’s update time and RemainingActions.
- ScheduleActionResult now includes a WorkflowExecutionStatus field, providing an eventually-consistent view of a workflow’s status within List results.

All SQL stores used for Visibility had a rare possibility of performing updates to a workflow's visibility state out of order. This could result in Workflows occasionally appearing to have out-of-date state.
This has been fixed by adding a new column _version to all SQL store implementations. Queries to update Visibility data now ensure the _version advances before performing any writes.
Updates to Visibility data are prepared by checking the actual Workflow state. Therefore, when a write is rejected for being out-of-order, we know the VisibilityStore already contains an equal or more up-to-date state, so we drop out-of-order updates silently.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.27.0)
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.26.2...v1.27.0
A TemporalPauseInfo column was added to Visibility. TemporalPauseInfo contains a search attribute related to paused entities in Temporal workflows.
Before upgrading your Temporal Cluster to v1.26.2, you must upgrade your visibility schemas to the following:
PR: https://github.com/temporalio/temporal/pull/6655
Description: Extended metrics.Handler interface with a new StartBatch method. StartBatch returns a BatchHandler that can be used to send a sequence of metrics as a single event when Close() is called on the batch. All provided metric handlers have been updated with the new interface and simply send metrics individually.
💥 BREAKING CHANGE 💥 If you provide a custom metrics handler with temporal.WithCustomMetricsHandler(metricsHandler) you will need to implement StartBatch() on that handler. See the tally metrics handler for an example of this.
The following EXPERIMENTAL Versioning APIs are added in this release:
- DescribeDeployment
- ListDeployments
- GetCurrentDeployment
- SetCurrentDeployment
- GetDeploymentReachability
- UpdateWorkflowExecutionOptions API (and its batch mode) for setting a versioning override for executions.

Documentation is not available at this point. Do not use the above APIs in production.
To enable these APIs, the following dynamic configs should be enabled: system.enableDeployments and frontend.workerVersioningWorkflowAPIs.
Description:
Workflow Update enables a gRPC client of a Workflow Execution to issue requests to that Workflow Execution and receive a response. These requests are delivered to and processed by a client-specified Workflow Execution. Updates are differentiated from Queries in that the processing of an Update is allowed to modify the state of a Workflow Execution. Updates are different from Signals in that an Update returns a response.
Any gRPC client can invoke Updates via the WorkflowService.UpdateWorkflowExecution. Additionally, past Update requests can be observed via the WorkflowService.PollWorkflowExecutionUpdate API. The wait stage option determines whether they respond once the Update was accepted or completed.
Note that an Update only becomes durable once it has been accepted; until then, it will not appear in the Workflow history. SDKs will automatically retry to ensure Update requests complete.
The execution and retention of Updates is configured via two optional dynamic configuration values:
- history.maxTotalUpdates controls the total number of Updates that a single Workflow Execution can support. The default is 2000.
- history.maxInFlightUpdates controls the number of Updates that can be “in-flight” (that is, concurrently executing, not having completed) for a given Workflow Execution. The default is 10.

Since the 1.25 release, several minor bugs have been fixed.
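For reference, overriding both limits in the dynamic config might look like this (the values shown are examples, not recommendations):

```yaml
# Example overrides; the defaults are 2000 and 10 respectively.
history.maxTotalUpdates:
  - value: 5000
history.maxInFlightUpdates:
  - value: 20
```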
Update-With-Start sends a Workflow Update that checks whether an already-running Workflow with that ID exists. If it does, the Update is processed. If not, it starts a new Workflow Execution with the supplied ID. When starting a new Workflow, it immediately processes the Update.
Update-With-Start is great for latency-sensitive use cases.
Description:
We introduce the Activity API — a set of APIs designed to resolve issues related to activity execution. The following APIs were introduced:
--no-wait.

Documentation is not available at this point. Do not use the above APIs in production.
Nexus is still in public preview for this release, but has now been enabled by default.
Read more here on how to disable Nexus or how to operate it here.
Notable features and bug fixes since v1.25.2:
Primary engineer: @prathyushpv
Fixed a "table not found" bug in SQLite.
Durations with mismatched seconds and nanoseconds signs will now fail validation and return an InvalidArgument error.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.26.2)
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.25.2...v1.26.2
This patch release fixes a few minor Nexus and Update-with-Start related bugs.
Full Changelog: https://github.com/temporalio/temporal/compare/v1.25.1...v1.25.2
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.25.2)
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools
This release fixes one bug:
It also fixes a small bug in the Makefile which affects users building from source.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.24.3)
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.24.2...v1.24.3
This patch release fixes a couple of bugs:
It also adds support for Nexus links.
Full Changelog: https://github.com/temporalio/temporal/compare/v1.25.0...v1.25.1
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.25.1)
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools
Before upgrading your Temporal Cluster to v1.25.0, you must upgrade your core and visibility schemas to the following:
Please see our upgrade documentation for the necessary steps to upgrade your schemas.
Nexus RPC is an open-source service framework for arbitrary-length operations whose lifetime may extend beyond a traditional RPC. It is an underpinning connecting durable executions within and across namespaces, clusters and regions – with an API contract designed with multi-team collaboration in mind. A service can be exposed as a set of sync or async Nexus operations – the latter provides an operation identifier and a uniform interface to get the status of an operation or its result, receive a completion callback, or cancel the operation.
Temporal uses the Nexus RPC protocol to allow calling across namespace and cluster boundaries. The Go SDK Nexus proposal explains the user experience and shows sequence diagrams from an external perspective.
Read more here on how to enable Nexus and how to operate it here.
Workflow Update enables a gRPC client of a Workflow Execution to issue requests to that Workflow Execution and receive a response. These requests are delivered to and processed by a client-specified Workflow Execution. Updates are differentiated from Queries in that the processing of an Update is allowed to modify the state of a Workflow Execution. Updates are different from Signals in that an Update returns a response.
Any gRPC client can invoke Updates via the WorkflowService.UpdateWorkflowExecution. Additionally, past Update requests can be observed via the WorkflowService.PollWorkflowExecutionUpdate API. The wait stage option determines whether they respond once the Update was accepted or completed.
Note that an Update only becomes durable once it has been accepted; until then, it will not appear in the Workflow history. SDKs will automatically retry to ensure Update requests complete.
The execution and retention of Updates is configured via two optional dynamic configuration values:
- history.maxTotalUpdates controls the total number of Updates that a single Workflow Execution can support. The default is 2000.
- history.maxInFlightUpdates controls the number of Updates that can be “in-flight” (that is, concurrently executing, not having completed) for a given Workflow Execution. The default is 10.

Since the 1.21 release, the feature has been heavily tested, and several bug fixes as well as performance optimizations have been made.
You can find more information at this link.
The MutableState cache has been updated to operate as a host-level cache by default. Previously, this cache was managed at the shard level, with each shard cache holding 512 MutableState entries. Now, the host-level cache, enabled by default (history.enableHostHistoryCache = true), will be shared across all shards on a given host.
The size of the host-level cache is controlled by the history.hostLevelCacheMaxSize configuration, which is set to 128,000 entries by default. This change may impact the memory usage of the history service, but it can be adjusted by modifying the history.hostLevelCacheMaxSize value.
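For example, keeping the host-level cache enabled and raising its size would look roughly like this in dynamic config (the larger size is illustrative, not a recommendation):

```yaml
history.enableHostHistoryCache:
  - value: true
history.hostLevelCacheMaxSize:
  - value: 256000  # illustrative; the default is 128000
```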
Enhanced the CLI to support query filtering for the schedule list command. The --query or -q (string) option allows filtering of results using a specified list filter.
Provide stats for Task Queue backlogs to be used for worker scaling decisions.
Use the DescribeTaskQueue API in enhanced mode (with report_stats=true) to get the following info about the Task Queue:
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.25.0)
Server With Auto Setup ([what is Auto-Setup?](https://docs.temporal.io/blog/auto-setup)) (use the tag 1.25.0)
Admin-Tools (use the tag 1.25.0-tctl-1.18.1-cli-1.0.0)
Full Changelog: https://github.com/temporalio/temporal/compare/v1.24.0...v1.25.0
Full Changelog: https://github.com/temporalio/temporal/compare/v1.24.1...v1.24.2
If you are using SQL-based visibility, before upgrading your Temporal Cluster to v1.24.0, you must upgrade your visibility schemas to the following:
Please see our upgrade documentation for the necessary steps to upgrade your schemas.
This release contains a schema fix for SQL-based visibility introduced in 1.24.0. For the full set of new features, please check the v1.24.0 release notes.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.24.1)
Server With Auto Setup (what is Auto-Setup?) (use the tag 1.24.1)
Admin-Tools (use the tag 1.24.1-tctl-1.18.1-cli-0.12.0)
Full Changelog: https://github.com/temporalio/temporal/compare/v1.24.0...v1.24.1
[!CAUTION]
This release introduces a bug in SQL visibility. Please DO NOT use it if you are using SQL-based (PostgreSQL, MySQL, or SQLite) visibility. Elasticsearch-based visibility is not affected. Update directly to v1.24.1.
Before upgrading your Temporal Cluster to v1.24.0, you must upgrade your core and visibility schemas to the following:
Please see our upgrade documentation for the necessary steps to upgrade your schemas.
As planned, standard visibility is no longer supported in this version. Please upgrade to advanced visibility, and update the config keys used to set up visibility, before upgrading to this version. Refer to the v1.20.0 release notes for upgrade instructions, and also check the v1.21.0 release notes for config key changes.
Note that you also need to update the plugin name for the main store: if you are using the mysql plugin name for the main store, change it to mysql8. Similarly, if it's postgres, change it to postgres12.
The following changes were made to Worker Versioning APIs:
- Deprecated UpdateWorkerBuildIdCompatibility in favor of the new UpdateWorkerVersioningRules API.
- Deprecated GetWorkerBuildIdCompatibility in favor of the new GetWorkerVersioningRules API.
- Deprecated GetWorkerTaskReachability in favor of DescribeTaskQueue enhanced mode (api_mode=ENHANCED).

Together with the old APIs, the Version Set concept is also deprecated and replaced with “versioning rules”, which are more powerful and flexible. More details can be found in https://github.com/temporalio/api/blob/master/temporal/api/taskqueue/v1/message.proto#L153.
To use these experimental APIs, you need to enable the following configs:
- frontend.workerVersioningRuleAPIs
- frontend.workerVersioningWorkflowAPIs

We removed the frontend.namespaceBurst config and added frontend.namespaceBurstRatio. Similarly, frontend.namespaceBurst.visibility and frontend.namespaceBurst.namespaceReplicationInducingAPIs were replaced with frontend.namespaceBurstRatio.visibility and frontend.namespaceBurstRatio.namespaceReplicationInducingAPIs.
The old values specified the burst rate as a number of requests per second. The new values specify burst as a ratio of their respective RPS limit. This ratio is applied to the RPS limit calculated from global and per-instance rate limits.
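As a dynamic config sketch, enabling the experimental Worker Versioning APIs and setting a burst ratio might look like this (the ratio value is illustrative):

```yaml
frontend.workerVersioningRuleAPIs:
  - value: true
frontend.workerVersioningWorkflowAPIs:
  - value: true
frontend.namespaceBurstRatio:
  - value: 2.0  # illustrative: burst = 2x the calculated RPS limit
```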
We added two new system search attributes: RootWorkflowId and RootRunId. If you have previously created custom search attributes with one of these names, attempts to set them will start to fail. We suggest updating your workflows to not set those search attributes, deleting those search attributes, and then upgrading Temporal to this version. Alternatively, you can set the dynamic config system.supressErrorSetSystemSearchAttribute: true. When this dynamic config is set to true, your workflow will not fail when trying to set a value on a system search attribute; the input for those system search attributes will be ignored instead.
OpenAPI v2 docs are served at /api/v1/swagger.json while v3 is at /api/v1/openapi.yaml when our HTTP API is enabled.
Operators can now configure how often we update shard info (tracking how many tasks have been acked, etc.). This improves recovery speed by persisting shard data more frequently.
This can be configured through the following dynamic config values:
- history.shardUpdateMinTasksCompleted - the minimum number of tasks that must be completed (across all queues) before the shard info can be updated
- history.shardUpdateMinInterval - the minimum amount of time between shard info updates, unless shardUpdateMinTasksCompleted tasks have been acked

Note that once history.shardUpdateMinInterval amount of time has passed, we'll update the shard info regardless of the number of tasks completed.
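A dynamic config sketch combining both settings (the values are illustrative, not recommendations):

```yaml
history.shardUpdateMinTasksCompleted:
  - value: 100   # illustrative
history.shardUpdateMinInterval:
  - value: "1m"  # illustrative
```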
We now interpolate parameters into queries client-side for MySQL main databases but not visibility.
When interpolateParams is false (the default) the driver will prepare parameterized statements before executing them, meaning we need two round-trips to the database for each query. By setting interpolateParams to true the DB driver will handle interpolation and send the query just once to the database, halving the number of round trips necessary. This should improve the performance of all Temporal deploys using MySQL.
Support for enabling OpenTelemetry for tracing gRPC requests via environment variables. See develop/docs/tracing.md for details.
Various improvements were made to task queue handover when adding/removing/restarting matching nodes. This should improve tail latency for task dispatch during those situations. To enable the improvements, operators should set the dynamic config matching.alignMembershipChange to a value like 10s after fully deploying v1.24 to the entire cluster. This may become the default in future versions.
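As a dynamic config sketch, using the value suggested above:

```yaml
matching.alignMembershipChange:
  - value: "10s"
```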
When we migrated Temporal from the deprecated gogoproto fork of Google’s protobuf library to the official version in v1.23, we disabled protobuf’s default utf-8 validation to ensure a smooth deployment, since gogoproto did not validate fields for utf-8, and turning on validation immediately would have broken applications that accidentally used invalid utf-8.
This was a temporary measure and we will eventually re-enable validation. As the first step, we’ve added tools to detect and warn about invalid utf-8 without breaking applications. There are two sets of dynamic config settings to use.
The sample settings take a floating point value between 0.0 and 1.0 (default 0.0) and control what proportion of RPC requests, RPC responses, or data read from persistence is validated for utf-8 in strings. If invalid utf-8 is found, warnings are logged and the counter metric utf8_validation_errors is incremented.
The fail settings (boolean, default false) control whether a validation error is turned into an RPC failure or data corruption error.
- system.validateUTF8.sample.rpcRequest
- system.validateUTF8.sample.rpcResponse
- system.validateUTF8.sample.persistence
- system.validateUTF8.fail.rpcRequest
- system.validateUTF8.fail.rpcResponse
- system.validateUTF8.fail.persistence

If you think your application may be using invalid utf-8, we suggest turning on the sample settings without the fail settings and running for a while. In a future version, validation and errors will be turned on by default (effectively, sample set to 1.0 and fail set to true).
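A dynamic config sketch for the first phase described above: sample everything for warnings and metrics while leaving the fail settings at their default (false):

```yaml
system.validateUTF8.sample.rpcRequest:
  - value: 1.0
system.validateUTF8.sample.rpcResponse:
  - value: 1.0
system.validateUTF8.sample.persistence:
  - value: 1.0
# fail settings intentionally left at the default (false) while observing warnings
```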
admin-tools docker image versioning
We separated the admin-tools docker image release process. The version tag now includes the versions of the tctl (deprecated but still supported CLI) and temporal (modern CLI) binaries. This image is released whenever a new version of any of these components is released. The current latest tag is 1.24.0-tctl-1.18.1-cli-0.12.0.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.24.0)
Server With Auto Setup (what is Auto-Setup?) (use the tag 1.24.0)
Admin-Tools (use the tag 1.24.0-tctl-1.18.1-cli-0.12.0)
Full Changelog: https://github.com/temporalio/temporal/compare/v1.23.1...v1.24.0
2024-04-30 - fad6bdc0e - Bump Server version to 1.23.1
2024-04-26 - 99b6e0c38 - Fix schedule workflow to CAN after signals (#5799)
2024-04-26 - 1bb03b730 - Update dependencies (#5789)
2024-04-26 - 9701ef095 - Recalculate schedule times from previous action on update (#5381)
2024-04-26 - dd4323a02 - Handle data corruption in history resend (#5398)
2024-04-26 - 9b1981ce4 - Do schedule backfills incrementally (#5344)
2024-04-26 - a520df228 - Use proto encoding for scheduler workflow next time cache (#5277)
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.23.1)
Server With Auto Setup (what is Auto-Setup?) (use the tag 1.23.1)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.23.0...v1.23.1
This release mitigates a problem where invalid UTF-8 data could be supplied to the history service, causing a denial of service.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.22.7)
Server With Auto Setup (what is Auto-Setup?) (use the tag 1.22.7)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.22.6...v1.22.7
This release mitigates a problem where invalid UTF-8 data could be supplied to the history service, causing a denial of service.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.21.6)
Server With Auto Setup (what is Auto-Setup?) (use the tag 1.21.6)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.21.5...v1.21.6
This release mitigates a problem where invalid UTF-8 data could be supplied to the history service, causing a denial of service.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.20.5)
Server With Auto Setup (what is Auto-Setup?) (use the tag 1.20.5)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.20.4...v1.20.5
github.com/gogo/protobuf has been replaced with google.golang.org/protobuf
We've fully replaced the use of gogo/protobuf with the official google protobuf runtime. This has both developmental and operational impacts: prior to Server version v1.23.0, our protobuf code generator allowed invalid UTF-8 data to be stored as proto strings. This isn't allowed by the proto3 spec, so if you're running a custom-built Temporal Server and think some tenant may store arbitrary binary data in our strings, you should set -tags protolegacy when compiling the server. If you use our Makefile, this is already done.
If you don't and see an error like grpc: error unmarshalling request: string field contains invalid UTF-8, then you will need to enable this when building the server. If you're unsure, you should specify it anyway, as there's no harm in doing so unless you relied on the protobuf compiler to ensure all strings were valid UTF-8.
Developers using our protobuf-generated code will notice that:
- time.Time in proto structs will now be [timestamppb.Timestamp](https://pkg.go.dev/google.golang.org/protobuf@v1.31.0/types/known/timestamppb#section-documentation)
- time.Duration will now be [durationpb.Duration](https://pkg.go.dev/google.golang.org/protobuf/types/known/durationpb)
- helpers for working with the new types are available in [go.temporal.io/api/temporalproto](https://pkg.go.dev/go.temporal.io/api/temporalproto)
- proto structs can no longer be reliably compared with reflect.DeepEqual or anything that uses it. This includes testify and mock equality testers!

If you need to compare proto structs with reflect.DeepEqual for any reason, you can use go.temporal.io/api/temporalproto.DeepEqual instead. If you use testify require/assert-compatible checkers, you can use the go.temporal.io/server/common/testing/protorequire and go.temporal.io/server/common/testing/protoassert packages. For mock matchers, see go.temporal.io/server/common/testing/protomock.

New System Search Attributes
We added two new system search attributes: ParentWorkflowId and ParentRunId. If you have previously created custom search attributes with one of these names, attempts to set them will start to fail. We suggest updating your workflows to not set those search attributes, deleting those search attributes, and then upgrading Temporal to this version.
Alternatively, you can set the dynamic config system.supressErrorSetSystemSearchAttribute to true. When this dynamic config is set, values for system search attributes will be ignored instead of causing your workflow to fail. Please use this only as a temporary workaround, because it could hide real issues in your workflows.
Before upgrading your Temporal Cluster to v1.23.0, you must upgrade your core and visibility schemas to the following:
Please see our upgrade documentation for the necessary steps to upgrade your schemas.
- Replaced individual command metrics (complete_workflow_command, continue_as_new_command, etc.) with a single metric called command, which has a tag “commandType” describing the specific command type (see https://github.com/temporalio/temporal/pull/4995).
- The LIKE operator will no longer be supported in v1.24.0. It never did what it was meant to do, and only added confusing behavior when used with Elasticsearch.

For situations where an operator wants to handle a bad deployment using workflow reset, the batch reset operation can now reset to before the first workflow task processed by a specific build id. This is based on reset points that are created when the build id changes between workflow tasks. Note that this also applies across continue-as-new.
This operation is not currently supported by a released version of the CLI, but you can use it through the gRPC API directly, e.g. using the Go SDK:
import (
	"github.com/google/uuid"
	batch "go.temporal.io/api/batch/v1"
	commonpb "go.temporal.io/api/common/v1"
	enumspb "go.temporal.io/api/enums/v1"
	"go.temporal.io/api/workflowservice/v1"
)

_, err := client.WorkflowService().StartBatchOperation(ctx, &workflowservice.StartBatchOperationRequest{
	JobId:     uuid.NewString(),
	Namespace: "my-namespace",
	// Select workflows that were touched by a specific build id:
	VisibilityQuery: `BuildIds = "unversioned:bad-build"`,
	Reason:          "reset bad build",
	Operation: &workflowservice.StartBatchOperationRequest_ResetOperation{
		ResetOperation: &batch.BatchOperationReset{
			Identity: "bad build resetter",
			Options: &commonpb.ResetOptions{
				Target: &commonpb.ResetOptions_BuildId{
					BuildId: "bad-build",
				},
				ResetReapplyType: enumspb.RESET_REAPPLY_TYPE_SIGNAL,
			},
		},
	},
})
We've added a DLQ to history service to handle poison pills in transfer / timer queues and other history task queues including visibility and replication queues. You can see our operators guide for more details.
If you want tasks experiencing unexpected errors to go to the DLQ after a certain number of failures you can set the history.TaskDLQUnexpectedErrorAttempts dynamic config.
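As a dynamic config sketch (the attempt count below is illustrative, not a recommended value):

```yaml
history.TaskDLQUnexpectedErrorAttempts:
  - value: 100  # illustrative
```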
Once this feature is enabled, our task queues will be roughly FIFO.
This is disabled by default in 1.23, as we continue testing it, but we expect it to be enabled by default in 1.24. To enable it, set the dynamic config matching.backlogNegligibleAge to a short duration (e.g. 5s) instead of its current default value (10 years).
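For example, in dynamic config:

```yaml
matching.backlogNegligibleAge:
  - value: "5s"
```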
We've added the following metrics as part of this effort:
- poll_latency - a per-task-queue histogram of the duration between worker poll request and response (with or without a task), calculated from the Matching server’s perspective
- task_dispatch_latency - a histogram of schedule_to_start time from Matching's perspective, broken down by task queue and task source (backlog vs history)

We've added the ability to specify a global (cluster-level) rate limiting value for the persistence layer. You can configure it by specifying the following dynamic config values:
- frontend.persistenceGlobalMaxQPS
- history.persistenceGlobalMaxQPS
- matching.persistenceGlobalMaxQPS
- worker.persistenceGlobalMaxQPS

You can also specify this at the per-namespace level using:

- frontend.persistenceGlobalNamespaceMaxQPS
- history.persistenceGlobalNamespaceMaxQPS
- matching.persistenceGlobalNamespaceMaxQPS
- worker.persistenceGlobalNamespaceMaxQPS

Please be aware that this functionality is experimental. This global rate limiting isn’t workload-aware but shard-aware; we currently allocate this QPS to each pod based on the number of shards it owns rather than the demands of the workload, so pods with many low-workload shards will have a higher allocation of this limit than pods with fewer but more active shards. If you plan to use this, you will want to set the QPS value with some headroom (around 25%) to account for this.
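A sketch of a cluster-wide limit plus a per-namespace override in dynamic config (all values and the namespace name are illustrative):

```yaml
history.persistenceGlobalMaxQPS:
  - value: 12000  # illustrative cluster-wide limit for the history service
history.persistenceGlobalNamespaceMaxQPS:
  - value: 1000   # illustrative per-namespace limit
    constraints:
      namespace: "my-namespace"
```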
The metrics exported by the deadlock detector were renamed to use a dd_ prefix to avoid confusion with other lock latency metrics. Affected metrics: dd_cluster_metadata_lock_latency, dd_cluster_metadata_callback_lock_latency, dd_shard_controller_lock_latency, dd_shard_lock_latency, dd_namespace_registry_lock_latency.
Visibility API now supports prefix search using the keyword STARTS_WITH, e.g., WorkflowType STARTS_WITH 'hello_world'. Check the Visibility documentation for additional information on supported operators.
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag v1.23.0)
Server With Auto Setup (what is Auto-Setup?) (use the tag v1.23.0)
Admin-Tools
Full Changelog: https://github.com/temporalio/temporal/compare/v1.22.6...v1.23.0
This release mitigates a rollback problem introduced into one of our v1.23.0 release candidates. This has no impact on OSS users using official releases.
2024-02-29 - 2899920e9 - Bump Server version to 1.22.6
2024-02-29 - 1eba091c8 - Update Go SDK to handle SDKPriorityUpdateHandling flag (#5468)
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.22.6)
Server With Auto Setup (what is Auto-Setup?) (use the tag 1.22.6)
Admin-Tools
2024-02-22 - 2787da350 - Bump Server version to 1.22.5
2024-02-22 - 760ea9c09 - Bump Server version to 1.22.5-rc2
2024-02-21 - 2ea05b30d - Ensure PollActivityTaskQueueResponse.ScheduleToCloseTimeout is not nil (#5444)
2024-02-02 - d4f38c207 - Bump Server version to 1.22.5-rc1
2024-02-01 - 64fe53cb9 - Backport code to drop internal errors encountered during task processing (#5385)
2024-01-31 - 2647b3675 - Fix scheduled task rescheduling on failover (#5377)
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.22.5)
Server With Auto Setup (what is Auto-Setup?) (use the tag 1.22.5)
Admin-Tools
2024-01-12 - fb617040c - Bump Server version to 1.22.4
2024-01-12 - f2659e725 - Change auth order (#5294)
2024-01-12 - 4489f174a - Fix buildkite cassandra setup (#5263)
2024-01-08 - 6dcab7349 - Update GHA setup-go to pull version from go.mod file (#5207)
Temporal Docs Server Docker Compose Helm Chart
Server (use the tag 1.22.4)
Server With Auto Setup (what is Auto-Setup?) (use the tag 1.22.4)
Admin-Tools