Most teams orchestrate multi-step payments synchronously. An API endpoint receives a payment request. It calls a KYC check service. It calls a screening service. It executes a swap. It submits the settlement. If any step fails, the whole chain rolls back.

This works for happy-path operations on low-volume systems. It fails catastrophically at scale.

A KYC service timeout leaves the payment hanging. If the timeout is configured as 10 seconds and the service is slow, the API request blocks for the full 10 seconds before it fails. The customer retries. Now two payment workflows are in flight, both waiting for KYC. The screening service may be external. Third-party screening APIs come with SLAs, not guarantees. If an external screening service is slow or down, the entire payment flow stalls.

This is the fundamental problem with synchronous workflows. They are only as reliable as their slowest dependency, and availability compounds across the chain. If each step has 99.9% uptime and steps fail independently, a chain of 5 steps has about 99.5% uptime. A chain of 10 steps has about 99% uptime. At that point, the system is down for roughly 3.65 days a year, which leaves no room for maintenance windows or expected failures.
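The arithmetic behind those figures can be checked with a short sketch. It assumes step failures are independent, which real systems only approximate, and the function names are illustrative.

```python
def chain_availability(step_availability: float, steps: int) -> float:
    """Availability of a serial chain in which every step must succeed.

    Assumes step failures are independent, which is a simplification.
    """
    return step_availability ** steps


def downtime_days_per_year(availability: float) -> float:
    """Expected unavailable days in a 365-day year."""
    return (1.0 - availability) * 365


for n in (1, 5, 10):
    a = chain_availability(0.999, n)
    days = downtime_days_per_year(a)
    print(f"{n:>2} steps: {a:.3%} available, about {days:.2f} days down per year")
```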

Workflow Orchestration Patterns

Temporal is a workflow orchestration engine. It manages multi-step processes by specifying the sequence of steps and the failure handling for each.

A Temporal workflow for a payment first executes the KYC check (with timeout and retry policy). If KYC fails, compensation occurs (mark payment as failed, notify customer). If KYC succeeds, it executes the screening check (with timeout and retry policy). If screening fails, it compensates. If screening succeeds, it executes the swap (with timeout and retry policy). If the swap fails, it compensates (reverse any partial swap, notify customer). If the swap succeeds, it submits the settlement (with timeout and retry policy). If settlement fails, it compensates (reverse the swap). If settlement succeeds, it marks the payment as complete.

Each step is isolated. If the KYC check times out, the workflow does not hang. Instead, the timeout is caught by Temporal. A retry policy is applied. If retries are exhausted, the compensation logic runs.

Critically, each step runs asynchronously from the perspective of the API caller. The API endpoint initiates the workflow and returns immediately. The client is notified when the workflow completes.

In synchronous workflows, timeout handling requires custom code at each step. A service call is wrapped in a try-catch with timeout logic. If the timeout triggers, the code must decide whether to retry, escalate, or fail. Different services may have different retry policies.

In Temporal, timeouts are declarative. The workflow specifies a timeout for each step. If the timeout is exceeded, Temporal automatically retries according to the policy. If retries are exhausted, the workflow transitions to compensation logic. This logic is written once in the workflow definition. All invocations use the same timeout and retry behavior.
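To make "declarative" concrete, here is a minimal stdlib sketch in which the policy is plain data and a single runner enforces it. `StepPolicy`, `run_step`, and `KYC_POLICY` are hypothetical names for illustration, not Temporal's API, and the values in `KYC_POLICY` are assumed.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class StepPolicy:
    """Declared once per step type; every invocation shares it."""
    timeout_s: float
    max_attempts: int
    backoff_s: float = 0.0


class StepFailed(Exception):
    """Raised once a step has used up all of its attempts."""


def run_step(step: Callable[[], str], policy: StepPolicy) -> str:
    """Run `step` under `policy`; callers write no retry logic of their own."""
    # One worker per attempt, so a hung attempt cannot block the next one.
    pool = ThreadPoolExecutor(max_workers=policy.max_attempts)
    last_error: Optional[BaseException] = None
    try:
        for attempt in range(1, policy.max_attempts + 1):
            future = pool.submit(step)
            try:
                return future.result(timeout=policy.timeout_s)
            except Exception as exc:  # includes the timeout raised by result()
                last_error = exc
            if attempt < policy.max_attempts:
                time.sleep(policy.backoff_s)
        raise StepFailed(f"gave up after {policy.max_attempts} attempts") from last_error
    finally:
        # Abandoned attempts may still be running, so steps must be idempotent.
        pool.shutdown(wait=False)


# Assumed values for illustration only.
KYC_POLICY = StepPolicy(timeout_s=10.0, max_attempts=3, backoff_s=1.0)
```

When `StepFailed` is raised, the calling workflow would move on to compensation.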

Compensation Logic and Silent Failures

A multi-step payment that fails partway through must undo completed steps. If the swap succeeds but settlement fails, the swap must be reversed. This is compensation.

In synchronous workflows, compensation is custom code wrapping each step. If step 3 fails, we must call the undo logic for steps 2 and 1. But the undo logic must succeed, or we have a zombie transaction on the ledger. Compensation becomes nested and complex.

In Temporal, compensation is part of the workflow definition. For each action, a corresponding compensation step is defined. If every action succeeds, no compensation is executed. If an action fails, the compensations for the actions that already completed are executed in reverse order.
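The reverse-order rule is the core of the saga pattern, and a minimal sketch shows its shape. `run_saga` and the step names are hypothetical and this is not Temporal's API. A production version would also retry a compensation that itself fails.

```python
from typing import Callable, List, Tuple

Action = Callable[[], None]
Step = Tuple[str, Action, Action]  # (name, action, compensation)


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: List[Tuple[str, Action]] = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"{name} failed")
            # Only steps that finished are compensated, newest first.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated {done_name}")
            return False
        completed.append((name, compensate))
        log.append(f"{name} ok")
    return True


def fail_settlement() -> None:
    raise RuntimeError("settlement rejected")


log: List[str] = []
ok = run_saga(
    [
        ("kyc", lambda: None, lambda: None),
        ("swap", lambda: None, lambda: None),
        ("settlement", fail_settlement, lambda: None),
    ],
    log,
)
print(ok, log)
```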

A subtle failure mode emerges in synchronous REST chains when services time out or hang. A swap service is called. The connection hangs for 30 seconds and then times out. The calling code assumes the swap failed and initiates compensation (refund the swap collateral). Meanwhile, the swap service eventually times out on its end and rolls back the swap. The refund never executes. But the calling code thinks it did. Reconciliation finds a discrepancy.

Temporal prevents this by tracking the completion status of each step explicitly. The swap step either completed or did not. If it times out, Temporal does not assume the swap failed. It retries the step according to its policy, so the step itself must be safe to repeat, for example by checking whether the underlying swap was actually executed. Compensation logic must be idempotent as well. If compensation is executed twice, the second execution must have no further effect. This is the same idempotency requirement described in the idempotency keys article. Compensation steps need idempotency keys just like settlement steps.
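An idempotent compensation step can be sketched as a refund keyed by an idempotency key. The `Ledger` class and the key format below are hypothetical, and a real ledger would keep the applied keys in durable storage.

```python
class Ledger:
    """Hypothetical in-memory ledger used only for this illustration."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._applied: set[str] = set()

    def refund(self, idempotency_key: str, amount: int) -> bool:
        """Apply the refund once per key; return False for a duplicate call."""
        if idempotency_key in self._applied:
            return False  # already applied, so a retry changes nothing
        self._applied.add(idempotency_key)
        self.balance += amount
        return True


ledger = Ledger(balance=100)
key = "payment-123:swap-reversal"  # assumed key format
first = ledger.refund(key, 50)
second = ledger.refund(key, 50)  # retried compensation
print(first, second, ledger.balance)
```

Deriving the key from the payment and the step, not from the attempt, is what makes a retried compensation harmless.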

Scale and Compliance

Synchronous workflows work for small volumes. At 10 payments per second, synchronous chains are fine. At 100 payments per second, the number of hanging requests becomes material. At 1000 payments per second, synchronous orchestration breaks entirely.

The reason is that timeouts accumulate. If each step has a 30-second timeout and there are 3 steps with retries, a payment can hang for 2 to 3 minutes. If the API is handling 1000 payments per second and each payment can hold a request for 3 minutes, the number of concurrent requests is 180,000. Most servers cannot handle this.
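The 180,000 figure follows from Little's law: requests in flight equal the arrival rate multiplied by the time each request is held. The sketch below assumes two attempts per step (one retry), which is the assumption that turns three 30-second steps into a 3-minute worst case.

```python
def worst_case_hold_seconds(steps: int, timeout_s: float, attempts_per_step: int) -> float:
    """Longest time a request can be held if every attempt of every step times out."""
    return steps * timeout_s * attempts_per_step


def concurrent_requests(arrival_rate_per_s: float, hold_seconds: float) -> float:
    """Little's law: in-flight requests = arrival rate x time each one is held."""
    return arrival_rate_per_s * hold_seconds


# attempts_per_step=2 is an assumption; the text only says "with retries".
hold = worst_case_hold_seconds(steps=3, timeout_s=30, attempts_per_step=2)
in_flight = concurrent_requests(arrival_rate_per_s=1000, hold_seconds=hold)
print(f"hold time: {hold} s, concurrent requests: {in_flight:,.0f}")
```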

Asynchronous workflows decouple the API request from the payment processing. The API returns immediately. Payments are processed in the background. The number of concurrent API requests depends on how quickly the API can accept a payment, not on how long the payment takes to process.

Many compliance officers insist on synchronous execution of payment steps. The reasoning is that if a step fails, the payment should be immediately rejected and the customer notified. Asynchronous execution is seen as risky.

This reasoning is backwards. Asynchronous execution with clear failure notifications is more reliable than synchronous execution that hangs. From the customer's perspective, a notification that reliably arrives later is worth more than an immediate response that never arrives because the system is unavailable.

Temporal workflows can be configured to alert on failure. If a payment does not complete within a time window, human review is triggered. The failure is visible and addressable.
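The time-window check amounts to a periodic scan for payments that started but have not finished. The sketch below is a plain-Python illustration with hypothetical names (`PaymentRecord`, `overdue_payments`), not a Temporal feature. In practice the returned IDs would be routed to a review queue.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PaymentRecord:
    payment_id: str
    started_at: float                     # epoch seconds
    completed_at: Optional[float] = None  # None while still in flight


def overdue_payments(records: List[PaymentRecord], now: float, window_s: float) -> List[str]:
    """Return IDs of payments still incomplete after `window_s` seconds."""
    return [
        r.payment_id
        for r in records
        if r.completed_at is None and now - r.started_at > window_s
    ]


records = [
    PaymentRecord("p1", started_at=0.0, completed_at=20.0),
    PaymentRecord("p2", started_at=0.0),     # stuck
    PaymentRecord("p3", started_at=590.0),   # recent, still within the window
]
needs_review = overdue_payments(records, now=600.0, window_s=300.0)
print(needs_review)
```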

Payments at scale require asynchronous orchestration. Synchronous chains do not work. The team that builds payment infrastructure must choose between building custom async logic for each payment type and using a workflow orchestrator.

The cost of custom async logic is high. Each compensation step must be coded. Timeouts and retries must be configured. Idempotency must be ensured. Testing is complex because failure modes are distributed across services.

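Workflow orchestrators abstract this. The cost of adding a new payment type is the cost of defining the workflow steps. The marginal cost of handling failures is close to zero, because timeouts, retries, and step tracking come from the orchestrator. The cost of ensuring idempotency is lower, because the orchestrator records which steps have completed, although each step and compensation must still be written to be idempotent.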

For payment infrastructure companies, asynchronous orchestration is foundational. A payment platform that uses synchronous REST chains will hit its scale limit quickly. The rebuild to async is painful and expensive.

Asynchronous workflows are not optional for payment systems. They are inevitable.