Retries & Error Handling

Restate automatically retries failures of your agents until they succeed. But LLM calls are costly, so you might want to configure retry behavior to fit your use case and to avoid retrying errors that cannot heal. Restate distinguishes between two types of errors:

Transient errors: Temporary issues like network failures or rate limits. Restate automatically retries these until they succeed or the retry policy is exhausted.
Terminal errors: Permanent failures like invalid input or business rule violations. Restate does not retry these. The invocation fails permanently. You can catch these errors and handle them gracefully.

Retrying LLM calls

LLM API calls can suffer from transient failures (rate limits, network issues, provider outages). Restate retries failed LLM calls so your agents recover automatically.

Default behavior

The Vercel AI SDK and the Restate middleware each have their own retry layer, and they compose.The Vercel AI SDK does the first layer of retries based on what is set for maxRetries on generateText (default: 2) . Once those are exhausted, the AI SDK throws an error.Restate then takes over and retries the invocation. Each Restate retry replays the call, which goes through maxRetries Vercel AI SDK attempts again.By default, Restate’s retries follow the policy configured at the service or handler level, or otherwise the Restate server’s default policy. Restate will go through a limited set of retries with exponential backoff (see default policy), after which the invocation will be paused. This gives you time to fix the issue, and then resume the invocation.

By default, DurableRunner.run retries LLM calls according to the policy configured at the service or handler level, or otherwise the Restate server’s default policy. Restate will go through a limited set of retries with exponential backoff (see default policy), after which the invocation will be paused. This gives you time to fix the issue, and then resume the invocation.

By default, the RestatePlugin retries LLM calls according to the policy configured at the service or handler level, or otherwise the Restate server’s default policy. Restate will go through a limited set of retries with exponential backoff (see default policy), after which the invocation will be paused. This gives you time to fix the issue, and then resume the invocation.

By default, RestateAgent retries LLM calls according to the policy configured at the service or handler level, or otherwise the Restate server’s default policy. Restate will go through a limited set of retries with exponential backoff (see default policy), after which the invocation will be paused. This gives you time to fix the issue, and then resume the invocation.

When you wrap LLM calls in ctx.run(), Restate retries them according to the policy configured at the service or handler level, or otherwise the Restate server’s default policy. Restate will go through a limited set of retries with exponential backoff (see default policy), after which the invocation will be paused. This gives you time to fix the issue, and then resume the invocation.

When you wrap LLM calls in ctx.run_typed(), Restate retries them according to the policy configured at the service or handler level, or otherwise the Restate server’s default policy. Restate will go through a limited set of retries with exponential backoff (see default policy), after which the invocation will be paused. This gives you time to fix the issue, and then resume the invocation.

Setting a retry policy

To set a separate retry policy for LLM calls, pass RunOptions to the durableCalls middleware:

errorhandling/fail-on-terminal-tool-agent.ts

const model = wrapLanguageModel({
  model: openai("gpt-5.4"),
  middleware: durableCalls(ctx, { maxRetryAttempts: 3 }),
});

If you set a maximum number of retry attempts, Restate will still go through the AI SDK’s maxRetries for each attempt, so the two limits multiply (e.g. maxRetryAttempts: 3 × maxRetries: 2 = up to 6 attempts).Once Restate’s retries are exhausted, the invocation fails with a TerminalError and won’t be retried further. You can catch the Terminal Error in your handler and act accordingly.

To set a separate retry policy for LLM calls, pass RunOptions to DurableRunner.run:

error_handling.py

@agent_service.handler()
async def run(_ctx: restate.Context, req: WeatherPrompt) -> str:
    try:
        run_opts = RunOptions(
            max_attempts=3, initial_retry_interval=timedelta(seconds=2)
        )
        result = await DurableRunner.run(agent, req.message, run_options=run_opts)
    except restate.TerminalError as e:
        # Handle terminal errors gracefully
        return f"The agent couldn't complete the request: {e.message}"

    return result.final_output

Once these retries are exhausted, the invocation fails with a TerminalError and won’t be retried further. You can catch the Terminal Error in your handler and act accordingly.

To set a separate retry policy for LLM calls, pass RunOptions to the Restate plugin when activating it for your ADK App:

error_handling.py

run_options = RunOptions(max_attempts=3, initial_retry_interval=timedelta(seconds=1))
app = App(
    name=APP_NAME,
    root_agent=agent,
    plugins=[RestatePlugin(run_options=run_options)],
)

Once these retries are exhausted, the invocation fails with a TerminalError and won’t be retried further. You can catch the Terminal Error in your handler and act accordingly.

To set a separate retry policy for LLM calls, pass RunOptions to RestateAgent:

error_handling.py

restate_agent = RestateAgent(
    agent,
    run_options=RunOptions(max_attempts=3, initial_retry_interval=timedelta(seconds=2)),
)

Once these retries are exhausted, the invocation fails with a TerminalError and won’t be retried further. You can catch the Terminal Error in your handler and act accordingly.

Restate’s RestateMiddleware lets you specify the retry behavior for LLM calls via RunOptions:

error_handling.py

agent = create_agent(
    model=init_chat_model("openai:gpt-4o-mini"),
    tools=[get_weather],
    middleware=[RestateMiddleware(run_options=RunOptions(max_attempts=3))],
)

By default, the middleware retries indefinitely with exponential backoff. Once Restate’s retries are exhausted, the invocation fails with a TerminalError and won’t be retried further.

To set a separate retry policy for LLM calls, pass RunOptions to ctx.run():

// Retries up to 3 times with exponential backoff
const result = await ctx.run(
  "LLM call",
  async () => llmCall(messages, tools),
  { maxRetryAttempts: 3 }
);

Once these retries are exhausted, the invocation fails with a TerminalError and won’t be retried further. You can catch the Terminal Error in your handler and act accordingly.

To set a separate retry policy for LLM calls, pass RunOptions to ctx.run_typed():

# Retries up to 3 times with exponential backoff
result = await ctx.run_typed(
    "LLM call",
    llm_call,
    RunOptions(max_attempts=3),
    messages=messages,
    tools=tools,
)

Once these retries are exhausted, the invocation fails with a TerminalError and won’t be retried further. You can catch the Terminal Error in your handler and act accordingly.

Tool execution errors

Restate makes tool execution resilient by retrying transient errors and propagating terminal ones.

Transient errors

By default, the Vercel AI SDK converts any errors in tool executions into a message to the LLM, and the agent decides how to proceed. This is often desirable, as the LLM can decide to use a different tool or provide a fallback answer.When you wrap external calls in Restate Context actions like ctx.run, Restate retries transient errors within the Context action before the result reaches the agent. This makes your tools resilient to network failures, database hiccups, and other temporary issues. For all operations that might suffer from transient errors, use Context actions:

// Without ctx.run - error goes straight to agent
async function myTool() {
  const result = await fetch("/api/data"); // Might fail due to network
  // If this fails, agent gets the error immediately
}

// With ctx.run - Restate handles retries
async function myToolWithRestate(ctx: restate.Context) {
  const result = await ctx.run("fetch-data", () => fetch("/api/data"));
  // Network failures get retried automatically
  // Only terminal errors reach the AI
}

Restate then retries the whole invocation according to the policy configured at the service or handler level, or otherwise the Restate server’s default policy.

Setting a retry policy on run actions

If you do run actions in your tools, you can override the default retry policy by passing RunOptions:

const result = await ctx.run(
    "fetch-data",
    () => fetch("/api/data"),
    { maxRetryAttempts: 3 }
);