ToolExecutionMode.PARALLEL (the default mode via NONE) does not actually execute tools in parallel. The ADK Javadoc states that PARALLEL runs multiple tools concurrently, but the implementation in Functions.handleFunctionCalls() uses concatMapEager without subscribeOn(Schedulers.io()). For any tool that performs blocking I/O (e.g. Spring RestClient, JDBC, any synchronous HTTP client), this results in strictly sequential execution: the calling thread is held for the full duration of each tool call before the next subscription can begin.
===================================================================================
Steps to Reproduce:
1. Create an LlmAgent with 2+ tools that perform blocking HTTP calls (e.g. Spring RestClient.retrieve().body(...)).
2. Use the default RunConfig (ToolExecutionMode.NONE, which Javadoc states defaults to PARALLEL).
3. Send a user message that causes the LLM to return multiple tool calls in a single response (e.g. "compare laptops and phones" → LLM emits 2 simultaneous searchForProducts calls).
4. Measure total tool execution wall-clock time.
Expected: Wall time ≈ single tool RTT (~900 ms)
Observed: Wall time ≈ N × single tool RTT (~900 ms × N, sequential)
===================================================================================
Expected Behavior:
When the LLM returns multiple tool calls in a single response and ToolExecutionMode is PARALLEL (or NONE, which defaults to PARALLEL per Javadoc), all tool calls should be dispatched concurrently to separate threads. Total latency should be bounded by the slowest individual tool call, not the sum of all tool call latencies.
The RunConfig Javadoc states:
Observed Behavior:
Tools execute sequentially regardless of ToolExecutionMode. Root cause is in Functions.handleFunctionCalls():
// google-adk 0.7.0, Functions.java
Observable functionResponseEventsObservable;
if (invocationContext.runConfig().toolExecutionMode() == ToolExecutionMode.SEQUENTIAL) {
  functionResponseEventsObservable =
      Observable.fromIterable(functionCalls).concatMapMaybe(functionCallMapper);
} else {
  // PARALLEL and NONE both reach this branch, but concatMapEager without
  // subscribeOn(Schedulers.io()) is still sequential with blocking tools:
  functionResponseEventsObservable =
      Observable.fromIterable(functionCalls)
          .concatMapEager(call -> functionCallMapper.apply(call).toObservable());
  // ↑ No subscribeOn() → all subscriptions happen on the same calling thread
  //   → blocking I/O inside each tool stalls the next subscription
}
concatMapEager subscribes to all inner observables eagerly, but without subscribeOn(Schedulers.io()), every subscription executes synchronously on the ADK calling thread. A blocking RestClient.body() call inside a tool holds that thread for ~900 ms, preventing the next tool from being subscribed until the current one completes.
No error or stack trace is produced — the code runs correctly but sequentially, making this a silent correctness bug.
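The effect can be illustrated without ADK or RxJava. The following is a minimal stdlib-only sketch, not the ADK code path: BlockingDispatchDemo, blockingTool, runOnCallerThread, and runOnWorkerThreads are hypothetical names, an ExecutorService stands in for subscribeOn(Schedulers.io()), and each simulated tool is assumed to block its thread for a fixed 200 ms.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingDispatchDemo {
    // Assumed latency of one simulated blocking tool call.
    static final long TOOL_MILLIS = 200;

    // Hypothetical stand-in for a tool that blocks its thread (e.g. a synchronous HTTP call).
    static Callable<String> blockingTool(String name) {
        return () -> {
            Thread.sleep(TOOL_MILLIS);
            return name;
        };
    }

    // Analogue of starting every tool on the calling thread:
    // each call must return before the next one can start.
    static long runOnCallerThread(List<Callable<String>> tools) throws Exception {
        long start = System.nanoTime();
        for (Callable<String> tool : tools) {
            tool.call();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Analogue of dispatching each tool to its own worker thread.
    // invokeAll returns futures in the same order as the input list.
    static long runOnWorkerThreads(List<Callable<String>> tools) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            long start = System.nanoTime();
            List<Future<String>> futures = pool.invokeAll(tools);
            for (Future<String> future : futures) {
                future.get();
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> tools =
            List.of(blockingTool("search1"), blockingTool("search2"), blockingTool("search3"));
        System.out.println("caller thread:  " + runOnCallerThread(tools) + " ms");
        System.out.println("worker threads: " + runOnWorkerThreads(tools) + " ms");
    }
}
```

With three tools, the caller-thread run should take roughly three times the per-tool latency and the worker-thread run roughly one; exact timings vary by machine.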
===================================================================================
Environment Details:
ADK Library Version: google-adk:0.7.0
OS: Linux (Kubernetes pod — sparky-brain-prod on WCNP)
TS Version: N/A (Java)
Model Information:
Claude Sonnet 4.6
===================================================================================
Optional Information:
We worked around this by wrapping every BaseTool in a decorator that adds subscribeOn(Schedulers.io()) at registration time:
// Workaround: SparkyIoSchedulerTool.java
@Override
public Single<Map<String, Object>> runAsync(Map<String, Object> args, ToolContext toolContext) {
  return Single.defer(() -> delegate.runAsync(args, toolContext))
      .subscribeOn(Schedulers.io()); // ← dispatches blocking work off the ADK thread
}
This workaround confirms the root cause: adding IO thread dispatch is sufficient to restore true parallel execution.
// Blocking tool: simulates RestClient or any synchronous HTTP call
public class SlowTool extends BaseTool {
  // Takes the tool name so that several distinct instances can be registered
  public SlowTool(String name) { super(name, "Simulates a blocking HTTP call"); }

  @Override
  public Single<Map<String, Object>> runAsync(Map<String, Object> args, ToolContext ctx) {
    return Single.fromCallable(() -> {
      Thread.sleep(1000); // simulates blocking RestClient ~1000ms
      return Map.of("result", "done");
    });
  }
}
// Agent with 3 instances of SlowTool
LlmAgent agent = LlmAgent.builder()
    .name("test-agent")
    .model(model)
    .tools(List.of(new SlowTool("search1"), new SlowTool("search2"), new SlowTool("search3")))
    .build();
// RunConfig with default mode (NONE = PARALLEL per Javadoc)
RunConfig runConfig = RunConfig.builder().build(); // ToolExecutionMode.NONE
// Send a message that causes the LLM to call all 3 tools in one response
// Expected total time: ~1,000ms (parallel)
// Actual total time:   ~3,000ms (sequential, the bug)
Suggested Fix:
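// Functions.java: replace concatMapEager with flatMap + subscribeOn
// Note: flatMap emits responses in completion order. If responses must stay in
// call order, keeping concatMapEager and adding only subscribeOn(Schedulers.io())
// to the inner observable would also run the tools concurrently.
} else {
  functionResponseEventsObservable =
      Observable.fromIterable(functionCalls)
          .flatMap(call ->
              functionCallMapper.apply(call)
                  .toObservable()
                  .subscribeOn(Schedulers.io())); // ← true parallel dispatch
}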