
Fix persistent compilation cache not working on TPU #9759

Open
shr1ram wants to merge 1 commit into pytorch:master from shr1ram:fix/persistent-cache-tpu-deserialization

@shr1ram shr1ram commented Mar 23, 2026

Summary

  • Fixes the persistent compilation cache failing to load cached executables on TPU, which forced a recompilation on every run
  • The root cause is that DeserializeComputation calls client_->DeserializeExecutable(), which returns UNIMPLEMENTED on TPU because the TPU backend (PjRtCApiClient) implements LoadSerializedExecutable instead
  • The fix tries LoadSerializedExecutable first (the PJRT C API plugin path used by TPU/Neuron), then falls back to the old DeserializeExecutable + Load two-step path for other backends
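
The try-first-then-fall-back order described above can be sketched in isolation. This is a minimal illustration, not the code in this PR: `MockClient`, `Result`, `Executable`, `DeserializeThenLoad`, and `LoadCachedExecutable` are hypothetical stand-ins with simplified signatures, and falling back on any failure of the first attempt (rather than only on UNIMPLEMENTED) is an assumption.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT the real
// PJRT or torch_xla types, and the signatures are simplified.
enum class Code { kOk, kUnimplemented };

struct Executable {
  std::string loaded_via;  // records which path produced the executable
};

struct Result {
  Code code;
  std::shared_ptr<Executable> value;
  bool ok() const { return code == Code::kOk; }
};

// Base client: both load paths report "unimplemented" unless overridden.
class MockClient {
 public:
  virtual ~MockClient() = default;
  virtual Result LoadSerializedExecutable(const std::string& /*blob*/) {
    return {Code::kUnimplemented, nullptr};
  }
  // Stands in for the two-step DeserializeExecutable + Load sequence.
  virtual Result DeserializeThenLoad(const std::string& /*blob*/) {
    return {Code::kUnimplemented, nullptr};
  }
};

// Models a plugin-style backend that only offers the one-step path.
class PluginStyleClient : public MockClient {
 public:
  Result LoadSerializedExecutable(const std::string& /*blob*/) override {
    return {Code::kOk,
            std::make_shared<Executable>(Executable{"LoadSerializedExecutable"})};
  }
};

// Models a backend that only offers the two-step path.
class TwoStepStyleClient : public MockClient {
 public:
  Result DeserializeThenLoad(const std::string& /*blob*/) override {
    return {Code::kOk,
            std::make_shared<Executable>(Executable{"DeserializeExecutable+Load"})};
  }
};

// Try the one-step path first; if it fails, use the two-step path.
// Assumption: any failure of the first attempt triggers the fallback.
Result LoadCachedExecutable(MockClient& client, const std::string& blob) {
  Result one_step = client.LoadSerializedExecutable(blob);
  if (one_step.ok()) {
    assert(one_step.value != nullptr);  // success must carry an executable
    return one_step;
  }
  return client.DeserializeThenLoad(blob);
}
```

With these mocks, a `PluginStyleClient` is served by the first attempt, a `TwoStepStyleClient` is served by the fallback, and a client that implements neither path surfaces the fallback's error, so the caller would recompile as before.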

Fixes #9094

Test plan

  • Existing test_persistent_cache.py tests should continue to pass on TPU
  • Run the reproduction script from #9094 ("tpu torch xla is not using xla_cache") twice; the second run should show cache hits (via the PersistentCacheHit metric) and significantly faster startup
  • Verify no regression on non-TPU backends (the fallback path preserves the original behavior)

🤖 Generated with Claude Code

The TPU backend (via PjRtCApiClient) implements LoadSerializedExecutable
but does not override DeserializeExecutable, which returns UNIMPLEMENTED
from the base PjRtClient class. This caused every cache load attempt to
fail, forcing recompilation on every run even when valid cached
executables existed on disk.

Fix by trying LoadSerializedExecutable first (the path implemented by
PJRT C API plugins like TPU), then falling back to the two-step
DeserializeExecutable + Load path for backends that implement that
instead.

Fixes pytorch#9094

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
