oracle: fix consensus deadlock from stale attestation timestamps by JohnnyLawDGB · Pull Request #383 · DigiByte-Core/digibyte

JohnnyLawDGB · 2026-02-26T14:47:49Z

Summary

Fixes the oracle attestation deadlock where consensus gets permanently stuck at 0 despite all oracles reporting valid prices. Observed on testnet19 — 8/9 oracles reporting ~$0.00445 but consensus_price_micro_usd: 0 and last_bundle_height: 0 for 19+ hours.

Root Cause

When a consensus round stalls, all oracles create attestations with the same frozen (price, timestamp) tuple. The Phase2 hash — H(oracle_id, price, timestamp) — is identical each cycle, so the duplicate filter in AddOracleMessage() permanently rejects it. Since BroadcastMessage() calls AddOracleMessage() before pushing to P2P, the message never reaches the network. All oracle nodes enter this state simultaneously with no automatic recovery.

The deadlock chain:

1. pending_messages contains entries with stale timestamp T
2. ComputeConsensusValues() returns consensus_timestamp = T (median of stale entries)
3. Oracle creates attestation with (price, T) → Phase2 hash H
4. AddOracleMessage() finds H in seen_message_hashes → rejected as duplicate
5. BroadcastMessage() returns false → message never reaches P2P
6. Other oracles in the same state → no fresh data arrives → goto 1
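
The chain above can be reproduced in a few lines. This is a toy model, not the code in src/oracle: the OracleMessage struct, the ToyBundleManager class, the string-based Phase2Key, and the sample values are all assumptions made for illustration. It only shows that a deterministic hash over a frozen (oracle_id, price, timestamp) tuple, combined with a seen-hash filter that gates the P2P push, lets exactly one message through.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for an oracle price message; field names are assumptions.
struct OracleMessage {
    uint32_t oracle_id;
    uint64_t price_micro_usd;
    int64_t timestamp;
};

// Toy replacement for the Phase2 hash: any deterministic function of
// (oracle_id, price, timestamp) reproduces the problem.
inline std::string Phase2Key(const OracleMessage& m) {
    return std::to_string(m.oracle_id) + ":" +
           std::to_string(m.price_micro_usd) + ":" +
           std::to_string(m.timestamp);
}

class ToyBundleManager {
public:
    // Returns false when the message hash was already seen (duplicate filter).
    bool AddOracleMessage(const OracleMessage& m) {
        return seen_message_hashes.insert(Phase2Key(m)).second;
    }

    // Mirrors the ordering described above: local acceptance gates the P2P push.
    bool BroadcastMessage(const OracleMessage& m) {
        if (!AddOracleMessage(m)) return false;
        ++relayed_count;  // stands in for the actual P2P relay
        return true;
    }

    int relayed_count = 0;

private:
    std::unordered_set<std::string> seen_message_hashes;
};

// Counts how many of `cycles` broadcast attempts are relayed when the
// (price, timestamp) tuple never changes.
inline int RelayedWithFrozenTimestamp(int cycles) {
    assert(cycles >= 0);
    ToyBundleManager mgr;
    const OracleMessage stale{7, 4450, 1700000000};  // arbitrary sample values
    for (int i = 0; i < cycles; ++i) mgr.BroadcastMessage(stale);
    return mgr.relayed_count;
}
```

In this model, RelayedWithFrozenTimestamp(10) relays a single message: the first attempt passes the filter and the remaining nine are dropped before the relay step. Changing only the timestamp field is enough to pass the filter again, which is the property the fix relies on.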

Three-Part Fix

1. Stale consensus detection (src/oracle/node.cpp): When consensus timestamp is >5 minutes old, BroadcastCurrentPrice() clears stale state and falls through to individual price broadcast with GetTime(). Fresh timestamp → different Phase2 hash → passes duplicate filter → propagates to network → new consensus forms.
2. Periodic seen-hash cleanup (src/oracle/bundle_manager.cpp): Clear seen_message_hashes every 300 seconds in AddOracleMessage(). Safety net for automatic recovery. The pending_messages map (keyed by oracle_id, accepting only newer timestamps) provides authoritative dedup.
3. State reset on stoporacle (src/rpc/digidollar.cpp): stoporacle now calls ClearPendingMessages() to reset the duplicate filter. Gives operators a manual recovery path.
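
For readers who want the shape of these changes without opening the diff, here is a minimal sketch. It is not the merged code: ChooseBroadcastTimestamp, SeenHashFilter, and the constant names are hypothetical, and the current time is passed in as a parameter so the example stays deterministic. Only the 5-minute and 300-second thresholds come from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>

// Thresholds taken from the description above (5 minutes, 300 seconds).
constexpr int64_t kStaleConsensusSeconds = 300;
constexpr int64_t kSeenCleanupIntervalSeconds = 300;

// Part 1 (hypothetical helper): pick the timestamp to broadcast with.
// A consensus timestamp older than the threshold is replaced by the current
// time, so the resulting (oracle_id, price, timestamp) hash changes.
inline int64_t ChooseBroadcastTimestamp(int64_t consensus_ts, int64_t now) {
    assert(now >= consensus_ts);  // sketch assumes consensus time is not in the future
    const bool stale = (now - consensus_ts) > kStaleConsensusSeconds;
    return stale ? now : consensus_ts;
}

// Parts 2 and 3 (hypothetical class): a seen-hash set that forgets its
// contents on a fixed interval and can also be cleared on demand.
class SeenHashFilter {
public:
    explicit SeenHashFilter(int64_t start_time) : last_cleanup_(start_time) {}

    // Returns true if `hash` has not been seen since the last cleanup.
    bool CheckAndInsert(const std::string& hash, int64_t now) {
        if (now - last_cleanup_ >= kSeenCleanupIntervalSeconds) {
            seen_.clear();
            last_cleanup_ = now;
        }
        return seen_.insert(hash).second;
    }

    // Manual reset, analogous to the operator-triggered path.
    void Clear() { seen_.clear(); }

private:
    std::unordered_set<std::string> seen_;
    int64_t last_cleanup_;
};
```

With a consensus timestamp of 1000 and a current time of 1400, ChooseBroadcastTimestamp returns 1400, so the broadcast no longer collides with the hash recorded for timestamp 1000. The filter sketch covers only the seen-hash set; how a re-admitted message interacts with pending_messages is outside its scope.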

Relationship to RC22 `pending_messages.clear()` Fix

The RC22 fix (f7a4b1d) removed pending_messages.clear() from AddOracleBundleToBlock() to prevent messages being drained on every CreateNewBlock() call. This fix addresses a different but related deadlock: messages survive template creation but get permanently trapped in the duplicate filter when the consensus timestamp freezes.

Test plan

- Verify oracle recovers from stale consensus within 5 minutes (automatic)
- Verify stoporacle/startoracle cycle clears state and allows fresh consensus
- Verify normal oracle operation (fresh consensus) is unaffected by changes
- Verify periodic seen-hash cleanup doesn't cause duplicate message processing storms
- Run existing oracle test suite (oracle_bundle_manager_tests, oracle_phase2_tests)

🤖 Generated with Claude Code

When oracle consensus stalls (e.g., from rapid block production or network partition), all oracles create attestations with the same frozen (price, timestamp) tuple. The Phase2 hash, computed from (oracle_id, price, timestamp), is identical each cycle, so the duplicate filter in AddOracleMessage() permanently rejects it. Since BroadcastMessage() calls AddOracleMessage() before pushing to P2P, the message never reaches the network. All oracle nodes enter this state simultaneously, creating a network-wide deadlock with no automatic recovery.

Three-part fix:

1. Stale consensus detection (node.cpp): When the consensus timestamp is >5 minutes old, BroadcastCurrentPrice() clears stale pending state and falls through to individual price broadcast with GetTime() as timestamp. The fresh timestamp produces a different Phase2 hash, breaking the duplicate filter collision and allowing the message to propagate.
2. Periodic seen-hash cleanup (bundle_manager.cpp): Clear seen_message_hashes every 300 seconds in AddOracleMessage(). This provides automatic recovery even without the stale consensus detection. The pending_messages map (keyed by oracle_id, accepting only newer timestamps) provides authoritative dedup; the seen hash set is a best-effort P2P optimization only.
3. State reset on stoporacle (digidollar.cpp): stoporacle now calls ClearPendingMessages() to reset seen_message_hashes, pending_messages, and pending_attestations. Gives operators a manual recovery path via stoporacle/startoracle cycle.

Fixes the oracle attestation stuck-at-4/5 bug observed on testnet19 where 8 oracles reported valid prices but consensus_price remained 0 for 19+ hours until an operator manually restarted their node.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

DigiSwarm · 2026-02-27T19:26:08Z

Hey Johnny, thank you so much for digging into this and putting in the time to track down the root cause. The deadlock analysis is excellent — the frozen timestamp → identical Phase2 hash → permanent duplicate filter rejection chain is exactly what we were seeing on testnet19 and you nailed it.

The three-part oracle fix (stale consensus detection, periodic seen-hash cleanup, and state reset on stoporacle) is well thought out and I want to get this merged.

I ran the full test suite locally — 1,969 C++ unit tests pass and all 311 functional tests pass. However, both CI checks (macOS and Ubuntu) are failing on the PR, and I noticed the branch was forked from before my latest RC22 RPC display fix commit (272d44092b), which means a few of those RC22 fixes got inadvertently reverted in digidollar.cpp and the digidollar_rpc_display_bugs.py test file was removed.

No worries at all — I'm going to cherry-pick your oracle fix commit directly onto our current branch tip so everything stays clean. Your authorship will be preserved in the commit. This way we keep both the oracle deadlock fix and the RC22 RPC corrections in one clean history.

Thank you again for the incredible contribution. This kind of deep debugging work is exactly what we need as we push toward mainnet. The whole testnet19 testing effort from everyone has been invaluable. 🚀💎

ycagel requested review from JaredTate, SmartArray, digicontributer and gto90 February 26, 2026 17:45

DigiSwarm merged commit 6963ed5 into DigiByte-Core:feature/digidollar-v1 Feb 27, 2026
0 of 2 checks passed

DigiSwarm merged 1 commit into DigiByte-Core:feature/digidollar-v1 from JohnnyLawDGB:fix/oracle-consensus-deadlock
