Skip to content

feat: Bitswap improvements for Kubo compatibility#1321

Merged
acul71 merged 46 commits into
libp2p:mainfrom
sumanjeet0012:improvement/bitswap
Jun 18, 2026
Merged

feat: Bitswap improvements for Kubo compatibility#1321
acul71 merged 46 commits into
libp2p:mainfrom
sumanjeet0012:improvement/bitswap

Conversation

@sumanjeet0012

@sumanjeet0012 sumanjeet0012 commented May 3, 2026

Copy link
Copy Markdown
Contributor

What was wrong?

Fixes #1347

Issue: py-libp2p's Bitswap file operations were not compatible with Kubo (Go-IPFS)
and the broader IPFS network. Files added by py-libp2p could not be fetched by Kubo
nodes, and vice versa. Additionally, DHT records lacked cryptographic verification
required by Kubo peers.

Several root causes were identified:

1. Flat DAG structure (no balanced layout)

create_file_node() put all chunks as direct links on the root node — a flat 1-level tree. Kubo uses a balanced tree with a maximum of 174 links per node.

2. Wrong DAG-PB wire encoding (field ordering)

encode_dag_pb() used PBNode.SerializeToString() which emits Data before Links. The DAG-PB spec requires Links before Data for canonical encoding.

3. Wrong default chunk size

The previous default (~63 KiB) did not match Kubo's default size-262144 (256 KiB), producing different root CIDs from ipfs add.

4. No persistent block storage

MemoryBlockStore was the only BlockStore implementation, meaning all fetched blocks were lost when the process exited.

5. No transparent caching layer

MerkleDag called bitswap.get_block() / bitswap.add_block() directly with no service layer, meaning fetched blocks were not auto-cached locally.

6. No stream input support

add_file() only accepted a file path string.

7. Magic integers in Bitswap message construction

create_wantlist_entry(cid, want_type=1) used raw integers with no type safety.

8. DHT record signing/verification not compatible with Kubo

DHT value records lacked proper signing and verification, causing incompatibility with Kubo's DHT implementation.

9. Sequential block fetching caused Kubo disconnects

Fetching many blocks one-by-one opened hundreds of streams, causing Kubo to issue GO_AWAY disconnects.


How was it fixed?

Bitswap: Kubo CID compatibility

Raw leaf blocks (CODEC_RAW) in dag.py:
Multi-chunk files store raw leaf blocks, matching Kubo's RawLeaves=true default for ipfs add.

Default chunk size 256 * 1024 in chunker.py:
Matches Kubo's default size-262144.

New balanced_layout(leaves) in dag_pb.py:
Groups leaves into batches of 174, builds a tree level by level.

Fixed encode_dag_pb() in dag_pb.py:
Manually constructs wire bytes with Links before Data (canonical DAG-PB ordering).

create_leaf_node() in dag_pb.py:
Utility for dag-pb-wrapped leaves (RawLeaves=false mode); used by tests and layout helpers, not the default file-ingestion path.

Bitswap: batch fetching

Enhanced get_blocks_batch() in client.py:
Sends all CIDs in a single wantlist message per batch, waits for all responses on the same stream.

New: FilesystemBlockStore

New FilesystemBlockStore in block_store.py:
Stores each block as a file at <base>/<cid[:2]>/<cid[2:]> with atomic writes. Drop-in replacement for MemoryBlockStore.

New: BlockService

New block_service.py:
Transparent local→network fallback layer between MerkleDag and BitswapClient. Auto-caches network-fetched blocks locally.

New: add_stream() + chunk_stream()

New chunk_stream(stream: io.IOBase) in chunker.py and add_stream() in MerkleDag:
Reads one chunk at a time from any io.IOBase stream.

New: Generic IPLD node API

New add_node() / get_node() / remove_node() in dag.py:
Store and retrieve structured DAG-JSON, DAG-CBOR, and RAW nodes.

New: Wantlist / Message dataclasses

New wantlist.py with typed dataclasses (WantType, BlockPresenceType, WantlistEntry, Wantlist, BlockPresence, BitswapMessage).

DHT: record signing and verification & ProviderQueryManager

Enhanced DHT record handling with proper signing and verification for compatibility with Kubo's DHT implementation. Added ProviderQueryManager for robust DHT-based provider discovery and caching in Bitswap.

Bitswap 1.3.0 Payment Extension (experimental)

Partial implementation:
BlockPricingEngine, PaymentLedger, and PaymentGatedDecisionEngine are implemented and tested. BitswapPaymentClient and PaymentExtension provide the wire protocol scaffolding; full client/extension test coverage is deferred to a follow-up PR.


Breaking changes

  • MerkleDag.fetch_file() now returns tuple[bytes, str | None] (data + optional filename) instead of bare bytes.
  • Default chunk size changed from ~63 KiB to 256 KiB; root CIDs differ from pre-PR behavior for callers using implicit defaults.
  • MerkleDag.add_file(..., wrap_with_directory=False) by default (was effectively file-root CID before).

Files changed

File Change
libp2p/bitswap/dag_pb.py create_leaf_node(), balanced_layout(), fixed encode_dag_pb() canonical ordering
libp2p/bitswap/dag.py Updated add_file(), add_bytes(), new add_stream(), add_node()/get_node(), BlockService routing
libp2p/bitswap/client.py Enhanced get_blocks_batch(), modularized with IBitswapExtension
libp2p/bitswap/chunker.py New chunk_stream(stream: io.IOBase), DEFAULT_CHUNK_SIZE = 256 * 1024
libp2p/bitswap/block_store.py New FilesystemBlockStore with atomic writes
libp2p/bitswap/block_service.py New fileBlockService
libp2p/bitswap/wantlist.py New file — typed dataclasses
libp2p/bitswap/payment_extension.py New file — Bitswap 1.3.0 Payment implementation (experimental)
libp2p/bitswap/messages.py create_wantlist_entry() accepts WantType | int
libp2p/bitswap/__init__.py Exports all new types
libp2p/kad_dht/ DHT signing/verification for Kubo compatibility, ProviderQueryManager
libp2p/records/ Record signing utilities
libp2p/peer/envelope.py, peer_record.py Updated for record signing
examples/bitswap/bitswap.py Updated example

Test files added

File What it tests
test_unixfs_encoding.py Raw-leaf + balanced DAG layout, create_leaf_node()
test_filesystem_blockstore.py FilesystemBlockStore persistence + round-trip
test_block_service.py BlockService local hit / miss / auto-cache / announce
test_io_stream.py chunk_stream() + add_stream() with BytesIO, GzipFile, file handles
test_wantlist.py WantType, Wantlist, BitswapMessage, to_proto() / from_proto()
test_dag_pb.py test_encode_canonical_field_ordering (Links before Data on wire)
test_provider_query.py ProviderQueryManager DHT provider discovery
test_payment.py BlockPricingEngine, PaymentLedger, PaymentGatedDecisionEngine
test_dag.py Generic IPLD node API (add_node/get_node/remove_node)

To-Do

  • Clean up commit history
  • Add or update documentation related to these changes
  • Add entry to the release notes

sumanjeet0012 and others added 12 commits April 14, 2026 15:55
…compatibility with ipfs kubo

- Added support for signed records in the DHT by introducing `make_signed_put_record` function.
- Updated `ValueStore` to create signed records when storing values.
- Enhanced `Envelope` class to handle raw payload types for peer records.
- Introduced utility functions for signing and verifying DHT records.
- Updated protobuf definitions to include author and signature fields in records.
- Improved logging and debug messages for better traceability.
…ce DAG-PB encoding

Co-authored-by: Copilot <copilot@github.com>
… layout

Co-authored-by: Copilot <copilot@github.com>
… in MerkleDag

Co-authored-by: Copilot <copilot@github.com>
…s and implement add_stream method in MerkleDag for handling io.IOBase streams

Co-authored-by: Copilot <copilot@github.com>
…improve chunk_stream documentation, and add Wantlist functionality

Co-authored-by: Copilot <copilot@github.com>
- Introduced `test_block_service.py` to validate BlockService behavior including local hits, network fetches, auto-caching, and block storage.
- Created `test_filesystem_blockstore.py` to manually test FilesystemBlockStore for basic operations, persistence, and directory structure.
- Added `test_io_stream.py` to verify io.IOBase input support with chunk_stream and MerkleDag.add_stream functionalities.
- Implemented `test_unixfs_encoding.py` to ensure add_file and add_bytes produce dag-pb leaf blocks and validate balanced layout tree structures.
- Developed `test_wantlist.py` to test Wantlist and Message dataclasses, including backward compatibility and public API exports.
- Updated type hints in `make_service` function to allow for None.
- Specified type hints for lists of bytes in block retrieval tests.
- Added assertions to check for non-null `unixfs` in various tests to ensure proper decoding of DAG PB blocks.
- Enhanced type hints for observer and subscriber peers in Gossipsub tests.
- Improved type hints for candidate lists in opportunistic grafting tests.
- Added type ignore comments for factory Meta classes to suppress type checker warnings.
- Updated import statements for ID to include type ignore comments in interop utilities.
@sumanjeet0012 sumanjeet0012 force-pushed the improvement/bitswap branch from 475086a to 6acceb2 Compare May 3, 2026 11:04
sumanjeet0012 and others added 10 commits May 3, 2026 22:41
…handling, and update tests for dag-pb leaf blocks

Co-authored-by: Copilot <copilot@github.com>
…ndling in DAG fetching

Co-authored-by: Copilot <copilot@github.com>
…leDag

Co-authored-by: Copilot <copilot@github.com>
…aching in Bitswap

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>

@yashksaini-coder yashksaini-coder left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumanjeet0012 Strong PR overall — the encoding root-causes are correct and well-tested (verified locally: 238/238 bitswap tests pass, wire format confirmed Links-before-Data). The PR description is exemplary.

Leaving a few comments inline. Three I'd flag as blocking before merge:

  1. Breaking API change in add_file default (wrap_with_directory=True) — old callers get a directory CID instead of a file CID for the same call. Either revert the default or call this out loudly in the newsfragment.
  2. verify_record only handles Ed25519 — silently fails for RSA/Secp256k1 peers, breaking DHT interop with non-Ed25519 nodes.
  3. DEFAULT_CHUNK_SIZE = 63 KiB - 32 doesn't match Kubo's 256 KiB default — files added by py-libp2p won't have the same root CID as ipfs add file.bin unless Kubo is told --chunker=size-65504. The PR header claims "Kubo CID compatibility"; this caveat needs to be documented.

Medium-priority items (perf and hygiene) inline. Everything else is comfortable post-merge cleanup.

Nice work on the manual DAG-PB outer envelope — limiting hand-rolling to 0x12 <varlen> <linkbytes> then 0x0a <varlen> <unixfs> while still using protobuf for the inner messages is the minimal, correct approach.

Comment thread libp2p/bitswap/dag.py Outdated
@@ -187,7 +272,7 @@ async def add_file(

dir_data = create_directory_node([(filename, cid, file_size)])
dir_cid = compute_cid_v1(dir_data, codec=CODEC_DAG_PB)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking API change (re: the wrap_with_directory=True default a few lines up at the parameter decl): defaulting this to True silently changes the behavior of add_file(path) for every existing caller — they'll now get a directory CID where they previously got a file CID, and fetch_file returns a (bytes, filename) tuple where it returned bytes before. This is invisible in the new tests because they all pass wrap_with_directory=False.

Suggest defaulting to False for back-compat, or version-bump and add a clear migration note to newsfragments/1321.feature.rst (currently doesn't flag this as breaking).

Comment thread libp2p/records/utils.py Outdated

"""
try:
public_key = Ed25519PublicKey.from_bytes(author_public_key)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interop bug for non-Ed25519 peers. This hard-codes Ed25519 deserialization, so DHT records signed by RSA or Secp256k1 peers will silently fail verification (caught by the try/except and returned as False, indistinguishable from a genuinely invalid signature).

Compare with libp2p/peer/envelope.py:198-216 (pub_key_from_protobuf) which dispatches on KeyType for all three types. Either dispatch the same way here, or accept a PublicKey object and let the caller deserialize.

Kubo defaults to Ed25519 today, so this won't surface in basic interop tests, but it's a real correctness gap.

Comment thread libp2p/bitswap/chunker.py Outdated
DEFAULT_CHUNK_SIZE = 63 * 1024
# 63 KB minus 32 bytes to leave room for the dag-pb leaf envelope overhead,
# ensuring wrapped blocks never exceed MAX_BLOCK_SIZE (63 * 1024).
DEFAULT_CHUNK_SIZE = 63 * 1024 - 32

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mismatch with Kubo's default chunk size. Kubo's ipfs add uses size-262144 (256 KiB) by default. With this 63 KiB default, py-libp2p will produce a different root CID than ipfs add file.bin for the same content — contradicting the PR's "Kubo CID compatibility" headline.

Leaf CIDs match Kubo (because of RawLeaves=false + dag-pb wrapping), but the root over a multi-chunk file won't. Either:

  • Document this clearly in newsfragments/1321.feature.rst ("CIDs match ipfs add --chunker=size-65504"), or
  • Match Kubo's 256 KiB default and split large messages at the wire layer instead of capping the chunk size.

Comment thread libp2p/bitswap/dag.py Outdated
chunk = leaf_raw
logger.debug(f"[DAG] Leaf {idx + 1}: raw block {len(chunk)} bytes")

file_data += chunk

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

O(n²) bytes concat. file_data += chunk over potentially thousands of leaves means each += allocates a new bytes object and copies all previous data. For a 100 MB file at 63 KB chunks (~1626 leaves), this allocates roughly 80 GB of intermediate strings.

The fix is one line — accumulate into a bytearray (or list + b"".join(parts) at the end). This is the single biggest perf win available in the PR.

Same pattern in _read_message (client.py:1017-1047) and encode_dag_pb (dag_pb.py:125-149).

Comment thread libp2p/bitswap/block_store.py Outdated
await trio.to_thread.run_sync(
lambda: path.parent.mkdir(parents=True, exist_ok=True)
)
await trio.to_thread.run_sync(path.write_bytes, data)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-atomic writes. path.write_bytes(data) writes in place. If the process crashes mid-write, the next startup finds a truncated file at a CID path. get_block will then return corrupted bytes that fail verification only if the caller checks.

Standard fix: write to path.with_suffix('.tmp'), then os.replace(tmp, path) — atomic on POSIX, durable on most filesystems.

Comment thread libp2p/bitswap/dag.py Outdated
# Ensure the result is a plain dict (not a coroutine from a mock)
if isinstance(result, dict):
return result
except Exception:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test infrastructure leaking into production code. The getattr probe + isinstance(result, dict) check + bare except Exception: pass is a workaround for MagicMock returning a coroutine object. This silently masks real failures in production.

Suggest defining a Protocol for the batch interface, typing self.bitswap with it, and removing the runtime probing entirely. Tests can then mock the protocol explicitly.

Comment thread libp2p/bitswap/client.py Outdated

# Send all CIDs in a single wantlist to the peer
if peer_id:
await self._send_wantlist_to_peer(peer_id, batch)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Swallowed exception in _send_wantlist_to_peer causes hangs. That helper (further down in this file) catches all exceptions, logs "Failed to send wantlist to peer", and returns. When called from this batch path, if host.new_stream or _write_message fails for one CID in the batch, the corresponding _pending_requests[cid].wait() below will block until trio.fail_after(timeout) cancels it — adding the full timeout (default 30s) to every per-batch failure.

Fix: propagate the failure from _send_wantlist_to_peer, or event.set() with a sentinel so the waiter can fail fast.

Comment thread libp2p/bitswap/dag_pb.py Outdated
# We manually construct the wire format to enforce the correct ordering.

# Add links
result = b""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: same O(n²) concat pattern as fetch_file reassembly. With 174 links on an internal node and large CIDs, each result += ... allocates a fresh bytes. Trivial fix: build into a bytearray and convert to bytes at return.

Comment thread libp2p/bitswap/chunker.py
yield chunk


def chunk_stream(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this new chunk_stream should also be added to the module's __all__ (further down in this file). Doesn't affect direct imports, but it breaks from chunker import * and confuses some IDE auto-import tools.

Comment thread libp2p/bitswap/dag.py Outdated
ordered_leaf_cids: list[bytes] = []

def _collect_leaves_local(cid_bytes: bytes, depth: int = 1) -> None:
"""Traverse locally-fetched blocks to collect leaf CIDs."""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: _collect_leaves_local is unbounded recursion. For a balanced 174-fanout DAG over a 100 MB file the depth is ~2, so this never trips Python's default 1000-frame limit in practice. But a maliciously crafted DAG (e.g. a chain of single-link nodes) would crash with RecursionError. Convert to an explicit stack if you want to harden this.

@Winter-Soren

Copy link
Copy Markdown
Contributor

PR #1321 AI Review

1. Summary of Changes

  • This PR introduces a large Bitswap/DAG feature set: batch wantlist fetching, BlockService, FilesystemBlockStore, stream ingestion (chunk_stream, add_stream), provider discovery manager, and Kubo-oriented DAG-PB/CID interoperability updates.
  • It also includes Kademlia record-signing changes, transport/test refactors, and multiple docs/newsfragment updates.
  • Related issue references present in branch history/newsfragments include #1321 and several prior issues merged in this branch lineage.
  • Affected architecture areas: libp2p.bitswap, libp2p.kad_dht, record/signature handling (libp2p.records), plus tests and docs.
  • Breaking change note is present in existing release notes/newsfragments around legacy CID helper migration (not this review's primary blocker, but user-facing and important).

2. Strengths

  • Strong feature depth for Bitswap interoperability and performance, with substantial test additions in tests/core/bitswap/.
  • Clear separation of responsibilities with new components (ProviderQueryManager, BlockService, Wantlist).
  • Good effort toward typed APIs and user-facing docs/newsfragments.
  • newsfragments/1321.feature.rst is present and formatted as expected.

3. Issues Found

Critical

  • None found with high confidence.

Major

  • File: libp2p/bitswap/provider_query.py

  • Line(s): Provider query path inside _query_single

  • Issue: ProviderQueryManager claims DHT provider discovery but calls self.dht.provider_store.get_providers(cid_bytes), which is a local provider-store read, not a network provider lookup. This can prevent remote provider discovery and silently degrade to fallback behavior.

  • Suggestion: Replace local-store read with the actual async DHT provider lookup path (KadDHT.find_providers(...)/equivalent network query API), then cache those remote results.

  • File: libp2p/kad_dht/value_store.py

  • Line(s): _store_at_peer

  • Issue: put() creates signed records (make_signed_put_record), but outbound PUT_VALUE RPC in _store_at_peer sets only record.key and record.value; signature/author fields are not propagated. This can break signed-record interoperability and weaken authenticity guarantees.

  • Suggestion: Build outbound RPC records from the signed record object (including signature/author fields) and add tests asserting these fields are present in serialized outbound Message.record.

Minor

  • No additional minor issues reported; review focused on behavior and correctness over cosmetic changes.

4. Security Review

  • Risk: Signed DHT record metadata not transmitted in outbound PUT_VALUE flow.
  • Impact: Medium (authenticity/interoperability degradation; possible rejection by stricter peers).
  • Mitigation: Ensure author and signature fields are serialized in outbound record messages and covered by unit/integration tests.

5. Documentation and Examples

  • Docs/newsfragment coverage is generally good for this PR scope.
  • However, provider discovery behavior documentation should match implementation: if discovery is advertised as DHT-network based, implementation/tests must prove network query behavior rather than local cache-only reads.

6. Newsfragment Requirement

  • Required newsfragment found: newsfragments/1321.feature.rst.
  • No blocker on newsfragment presence for PR #1321.

7. Tests and Validation

  • New tests are substantial, especially under tests/core/bitswap/.
  • Major test gap: current provider-query tests mostly mock local provider-store behavior and do not verify real DHT network provider lookup path.
  • Major test gap: no assertion that outbound PUT_VALUE includes signed record fields (author, signature).
  • Local validation results:
    • make pr failed in this environment due ruff issues under references/rust-libp2p-references/.../fix-unreachable-pub.py.
    • make linux-docs built docs but ended with environment-specific failure (xdg-open: No such file or directory).

8. Recommendations for Improvement

  • Switch provider discovery from local provider_store.get_providers reads to true DHT network querying and keep cache layer as an optimization.
  • Propagate full signed record fields in outbound DHT PUT_VALUE.
  • Add integration test proving provider discovery finds remote providers not already in local store.
  • Add unit test around _store_at_peer serialization ensuring signed fields are present.
  • Re-run CI checks in a clean Linux CI environment to separate repository issues from local tooling constraints.

9. Questions for the Author

  • Was ProviderQueryManager intentionally designed to read only local provider store state, or should it perform network DHT lookup?
  • Should signed PUT_VALUE interoperability with Kubo peers require transmitting author/signature on every outbound store operation?
  • Do you have an integration test scenario where a provider is discovered only via remote DHT query (not pre-populated local store)?

10. Overall Assessment

  • Quality Rating: Needs Work
  • Security Impact: Low to Medium
  • Merge Readiness: Needs fixes
  • Confidence: High

sumanjeet0012 and others added 5 commits May 6, 2026 23:19
… lookups and always send signed records.

Co-authored-by: Copilot <copilot@github.com>
- Added PaymentGatedDecisionEngine to handle payment-required logic for block serving.
- Introduced PaymentTerms, PaymentAuthorization, PaymentReceipt, and PaymentRejection messages in the new bitswap_1_3_0.proto.
- Enhanced existing MerkleDag class to store internal nodes with a callback for payment gating.
- Created BitswapPaymentClient_1_3 to manage client-side payment authorizations and receipts.
- Updated balanced_layout function to support payment gating and internal node storage.
- Added necessary protobuf definitions and generated Python files for Bitswap 1.3.0.
@acul71

acul71 commented May 17, 2026

Copy link
Copy Markdown
Contributor

Thanks @sumanjeet0012, full review here:

AI PR Review: #1321 — feat: Bitswap improvements for Kubo compatibility

PR: #1321
Branch: improvement/bitswap (26 commits ahead of origin/main, 0 behind at review time)
Reviewer: AI review (prompt: downloads/py-libp2p-pr-review-prompt.md)
Review version: 0


1. Summary of Changes

This PR is a large feature change centered on Bitswap / Merkle DAG interoperability with Kubo (go-ipfs) and supporting infrastructure:

Area What changed
DAG-PB / UnixFS Leaf blocks use dag-pb + UnixFS wrapping (create_leaf_node), balanced tree layout (balanced_layout, max 174 links), canonical wire encoding (Links before Data in encode_dag_pb).
Bitswap client Batch wantlist fetching, typed wantlist API (wantlist.py), optional ProviderQueryManager for DHT provider discovery in get_block().
Storage / caching FilesystemBlockStore, BlockService (local-first + auto-cache), chunk_stream / add_stream for io.IOBase inputs.
Kademlia / records Signed DHT put records, multi-key-type verify_record, outbound PUT_VALUE carries signature/author.
Tests / release Extensive new tests under tests/core/bitswap/ and tests/core/kad_dht/; newsfragments/1321.feature.rst.

Related issue: #1321 (same title/body as the PR; tracks Kubo compatibility work). The PR body does not include an explicit Fixes #1321 / Closes #1321 line — recommend adding one for auto-linking on merge.

Breaking / behavior changes (user-facing):

  • MerkleDag.add_file(..., wrap_with_directory=True) by default (was effectively file-root CID before).
  • MerkleDag.fetch_file() now returns tuple[bytes, str | None] (data + optional filename).
  • Default chunk size remains ~63 KiB − 32 B, not Kubo’s default 256 KiB — root CIDs differ from default ipfs add unless chunker is aligned.

Modules touched: libp2p.bitswap.*, libp2p.kad_dht.*, libp2p.records.*, libp2p.peer.envelope, examples/bitswap/bitswap.py.


2. Branch Sync Status and Merge Conflicts

Branch sync status

  • Status: ℹ️ Ahead of origin/main (no commits behind at review time).
  • Details: branch_sync_status.txt0 26 (0 commits on main not in HEAD; 26 commits on HEAD not in main).

Merge conflict analysis

No merge conflicts detected. Test merge of origin/main into the PR branch completed cleanly (merge_conflict_check.log).


3. Strengths

  • Root-cause analysis is correct: Raw leaf codec, flat DAG, and protobuf field ordering were real Kubo incompatibilities; the manual DAG-PB envelope (0x12 links then 0x0a data) is the right minimal fix.
  • Good layering: BlockService and FilesystemBlockStore separate concerns without forcing all callers to adopt them (MerkleDag(bitswap) still works).
  • Substantial automated tests: 238 bitswap tests pass locally; new coverage for BlockService, provider query manager, signed PUT_VALUE, multi-key verify_record, stream ingestion, and UnixFS layout.
  • Follow-up on prior review: Commits after May 6 address Winter-Soren’s provider-discovery and signed-record propagation findings (find_providers path, _store_at_peer uses message.record.CopyFrom(signed_record), multi-key verify_record).
  • Newsfragment present: newsfragments/1321.feature.rst with user-facing bullet list.
  • Local quality gates: make lint, make typecheck, and full make test all pass (2842 passed, 16 skipped).

4. Issues Found

Critical

None identified that would corrupt protocol state or introduce obvious remote code execution. CI docs job is failing (see §8) — treat as merge blocker until fixed.

Major

  • File: libp2p/bitswap/dag.py
  • Line(s): 202
  • Issue: Breaking API defaultwrap_with_directory=True changes the default CID returned by add_file(path) from file-root to directory-wrapper. Existing callers that do not pass wrap_with_directory=False get a different root CID and different fetch_file semantics. Reviewer @yashksaini-coder flagged this as blocking; it remains unresolved.
  • Suggestion: Default to False for backward compatibility, or document as breaking in newsfragments/1321.breaking.rst (and PR body) and add a migration note. All new tests pass wrap_with_directory=False, so regressions in default behavior are untested.

libp2p/bitswap/dag.py — lines 197–203

    async def add_file(
        self,
        file_path: str,
        chunk_size: int | None = None,
        progress_callback: Callable[[int, int, str], None] | None = None,
        wrap_with_directory: bool = True,
    ) -> bytes:

  • File: libp2p/bitswap/chunker.py, newsfragments/1321.feature.rst
  • Line(s): 16–16 (chunker); newsfragment line 4
  • Issue: “Kubo CID compatibility” is overstated for multi-chunk files. DEFAULT_CHUNK_SIZE = 63 * 1024 - 32 does not match Kubo’s default size-262144 (256 KiB). Leaf blocks can match Kubo (RawLeaves=false + dag-pb), but root CIDs for chunked files differ unless Kubo uses --chunker=size-65504. The newsfragment claims “identical CIDs to Kubo's ipfs add” without this caveat.
  • Suggestion: Narrow the newsfragment claim (e.g. “wire-compatible dag-pb layout; root CID matches Kubo when using the same chunk size”) or adopt/document Kubo’s 256 KiB default and enforce MAX_BLOCK_SIZE at the wire layer.

libp2p/bitswap/chunker.py — lines 13–16

# Default chunk size: 63 KB (py-libp2p accepts less than 64 KB)
# 63 KB minus 32 bytes to leave room for the dag-pb leaf envelope overhead,
# ensuring wrapped blocks never exceed MAX_BLOCK_SIZE (63 * 1024).
DEFAULT_CHUNK_SIZE = 63 * 1024 - 32

  • File: libp2p/records/utils.py
  • Line(s): 73–76 (docstring)
  • Issue: Documentation build failure — Sphinx/docutils error on verify_record docstring (Unexpected indentation / block-quote warning). CI: tox (3.10, docs) and Read the Docs fail on this branch.
  • Suggestion: Fix the author_public_key parameter description (avoid indented parenthetical under Args; use a single line or blank line before continuation). Re-run make linux-docs.

  • File: PR description / GitHub linking
  • Line(s): N/A
  • Issue: No explicit Fixes #1321 (or Closes #1321) in the PR body. Issue feat: Bitswap improvements for Kubo compatibility #1321 exists and matches scope; newsfragment uses 1321.feature.rst.
  • Suggestion: Add Fixes #1321 to the PR description for maintainer workflow and auto-close.

Minor

  • File: libp2p/bitswap/dag.py
  • Line(s): 813
  • Issue: O(n²) reassemblyfile_data += chunk in fetch_file allocates repeatedly; significant cost for large multi-chunk files (~1626 chunks × 63 KiB).
  • Suggestion: Use bytearray or list + b"".join().

libp2p/bitswap/dag.py — lines 809–814

            else:
                chunk = leaf_raw
                logger.debug(f"[DAG] Leaf {idx + 1}: raw block {len(chunk)} bytes")

            file_data += chunk
            bytes_fetched += len(chunk)

  • File: libp2p/bitswap/dag_pb.py
  • Line(s): 125–149
  • Issue: Same O(n²) pattern in encode_dag_pb (result += ...).
  • Suggestion: Build with bytearray or pre-sized buffer.

  • File: libp2p/bitswap/client.py
  • Line(s): 522–523
  • Issue: _send_wantlist_to_peer swallows all exceptions after logging. In batch get_blocks_batch, a failed wantlist send can leave waiters blocked until the full per-batch timeout.
  • Suggestion: Propagate failure, set pending events with error, or fail fast.

libp2p/bitswap/client.py — lines 522–523

        except Exception as e:
            logger.error(f"Failed to send wantlist to peer {peer_id}: {e}")

  • File: libp2p/bitswap/dag.py
  • Line(s): 169–181
  • Issue: Test/mock workaround in productiongetattr + bare except Exception: pass around get_blocks_batch masks real failures.
  • Suggestion: Type bitswap with a Protocol for batch fetch; remove runtime probing.

  • File: libp2p/bitswap/block_store.py
  • Line(s): 168–174
  • Issue: Non-atomic put_block — crash mid-write_bytes can leave truncated block files.
  • Suggestion: Write to .tmp then os.replace.

  • File: tests (missing file named in PR description)
  • Line(s): N/A
  • Issue: PR description lists test_canonical_dag_pb.py for wire ordering; no such file exists. test_dag_pb.py round-trips encode/decode but does not assert that link tags (0x12) precede data tags (0x0a) on the wire.
  • Suggestion: Add an explicit assertion (e.g. first field tag is 0x12 when links present) to lock the Kubo compatibility fix.

  • File: libp2p/bitswap/chunker.py
  • Line(s): __all__ (module end)
  • Issue: chunk_stream not exported in __all__ (nit from reviewer).
  • Suggestion: Add "chunk_stream" to __all__.

  • File: PR checklist
  • Line(s): To-Do in PR body
  • Issue: Author TODOs still open: clean commit history, update user-facing docs beyond newsfragment, release notes entry (partially done via newsfragment).
  • Suggestion: Complete or tick before merge.

Maintainer feedback status

Reviewer Topic Status
@yashksaini-coder Ed25519-only verify_record ✅ Fixed (_unmarshal_public_key + tests)
@yashksaini-coder Provider local-store vs network ✅ Fixed (provider_store.find_providers in _query_single; tests assert find_providers called)
@yashksaini-coder Signed PUT outbound ✅ Fixed (_store_at_peer + unit tests)
@yashksaini-coder wrap_with_directory=True default Open
@yashksaini-coder Chunk size vs Kubo default Open (undocumented in newsfragment)
@yashksaini-coder Perf / atomic writes / wantlist errors Open (acceptable post-merge per reviewer, except docs CI)
Winter-Soren (AI) Provider + signed PUT ✅ Addressed in later commits

5. Security Review

Risk Impact Mitigation
Signed DHT records now verified with multi-key support Low (improvement) Keep tests for tampered payloads and invalid protobuf authors.
verify_record returns False on any exception (broad except) Low Acceptable for verification API; log at debug if diagnosing interop failures.
Truncated blocks on disk after crash Low–Medium (integrity) Atomic write + optional CID re-hash on read for production stores.
Malicious deep/single-link DAG causing RecursionError in _collect_leaves_local Low Use iterative traversal for untrusted CIDs.
No new obvious RCE or auth bypass in Bitswap message parsing Continue relying on block size limits and existing stream handling.

Overall security impact: Low (incremental hardening on DHT signing; no new high-risk surfaces identified).


6. Documentation and Examples

Present:

  • Module/docstrings on new APIs (BlockService, FilesystemBlockStore, ProviderQueryManager, verify_record).
  • Updated examples/bitswap/bitswap.py.
  • newsfragments/1321.feature.rst.

Gaps:

  • Newsfragment overclaims full ipfs add CID parity (chunk size caveat missing).
  • No dedicated user guide for Kubo interop (chunk size, wrap_with_directory, BlockService wiring) — PR TODO acknowledges this.
  • Sphinx build broken on verify_record docstring (blocks docs CI).
  • PR-listed test_canonical_dag_pb.py documentation does not match repo (see §4 Minor).

7. Newsfragment Requirement

Check Result
File exists newsfragments/1321.feature.rst
Naming 1321.feature.rst
User-facing content ✅ Bullet list
Newline at EOF ✅ (assumed from lint pass)
Type appropriateness ⚠️ Consider .breaking.rst if wrap_with_directory=True default is kept
Issue link in PR ⚠️ Issue #1321 exists; add Fixes #1321 to PR body

Not a blocker on newsfragment presence alone; blocker if breaking default is shipped without .breaking.rst or default revert.


8. Tests and Validation

Linting (make lint)

  • Exit code: 0
  • Result: All hooks passed (ruff, mypy-local, pyrefly-local, mdformat, etc.).
  • Log: downloads/AI-PR-REVIEWS/1321/lint_output.log

Type checking (make typecheck)

  • Exit code: 0
  • Result: mypy + pyrefly passed.
  • Log: downloads/AI-PR-REVIEWS/1321/typecheck_output.log

Tests (make test)

  • Exit code: 0
  • Summary: 2842 passed, 16 skipped, ~118 s
  • Bitswap subset: 238 passed in ~8.4 s
  • Log: downloads/AI-PR-REVIEWS/1321/test_output.log

Documentation (make linux-docs)

  • Exit code: 2 (failed)
  • Errors:
    • libp2p/records/utils.py:docstring of libp2p.records.utils.verify_record:12: ERROR: Unexpected indentation.
    • libp2p/records/utils.py:docstring ...:13: WARNING: Block quote ends without a blank line
  • Sphinx treats warnings as errors (-W).
  • Log: downloads/AI-PR-REVIEWS/1321/docs_output.log

CI (GitHub Actions, PR head)

Check Status
tox (3.10–3.13, core) ✅ pass
tox (*, lint) ✅ pass
tox (3.10, docs) fail
Read the Docs fail
Windows core ✅ pass

9. Recommendations for Improvement

  1. Unblock CI: Fix verify_record docstring RST formatting; re-run docs tox.
  2. Resolve reviewer blockers: Revert wrap_with_directory default to False or ship .breaking.rst + migration text; document chunk-size vs Kubo in newsfragment and chunker.py module docstring.
  3. Performance (quick wins): bytearray in fetch_file and encode_dag_pb; atomic writes in FilesystemBlockStore.
  4. Reliability: Propagate or surface errors from _send_wantlist_to_peer; remove MagicMock-oriented fallback in _get_blocks_batch.
  5. Tests: Add explicit canonical wire-order test (0x12 before 0x0a); optional integration test with Kubo/ipfs add for same chunker.
  6. Process: Add Fixes #1321 to PR body; squash/clean 26 commits if maintainers require linear history.

10. Questions for the Author

  1. Is wrap_with_directory=True as default intentional for IPFS parity, and are you willing to call it breaking in release notes?
  2. Should the project standardize on 63 KiB chunks permanently, or align with Kubo’s 256 KiB default over time?
  3. For ProviderQueryManager, is provider_store.find_providers (network DHT walk) sufficient, or should callers use KadDHT.find_providers at the host API level for consistency?
  4. Do you plan Kubo interop integration tests (dockerized ipfs add / ipfs cat) in a follow-up PR?
  5. Will you fix the Sphinx docstring in this PR or a fast follow-up?

11. Overall Assessment

Metric Rating
Quality Good — strong technical core; remaining issues are mostly API contract, docs CI, and performance hygiene
Security impact Low
Merge readiness Needs fixes — docs CI failure + unresolved blocking review on wrap_with_directory default and CID/chunk-size messaging
Confidence High (full local test run + code review on PR branch; CI docs failure reproduced locally)

Summary: This PR materially fixes real Kubo Bitswap/DHT interoperability gaps and adds valuable infrastructure (BlockService, persistent store, batch wants). Earlier major review findings (signed PUT propagation, network provider lookup, multi-key verification) appear addressed. Before merge: fix docs build, resolve wrap_with_directory default / breaking disclosure, and correct newsfragment claims about chunk size and root CID parity with default Kubo ipfs add.

- Added `payment_client_1_3.py` for handling in-band payment messages, including:
  - Processing PaymentTerms and sending PaymentAuthorization.
  - Handling PaymentReceipts and PaymentRejections.
  - Validating payment amounts and signing EIP-3009 transactions.

- Introduced `payment_ledger.py` for tracking payments at the root CID level:
  - Supports registration of DAG structures for child blocks.
  - Implements nonce deduplication to prevent replay attacks.
  - Provides methods to check payment status and record payments.

- Updated protobuf definitions in `bitswap_1_3_0_pb2.pyi` to reflect new payment structures:
  - Added PaymentTerms, PaymentAuthorization, and PaymentReceipt messages.
  - Introduced TxReceipt for transaction details.

- Created `pricing_engine.py` to compute block pricing based on configurable strategies:
  - Supports free, fixed, size-based, and custom pricing strategies.
  - Allows marking specific CIDs as free and setting per-CID prices.
- Added PaymentExtension class to handle payment-related protobuf fields and wantlists in Bitswap 1.3.0.
- Integrated payment terms, receipts, and rejections processing for client-side and server-side.
- Enhanced PaymentLedger to track root CID payments and manage payment records.
- Updated pricing engine to support configurable pricing strategies.
- Refactored tests to accommodate changes in block storage and encoding, ensuring raw blocks are used for leaves.
- Improved type hints and documentation across the codebase for better clarity and maintainability.
@acul71

acul71 commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Thanks @sumanjeet0012, for the fix.

Overall Assessment

Metric Rating
Quality Good
Security impact Low–Medium
Merge readiness Needs maintainer sign-off
Confidence High

HEAD: 50d3c26a — 39 commits ahead of main, no merge conflicts. All CI jobs pass (lint, core, docs, interop, Windows, Read the Docs).

What the PR Does Well

The Kubo interoperability fixes are sound on the current branch:

  • Raw leaf blocks + 256 KiB chunks (Kubo defaults)
  • Balanced DAG layout (max 174 links)
  • Canonical DAG-PB wire encoding (Links before Data)
  • Signed DHT records with multi-key verification
  • Batch wantlist fetching, BlockService, FilesystemBlockStore, ProviderQueryManager
  • Strong test additions (238 bitswap tests pass locally)

Earlier review findings (provider network lookup, signed PUT_VALUE propagation, Ed25519-only verification, docs CI) are addressed.

Open Blockers / Major Issues

  1. wrap_with_directory=True default@yashksaini-coder flagged this as blocking. Existing callers get a directory CID instead of a file CID. All tests use wrap_with_directory=False. Either revert the default or add 1321.breaking.rst.

  2. Bitswap 1.3.0 Payment Extension (~1,400 lines) has zero tests — includes auto-pay up to $1.00 USDC. Needs tests or a follow-up PR.

  3. payment_client_1_3.py.backup committed — 455-line duplicate with divergent payment defaults; should be removed.

  4. Stale PR description — still describes dag-pb leaves and 63 KiB chunks; lists examples/bitswap_payment_example.py and test_canonical_dag_pb.py, which don't exist.

  5. Duplicate newsfragments — both 1321.feature.rst and 1347.feature.rst cover overlapping Kubo interop work.

Validation Results (this run)

Check Result
make lint ✅ pass
make typecheck ✅ pass
make linux-docs ✅ pass
make test (local) 2875 passed, 1 failed (test_invalid_dns — local DNS sinkhole; CI passes)
Bitswap tests ✅ 238 passed

Recommended Before Merge

  1. Resolve wrap_with_directory default or document as breaking
  2. Add payment tests or split payment into a follow-up
  3. Remove payment_client_1_3.py.backup
  4. Update PR body + add Fixes #1321
  5. Consolidate feat: Bitswap improvements for Kubo compatibility #1321 / Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT #1347 newsfragments

Post-merge hygiene (reviewer-accepted): bytearray in file reassembly, atomic block writes, propagate wantlist send errors, remove print() debug output in fetch_file.

FULL REVIEW

AI PR Review: #1321 — feat: Bitswap improvements for Kubo compatibility

PR: #1321
Branch: improvement/bitswap (39 commits ahead of origin/main, 0 behind)
Reviewer: AI review (prompt: downloads/py-libp2p-pr-review-prompt.md)
Review version: 3
HEAD: 50d3c26a — fix: lint issues
CI:Run tox #27506869200 — all jobs pass on 50d3c26a


1. Summary of Changes

This PR is a large feature change centered on Bitswap / Merkle DAG interoperability with Kubo (go-ipfs) and supporting infrastructure:

Area What changed
DAG / UnixFS Raw leaf blocks (CODEC_RAW, Kubo RawLeaves=true default for multi-chunk files), balanced tree layout (balanced_layout, max 174 links), canonical DAG-PB wire encoding (Links before Data in encode_dag_pb).
Chunk size DEFAULT_CHUNK_SIZE = 256 * 1024 — matches Kubo's default size-262144.
Bitswap client Batch wantlist fetching (get_blocks_batch), typed wantlist API (wantlist.py), ProviderQueryManager, IBitswapExtension hook, Bitswap 1.3.0 Payment Extension (~1,400+ lines across payment client, ledger, pricing engine, gated decision engine, extension).
Storage / caching FilesystemBlockStore, BlockService (local-first + auto-cache).
Streaming chunk_stream / add_stream for io.IOBase inputs.
Kademlia / records Signed DHT put records, multi-key-type verify_record, outbound PUT_VALUE carries signature/author via CopyFrom(signed_record).
Scope creep Yamux SYN+RST handling fix and QUIC transport logging fix (unrelated to Bitswap).

Related issues:

The PR body does not include Fixes #1321 — recommend adding it for auto-close on merge.

Breaking / behavior changes (user-facing):

  • MerkleDag.add_file(..., wrap_with_directory=True) by default — callers get a directory wrapper CID unless they pass wrap_with_directory=False.
  • MerkleDag.fetch_file() returns tuple[bytes, str | None] instead of bare bytes.

PR description vs. implementation: The PR body still describes dag-pb-wrapped leaves (RawLeaves=false) and an outdated 63 KiB chunk size. The code uses raw leaves and 256 KiB chunks — the correct Kubo defaults for ipfs add parity.

Modules touched: libp2p.bitswap.*, libp2p.kad_dht.*, libp2p.records.*, libp2p.peer.envelope, examples/bitswap/, plus incidental yamux.py and quic/transport.py.


2. Branch Sync Status and Merge Conflicts

Branch Sync Status

  • Status: ℹ️ Ahead of origin/main (0 behind).
  • Details: branch_sync_status.txt0 39.

Merge Conflict Analysis

No merge conflicts detected. Test merge of origin/main into the PR branch completed cleanly (merge_conflict_check.log).


3. Strengths

  • Root-cause fixes are sound: flat DAG layout, protobuf field ordering, and missing DHT signing were real Kubo incompatibilities; manual DAG-PB envelope (0x12 links then 0x0a data) is the correct minimal fix.
  • Kubo alignment on current HEAD: 256 KiB chunks and raw-leaf encoding match Kubo defaults; balanced internal nodes use dag-pb with canonical wire ordering.
  • Good layering: BlockService and FilesystemBlockStore are optional; MerkleDag(bitswap) still works without them.
  • Substantial automated tests: 238 bitswap tests pass locally; new coverage for BlockService, provider query manager, signed PUT_VALUE, multi-key verify_record, stream ingestion, and UnixFS/raw-leaf layout.
  • Prior review findings addressed: network provider lookup (provider_store.find_providers), signed PUT_VALUE propagation, multi-key verify_record, docs/lint CI (commit 50d3c26a).
  • All CI checks green on current HEAD (lint, core, docs, interop, Windows, Read the Docs).

4. Issues Found

Critical

None identified on commit 50d3c26a. All CI jobs pass.

Major

  • File: libp2p/bitswap/dag.py
  • Line(s): 202
  • Issue: Breaking API default still unresolvedwrap_with_directory=True changes the default CID from file-root to directory-wrapper. Reviewer @yashksaini-coder flagged this as blocking; it remains open. All new tests pass wrap_with_directory=False, so default behavior is untested.
  • Suggestion: Default to False for backward compatibility, or add newsfragments/1321.breaking.rst with a migration note.

libp2p/bitswap/dag.py — lines 197–203

    async def add_file(
        self,
        file_path: str,
        chunk_size: int | None = None,
        progress_callback: Callable[[int, int, str], None] | None = None,
        wrap_with_directory: bool = True,
    ) -> bytes:

  • File: libp2p/bitswap/payment_extension.py, payment_client_1_3.py, payment_ledger.py, pricing_engine.py, gated_decision_engine.py
  • Line(s): module-wide
  • Issue: Bitswap 1.3.0 Payment Extension has no automated tests. ~1,400+ lines covering payment-gated serving, EIP-3009 auto-pay (default up to $1.00 USDC), ledger/nonce tracking — zero tests under tests/.
  • Suggestion: Add unit tests for pricing, nonce replay rejection, and authorization bounds; or defer payment to a follow-up PR.

  • File: libp2p/bitswap/payment_client_1_3.py.backup
  • Line(s): full file (455 lines)
  • Issue: Backup source file committed to the repository. This is not referenced by any import and duplicates payment_client_1_3.py with divergent defaults (DEFAULT_MAX_AUTO_PAY_UNITS = 1000 vs live file's 1000000).
  • Suggestion: Remove the .backup file before merge.

  • File: PR description
  • Line(s): N/A
  • Issue: PR body is stale — still describes dag-pb-wrapped leaves (RawLeaves=false) and 63 KiB chunks; code uses raw leaves and 256 KiB chunks. Also lists examples/bitswap_payment_example.py and test_canonical_dag_pb.py, neither of which exist in the branch.
  • Suggestion: Update the PR description to match the implemented strategy and actual file list.


  • File: PR description
  • Line(s): N/A
  • Issue: No explicit Fixes #1321 in the PR body.
  • Suggestion: Add Fixes #1321 for maintainer workflow and auto-close.

Minor

  • File: libp2p/bitswap/dag.py — lines 841 — O(n²) file_data += chunk in fetch_file; use bytearray.
  • File: libp2p/bitswap/dag_pb.py — lines 135–162 — same O(n²) pattern in encode_dag_pb.
  • File: libp2p/bitswap/client.py — lines 552–553 — _send_wantlist_to_peer swallows exceptions; batch waiters can hang until timeout.
  • File: libp2p/bitswap/dag.py — lines 169–194 — getattr + bare except Exception: pass around get_blocks_batch (mock workaround in production).
  • File: libp2p/bitswap/block_store.py — lines 168–174 — non-atomic put_block writes.
  • File: tests — PR lists test_canonical_dag_pb.py; file does not exist; test_dag_pb.py round-trips encode/decode but does not assert link tags (0x12) precede data tags (0x0a) on the wire.
  • File: libp2p/bitswap/chunker.py__all__chunk_stream missing from exports.
  • File: libp2p/bitswap/dag.py — lines 669–871 — print(..., flush=True) debug output in fetch_file (project convention prefers logger).
  • File: yamux.py, quic/transport.py — unrelated changes bundled into Bitswap PR.
  • File: libp2p/bitswap/payment_client_1_3.py — line 23 vs 37 — comment says $0.001 USDC default but max_auto_pay_usdc: float = 1.0 ($1.00); confusing for operators.

Maintainer feedback status

Reviewer Topic Status
@yashksaini-coder Ed25519-only verify_record ✅ Fixed
@yashksaini-coder Provider local-store vs network ✅ Fixed
@yashksaini-coder Signed PUT outbound ✅ Fixed
@yashksaini-coder Chunk size vs Kubo default ✅ Fixed (256 KiB)
@yashksaini-coder Docs / lint CI ✅ Fixed (50d3c26a)
@yashksaini-coder wrap_with_directory=True default Open
@yashksaini-coder Perf / atomic writes / wantlist errors ❌ Open (reviewer: post-merge OK)
Winter-Soren (AI) Provider + signed PUT ✅ Addressed
@acul71 Prior AI review (v0) Addressed in subsequent commits

5. Security Review

Risk Impact Mitigation
Payment auto-pay up to $1.00 USDC without tests Medium Add tests; require explicit opt-in
EIP-3009 signer typed as Any Low–Medium Type the signer interface
Truncated blocks on disk (non-atomic writes) Low–Medium Atomic write + optional re-hash on read
Malicious DAG → RecursionError in _collect_leaves_local Low Iterative traversal for untrusted CIDs
Signed DHT records with multi-key verification Low (improvement) Keep tamper tests
.backup file with divergent payment defaults Low Remove before merge to avoid confusion

Overall security impact: Low–Medium (payment extension adds financial surface without test coverage).


6. Documentation and Examples

Present: docstrings on new APIs, updated examples/bitswap/bitswap.py, newsfragments, Sphinx build passes.

Gaps:

  • Stale PR body (encoding strategy, chunk size, file list).
  • No Kubo interop user guide (chunk size, wrap_with_directory, BlockService wiring) — issue feat: Bitswap improvements for Kubo compatibility #1321 To-Do "Add or update documentation" still unchecked.
  • examples/bitswap_payment_example.py listed in PR description but not present in the branch.
  • test_canonical_dag_pb.py listed in PR description but not present.

7. Newsfragment Requirement

Check Result
File exists 1321.feature.rst (+ 1347.feature.rst)
Naming
User-facing content
Type appropriateness ⚠️ Consider .breaking.rst if wrap_with_directory=True kept
Issue link in PR ⚠️ Add Fixes #1321
Duplicate fragments ⚠️ #1321 / #1347 overlap
Content accuracy ⚠️ #1347 mentions dag-pb leaf encoding; implementation uses raw leaves

Not a blocker on newsfragment presence alone; blocker if breaking default is shipped without .breaking.rst or default revert.


8. Tests and Validation

Linting (make lint)

  • Exit code: 0 ✅
  • Result: All pre-commit hooks passed (ruff, mypy-local, pyrefly-local, mdformat, etc.).
  • Log: downloads/AI-PR-REVIEWS/1321/lint_output.log

Type checking (make typecheck)

  • Exit code: 0 ✅
  • Log: downloads/AI-PR-REVIEWS/1321/typecheck_output.log

Tests (make test)

Authoritative: CI (run #27506869200)

Environment Result
tox (3.10–3.13, core) 2625 passed, 12 skipped
tox (3.10–3.13, interop) ✅ pass
Windows (3.11–3.13, core) ✅ pass
Bitswap subset (local) 238 passed in ~8.9 s

Local make test: 2875 passed, 16 skipped, 1 failed in 114 s.

Test Result Notes
tests/core/network/test_connection_management.py::TestExtractIpFromMultiaddr::test_invalid_dns ❌ FAILED Local DNS sinkhole resolves this-domain-does-not-exist-12345.com to 127.0.0.1. CI passes — environment artifact, not a PR regression.

Log: downloads/AI-PR-REVIEWS/1321/test_output.log

Documentation (make linux-docs)

  • Exit code: 0 ✅
  • Log: downloads/AI-PR-REVIEWS/1321/docs_output.log

CI summary (HEAD 50d3c26a)

Check Status
tox (*, lint) ✅ pass
tox (3.10, docs) ✅ pass
tox (3.10–3.13, core) ✅ pass
tox (3.10, interop) ✅ pass
Read the Docs ✅ pass
Windows core ✅ pass

9. Recommendations for Improvement

  1. Resolve wrap_with_directory default — revert to False or ship 1321.breaking.rst.
  2. Add payment extension tests or split payment into a follow-up PR.
  3. Remove payment_client_1_3.py.backup from the branch.
  4. Update PR description (raw leaves, 256 KiB, actual file list); add Fixes #1321.
  5. Consolidate newsfragments for feat: Bitswap improvements for Kubo compatibility #1321 vs Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT #1347; align wording with raw-leaf encoding.
  6. Performance: bytearray in fetch_file / encode_dag_pb; atomic writes in FilesystemBlockStore.
  7. Reliability: propagate errors from _send_wantlist_to_peer; remove mock fallback in _get_blocks_batch.
  8. Tests: add explicit wire-order assertion (0x12 before 0x0a).
  9. Hygiene: remove print(..., flush=True) from fetch_file; split Yamux/QUIC fixes; add missing payment example or remove from PR description.

10. Questions for the Author

  1. Is wrap_with_directory=True intentional for IPFS parity, and will you document it as breaking?
  2. Should feat: Bitswap improvements for Kubo compatibility #1321 and Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT #1347 both keep newsfragments, or should one subsume the other?
  3. Is the payment extension ready to merge without tests, or should it be a follow-up?
  4. Was payment_client_1_3.py.backup left in intentionally?
  5. Were Yamux/QUIC changes required for Bitswap interop, or can they be split out?

11. Overall Assessment

Metric Rating
Quality Good — strong Kubo interoperability core; API-default, payment-test, and hygiene gaps remain
Security impact Low–Medium
Merge readiness Needs maintainer sign-off — CI is fully green; @yashksaini-coder's wrap_with_directory blocker and payment test gap are the main open items
Confidence High (synced to 50d3c26a; CI + local lint/docs/bitswap tests verified)

Summary: CI/CD is fully green on the current PR head (50d3c26a). The Kubo interoperability work is technically sound: raw leaves, 256 KiB chunks, balanced layout, canonical DAG-PB encoding, signed DHT records, and batch fetching address the real root causes. Substantive remaining concerns are the reviewer-flagged breaking default (wrap_with_directory=True), untested payment extension, committed backup file, and process/docs hygiene (stale PR body, duplicate newsfragments, missing files listed in description). None of these are CI failures.

@seetadev seetadev self-requested a review June 15, 2026 05:07
@acul71

acul71 commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Good work @sumanjeet0012 , thanks !

Recommendation
CI is green and the Kubo interoperability work is technically sound. Before merge: remove the committed binary, fix the newsfragment and PR description, and either add payment client/extension tests or split payment into a follow-up PR.

Full Review HERE:

AI PR Review: #1321 — feat: Bitswap improvements for Kubo compatibility

PR: #1321
Branch: improvement/bitswap
Reviewer: AI review (prompt: downloads/py-libp2p-pr-review-prompt.md)
Review version: 4
HEAD: 622e61f7 — Merge branch 'main' into improvement/bitswap
CI:Run tox #27535093124 — all jobs pass on 622e61f7


1. Summary of Changes

This PR is a large feature change centered on Bitswap / Merkle DAG interoperability with Kubo (go-ipfs) and supporting infrastructure.

Area What changed
DAG / UnixFS Raw leaf blocks (CODEC_RAW, Kubo RawLeaves=true default for multi-chunk files), balanced tree layout (balanced_layout, max 174 links), canonical DAG-PB wire encoding (Links before Data in encode_dag_pb).
Chunk size DEFAULT_CHUNK_SIZE = 256 * 1024 — matches Kubo's default size-262144.
Bitswap client Batch wantlist fetching (get_blocks_batch), typed wantlist API (wantlist.py), ProviderQueryManager, IBitswapExtension hook, Bitswap 1.3.0 Payment Extension (~1,400+ lines).
Storage / caching FilesystemBlockStore (with atomic writes), BlockService (local-first + auto-cache).
Streaming chunk_stream / add_stream for io.IOBase inputs.
Kademlia / records Signed DHT put records, multi-key-type verify_record, outbound PUT_VALUE carries signature/author via CopyFrom(signed_record).
Scope creep Yamux SYN+RST handling fix, QUIC transport logging fix, and a committed interop/bitswap/go_peer/go_peer Mach-O binary.

Related issues:

Breaking / behavior changes (user-facing):

  • MerkleDag.fetch_file() returns tuple[bytes, str | None] instead of bare bytes.
  • Default chunk size changes from ~63 KiB to 256 KiB for callers relying on implicit defaults — root CIDs will differ from pre-PR behavior.

PR description vs. implementation: The PR body still describes dag-pb-wrapped leaves (RawLeaves=false) and an outdated 63 KiB chunk size. The code uses raw leaves and 256 KiB chunks — the correct Kubo defaults for ipfs add parity.

Modules touched: libp2p.bitswap.*, libp2p.kad_dht.*, libp2p.records.*, libp2p.peer.envelope, examples/bitswap/, plus incidental yamux.py, quic/transport.py, and interop/bitswap/go_peer/.


2. Branch Sync Status and Merge Conflicts

Branch Sync Status

  • Status: ℹ️ Ahead of origin/main (0 behind).
  • Details: branch_sync_status.txt0 43 (0 commits on main not in HEAD; 43 commits on HEAD not in main).

Merge Conflict Analysis

No merge conflicts detected. Test merge of origin/main into the PR branch completed cleanly (merge_conflict_check.log).


3. Strengths

  • Root-cause fixes are sound: flat DAG layout, protobuf field ordering, and missing DHT signing were real Kubo incompatibilities; manual DAG-PB envelope (0x12 links then 0x0a data) is the correct minimal fix.
  • Kubo alignment on current HEAD: 256 KiB chunks and raw-leaf encoding match Kubo defaults; balanced internal nodes use dag-pb with canonical wire ordering.
  • Good layering: BlockService and FilesystemBlockStore are optional; MerkleDag(bitswap) still works without them.
  • Substantial automated tests: 245 bitswap tests pass locally; new coverage for BlockService, provider query manager, signed PUT_VALUE, multi-key verify_record, stream ingestion, UnixFS/raw-leaf layout, and payment pricing/ledger/gating.
  • Prior review findings addressed: network provider lookup (provider_store.find_providers), signed PUT_VALUE propagation, multi-key verify_record, docs/lint CI, wrap_with_directory=False default, atomic block writes, batch wantlist fail-fast, bytearray reassembly/encoding.
  • All CI checks green on current HEAD (lint, core, docs, interop, Windows, Read the Docs).

4. Issues Found

Critical

None identified on commit 622e61f7. All CI jobs pass.

Major

  • File: interop/bitswap/go_peer/go_peer
  • Line(s): full file (~43 MB)
  • Issue: Committed pre-built Mach-O arm64 binary with no references in the codebase (grep finds zero usages). It is not portable to Linux CI, bloats the repository, and is difficult to audit. Binaries in source control are a supply-chain and maintenance risk.
  • Suggestion: Remove the binary from the PR. If interop testing needs a Go peer, add a Makefile/build script and document how to compile it, or rely on the existing interop tox job.

  • File: newsfragments/1347.feature.rst
  • Line(s): 1
  • Issue: Newsfragment wording does not match implementation. It claims "UnixFS DAG-PB encoding" for leaves, but multi-chunk add_file/add_bytes store raw leaf blocks (CODEC_RAW, Kubo RawLeaves=true). Internal nodes use dag-pb; leaves do not.
  • Suggestion: Update the newsfragment to describe raw-leaf + balanced-layout + canonical internal-node encoding, not dag-pb leaf wrapping.

newsfragments/1347.feature.rst — line 1

Implement comprehensive Bitswap interoperability with IPFS Kubo, including UnixFS DAG-PB encoding and balanced layout support. Introduces ``FilesystemBlockStore`` and ``BlockService`` for robust block caching, Bitswap batch fetching, and streaming inputs (``chunk_stream``).

  • File: libp2p/bitswap/payment_client_1_3.py, payment_extension.py
  • Line(s): module-wide
  • Issue: Payment client and extension have no automated tests. tests/core/bitswap/test_payment.py covers BlockPricingEngine, PaymentLedger, and PaymentGatedDecisionEngine only. BitswapPaymentClient_1_3 and PaymentExtension (~600+ lines combined) have zero test references.
  • Suggestion: Add unit tests for auto-pay threshold enforcement, EIP-3009 message round-trips, and extension message handling; or defer payment client/extension to a follow-up PR.

  • File: PR description
  • Line(s): N/A
  • Issue: PR body is stale — still describes dag-pb-wrapped leaves (RawLeaves=false) and 63 KiB chunks; lists files (test_canonical_dag_pb.py, examples/bitswap_payment_example.py) that do not exist on the branch.
  • Suggestion: Rewrite the PR description to match the implemented raw-leaf + 256 KiB strategy and the actual file list.

Minor

  • File: libp2p/bitswap/client.py
  • Line(s): 759–763, 880, 900, 914
  • Issue: print(..., flush=True) debug output in production message-handling paths. Project convention prefers logger (see #1207 internal change).
  • Suggestion: Replace with logger.debug / logger.info.

libp2p/bitswap/client.py — lines 759–763

            print(
                f"\n📥 RECEIVED WANTLIST from peer {peer_id_str} with "
                f"{len(msg.wantlist.entries)} entries",
                flush=True,
            )

  • File: tests (missing assertion)
  • Line(s): N/A
  • Issue: No test asserts canonical wire field ordering (0x12 link tags before 0x0a data tags). encode_dag_pb implements this correctly via bytearray, but the fix is not locked by an explicit regression test.
  • Suggestion: Add a small unit test in test_dag_pb.py asserting first field tag is 0x12 when links are present.

  • File: libp2p/stream_muxer/yamux/yamux.py, libp2p/transport/quic/transport.py
  • Line(s): various
  • Issue: Unrelated transport fixes bundled into a Bitswap-focused PR, increasing review surface.
  • Suggestion: Split into a separate PR if not required for Bitswap interop.

Maintainer feedback status

Reviewer Topic Status
@yashksaini-coder Ed25519-only verify_record ✅ Fixed
@yashksaini-coder Provider local-store vs network ✅ Fixed
@yashksaini-coder Signed PUT outbound ✅ Fixed
@yashksaini-coder Chunk size vs Kubo default ✅ Fixed (256 KiB)
@yashksaini-coder Docs / lint CI ✅ Fixed
@yashksaini-coder wrap_with_directory=True default ✅ Fixed (False on HEAD)
@yashksaini-coder Perf / atomic writes / wantlist errors ✅ Fixed on HEAD
Winter-Soren (AI) Provider + signed PUT ✅ Addressed
Prior AI reviews (v0–v3) Payment tests, backup file, stale PR body ⚠️ Partially addressed

5. Security Review

Risk Impact Mitigation
Committed 43 MB Mach-O binary (go_peer) Medium Remove binary; build from source in CI or document manual build
Payment auto-pay without client/extension tests Low–Medium Add tests for threshold enforcement and nonce replay on client path
EIP-3009 signer typed as Any Low Type the signer interface
Signed DHT records with multi-key verification Low (improvement) Keep tamper tests
Malicious DAG → RecursionError in _collect_leaves_local Low Iterative traversal for untrusted CIDs (post-merge)

Overall security impact: Low–Medium (primarily from the committed binary and untested payment client path).


6. Documentation and Examples

Present: docstrings on new APIs, updated examples/bitswap/bitswap.py, newsfragments/1347.feature.rst, Sphinx build passes.

Gaps:

  • Stale PR body (encoding strategy, chunk size, file list).
  • Newsfragment overclaims dag-pb leaf encoding.
  • No Kubo interop user guide (chunk size, wrap_with_directory, BlockService wiring).
  • examples/bitswap_payment_example.py listed in PR description but not present.

7. Newsfragment Requirement

Check Result
File exists newsfragments/1347.feature.rst
Naming 1347.feature.rst matches linked issue
User-facing content ✅ Bullet-style ReST
Newline at EOF ✅ (lint passes)
Issue link in PR Fixes #1347 in PR body
Content accuracy ⚠️ Mentions dag-pb leaf encoding; implementation uses raw leaves
Duplicate issues ⚠️ #1321 open with same scope; no 1321.*.rst on HEAD

Not a blocker on newsfragment presence; minor blocker on newsfragment accuracy.


8. Tests and Validation

Linting (make lint)

  • Exit code: 0 ✅
  • Result: All pre-commit hooks passed (ruff, mypy-local, pyrefly-local, mdformat, etc.).
  • Log: downloads/AI-PR-REVIEWS/1321/lint_output.log

Type checking (make typecheck)

  • Exit code: 0 ✅
  • Log: downloads/AI-PR-REVIEWS/1321/typecheck_output.log

Tests (make test)

Authoritative: CI (run #27535093124)

Environment Result
tox (3.10–3.13, core) ✅ pass
tox (3.10–3.13, interop) ✅ pass
Windows (3.11–3.13, core) ✅ pass
Bitswap subset (local) 245 passed in ~8.7 s

Local make test: 2882 passed, 16 skipped, 1 failed in 127 s.

Test Result Notes
tests/core/network/test_connection_management.py::TestExtractIpFromMultiaddr::test_invalid_dns ❌ FAILED Local DNS sinkhole resolves this-domain-does-not-exist-12345.com to 127.0.0.1. CI passes — environment artifact, not a PR regression.

Log: downloads/AI-PR-REVIEWS/1321/test_output.log

Documentation (make linux-docs)

  • Exit code: 0 ✅
  • Log: downloads/AI-PR-REVIEWS/1321/docs_output.log

CI summary (HEAD 622e61f7)

Check Status
tox (*, lint) ✅ pass
tox (3.10, docs) ✅ pass
tox (3.10–3.13, core) ✅ pass
tox (3.10, interop) ✅ pass
Read the Docs ✅ pass
Windows core ✅ pass

9. Recommendations for Improvement

  1. Remove interop/bitswap/go_peer/go_peer binary; add a build script or rely on interop tox infrastructure.
  2. Update newsfragments/1347.feature.rst to describe raw-leaf encoding accurately.
  3. Rewrite the PR description to match current implementation and file list.
  4. Add payment client/extension tests or split payment into a follow-up PR.
  5. Add explicit wire-order regression test (0x12 before 0x0a).
  6. Replace print() debug output in client.py with logger calls.
  7. Clarify feat: Bitswap improvements for Kubo compatibility #1321 / Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT #1347 issue relationship for maintainers.

10. Questions for the Author

  1. What is the purpose of the committed go_peer Mach-O binary, and can it be removed or replaced with a build step?
  2. Should feat: Bitswap improvements for Kubo compatibility #1321 be closed as duplicate of Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT #1347, or should both keep separate newsfragments?
  3. Is the payment client/extension ready to merge without dedicated tests, or planned as a fast follow-up?
  4. Were Yamux/QUIC changes required for Bitswap Kubo interop, or can they be split out?
  5. Will you update the PR description before maintainer sign-off?

11. Overall Assessment

Metric Rating
Quality Good — Kubo interoperability core is solid; most prior blockers resolved
Security impact Low–Medium (committed binary; untested payment client path)
Merge readiness Needs minor fixes — CI fully green; remove binary, fix newsfragment/PR body, add payment client tests or defer
Confidence High (synced to 622e61f7; CI + local lint/docs/bitswap tests verified)

Summary: This PR materially fixes real Kubo Bitswap/DHT interoperability gaps. Since review v3, the author addressed the major reviewer blockers (wrap_with_directory default, chunk size, atomic writes, batch fail-fast, payment component tests for pricing/ledger/gating, backup file removal). CI is fully green. Remaining concerns are the 43 MB committed binary, stale PR description, inaccurate newsfragment wording, and missing tests for BitswapPaymentClient_1_3 / PaymentExtension. None are CI failures; the binary removal is the highest-priority hygiene item before merge.

@Winter-Soren

Copy link
Copy Markdown
Contributor

AI PR Review: #1321

PR: #1321
Related issue: #1347
Reviewed head: 30a13aab63a74e246e01b44db3d6401f6182971d
Review date: 2026-06-18

1. Summary of Changes

This PR is a large Bitswap/Kubo compatibility update for issue #1347. It adds raw-leaf chunking, 256 KiB chunk defaults, balanced UnixFS/DAG-PB construction, canonical DAG-PB internal-node encoding, a filesystem-backed block store, a BlockService, provider querying, streaming chunk input, and a typed wantlist layer.

It also adds a new Bitswap 1.3.0 payment stack: payment protocol buffers, PaymentExtension, BitswapPaymentClient_1_3, PaymentLedger, PricingEngine, and PaymentGatedDecisionEngine. The DHT and record layers are updated so provider/value records can be signed and exchanged in the expected Kubo-compatible form.

The PR touches core Bitswap behavior, DHT value storage, record signing, examples, and tests. It is not a narrow patch; it changes multiple cross-module contracts.

2. Strengths

  • The UnixFS path is much closer to Kubo behavior than the previous implementation: raw leaves, canonical DAG-PB internal nodes, balanced layouts, and 256 KiB chunking are all represented.
  • The prior high-risk review items around wrap_with_directory=True, local-only provider lookup, unsigned DHT PUT records, stale chunk sizing, and committed interop binaries appear fixed on the current head.
  • BlockService and FilesystemBlockStore are useful separations of concern and make block caching and future persistence cleaner.
  • The newsfragment is present and accurately describes the main user-visible feature work.
  • GitHub CI reports all 37 check runs passing on the reviewed head, including docs, lint, core tests, interop tests, utility tests, wheel checks, and Windows checks.

3. Issues Found

Critical

No critical correctness issue was confirmed in the non-payment Bitswap/UnixFS path during this review.

Major

1. Payment rejection responses can be dropped by an unbound local variable

File: libp2p/bitswap/payment_extension.py
Lines: 131-163

Issue: PaymentExtension.process_message() computes _nb only inside the _has_blocks branch, but logs SENDING {_nb} block(s) unconditionally before writing the response. If the payment engine returns a pure PaymentRejection with no payload and no blocks, _nb is unbound. The broad exception handler then catches the resulting UnboundLocalError, logs it, and returns without sending the rejection.

Impact: Invalid signatures, insufficient payment, and other rejected authorizations may time out client-side instead of receiving a protocol-level rejection. That breaks the Bitswap 1.3.0 payment error path and makes failures hard to diagnose.

Suggestion: Initialize _nb = 0 before the conditional, or move the block-count log inside the _has_blocks branch. Add a regression test where process_incoming_1_3_message() returns only payment_rejections and verify stream.write() is still called.

2. Payment block responses use invalid CID prefixes

File: libp2p/bitswap/gated_decision_engine.py
Lines: 425-440 and 473-481

Issue: _make_receipt_and_block() and _make_block_response() set the Bitswap payload prefix with cid_bytes[:4]. The normal Bitswap response path uses get_cid_prefix(...), and the receiver reconstructs CIDs with reconstruct_cid_from_prefix_and_data(prefix, data). A four-byte slice is not a valid general CID prefix. It is especially wrong for CIDv0, where the prefix should be empty, and for CIDv1 values whose varint-encoded prefix is not exactly four bytes.

Impact: Paid or payment-gated blocks can be reconstructed under the wrong CID, which can prevent pending requests from completing or store data under an unexpected key. This undermines the new payment retrieval path even when the block bytes are correct.

Suggestion: Use the same CID-prefix helper as the standard Bitswap send path, for example get_cid_prefix(parse_cid(cid_bytes)) or the existing project helper that matches current client behavior. Add regression tests for CIDv0 and a CIDv1 multicodec requiring a prefix longer than four bytes, and round-trip them through reconstruct_cid_from_prefix_and_data().

3. Payment-gated serving fails open when no transaction verifier is configured

File: libp2p/bitswap/gated_decision_engine.py
Lines: 51-60 and 279-335

Issue: PaymentGatedDecisionEngine accepts tx_verifier=None and then treats payment authorizations as valid in that configuration. The current test suite also locks this in by constructing the engine with tx_verifier=None and expecting a receipt plus payload for a sufficient payment.

Impact: A payment-gated provider that is misconfigured or deployed without a verifier releases blocks for syntactically sufficient but unverified payment authorizations. For a public payment extension, fail-open behavior is a security-sensitive default.

Suggestion: Fail closed when tx_verifier is missing, or require an explicit opt-in such as allow_optimistic_payments=True for tests and demos. Document that mode clearly and keep production defaults verifier-required.

4. The new payment protocol surface is under-tested at the integration boundary

Files: libp2p/bitswap/payment_extension.py, libp2p/bitswap/payment_client_1_3.py, tests/core/bitswap/test_payment.py

Issue: tests/core/bitswap/test_payment.py covers PricingEngine, PaymentLedger, and direct PaymentGatedDecisionEngine behavior, but the server extension and 1.3.0 payment client paths are not exercised end to end. The gaps include terms handling, authorization signing, rejection delivery, receipt handling, block payload reconstruction, and extension registration with BitswapClient.

Impact: The two functional bugs above both sit at the untested protocol boundary. CI can pass while the payment extension fails for real streams or reconstructs blocks under incorrect CIDs.

Suggestion: Add focused tests for PaymentExtension.process_message(), BitswapPaymentClient_1_3 request/authorization flow, rejection-only responses, receipt-plus-block responses, and registered 1.3.0 protocol handling in BitswapClient.

Minor

1. Bitswap 1.3.0 enablement is not obvious from docs or examples

File: libp2p/bitswap/config.py
Lines: 11-17

Issue: BITSWAP_PROTOCOL_V130 is defined, but BITSWAP_PROTOCOLS still lists only 1.2.0, 1.1.0, and 1.0.0. This appears to be intentional because BitswapClient.register_extension() inserts extension protocols before start, but that setup path is not shown in the example or documentation.

Suggestion: Add a short example or docs note showing how to register PaymentExtension, when to call it relative to BitswapClient.start(), and what verifier/wallet configuration is required.

2. Some block-size comments are stale after the chunk-size changes

File: libp2p/bitswap/config.py
Lines: 19-27

Issue: Comments still describe a roughly 63 KB maximum payload assumption, while MAX_BLOCK_SIZE is now 512 KiB and the UnixFS chunk default is 256 KiB.

Suggestion: Refresh the comments so future changes do not accidentally reintroduce the old 64 KiB assumption.

4. Security Review

The main security concern is the fail-open payment verifier behavior. If payment-gated serving is intended for production use, releasing blocks without verifier-backed transaction validation is unsafe. This should either be changed to fail closed or made an explicit testing/demo mode.

The rejection-path bug is also relevant to security operations because rejected authorizations may not be communicated to clients. That can obscure invalid-payment behavior and make monitoring or debugging harder.

No direct remote code execution, unsafe subprocess usage, obvious key leakage, or filesystem traversal issue was identified in this diff. The new FilesystemBlockStore should still be reviewed carefully when callers pass user-controlled storage paths, but the current implementation keeps block filenames derived from CID bytes rather than arbitrary path input.

5. Documentation and Examples

The existing Bitswap example was updated and the newsfragment covers the main Kubo-compatibility work. The PR would benefit from documentation for the new payment path because it is a separate operational mode with non-obvious setup:

  • how to register PaymentExtension before starting the Bitswap client,
  • how to configure a transaction verifier,
  • what asset and amount units mean,
  • whether optimistic/no-verifier mode is allowed outside tests,
  • how clients should handle PaymentRequired, PaymentTerms, receipts, and rejections.

The comments around Bitswap block-size limits should also be updated to match the new 256 KiB chunk default and 512 KiB max block size.

6. Newsfragment Requirement

Status: Present and acceptable.

File: newsfragments/1347.feature.rst

The filename matches the linked issue number, the extension is valid, the file has a newline at EOF, and the content describes raw-leaf chunks, canonical DAG-PB internal nodes, balanced layout support, FilesystemBlockStore, BlockService, batch fetching, and streaming input.

This is not a merge blocker.

Optional follow-up: If maintainers consider the fetch_file() return-shape change or chunk-size change to be user-visible breaking behavior, add a separate breaking-change newsfragment.

7. Tests and Validation

Local validation performed:

  • Checked out PR head 30a13aab63a74e246e01b44db3d6401f6182971d.
  • python -m ruff check . passed with All checks passed!.

GitHub validation observed:

  • GitHub check runs for the reviewed head report 37/37 successful.
  • Successful checks include tox docs, lint, core tests, interop tests, utility tests, wheel checks, and Windows variants.

Coverage assessment:

  • The regular Bitswap, UnixFS, provider, DHT, and record paths have meaningful tests.
  • The new payment client and payment extension need more integration-level coverage before merge.

8. Recommendations

  1. Fix the payment rejection response path before merge.
  2. Replace the payment response CID-prefix slicing with the existing canonical CID-prefix helper and add CIDv0/CIDv1 regression tests.
  3. Change payment verifier behavior to fail closed by default, or require an explicit optimistic/testing mode.
  4. Add end-to-end tests for PaymentExtension and BitswapPaymentClient_1_3.
  5. Add a short payment setup example or documentation note.
  6. Refresh stale block-size comments.

9. Questions

  1. Is Bitswap 1.3.0 payment support intended to be production-ready in this PR, or an experimental extension API?
  2. Should optimistic payment authorization ever be allowed outside tests?
  3. Should BITSWAP_PROTOCOL_V130 remain extension-only, or should the default protocol list include it once the extension is available?
  4. Are the fetch_file() return-shape and chunk-size changes considered breaking for existing users?

10. Overall Assessment

Overall status: Needs work before merge.

The Kubo compatibility work is substantial and many previously reported blockers appear fixed on the current head. The regular UnixFS/Bitswap changes look directionally sound, and CI is green. However, the newly introduced payment path has security and correctness issues at the protocol boundary: rejection-only responses can be dropped, payment payload CID prefixes are malformed, and missing transaction verification currently fails open.

Recommended merge decision: do not merge until the payment-path issues are fixed or the payment feature is explicitly marked experimental and gated away from production use.

Confidence: Medium-high for code-review findings. Local test confidence is limited by missing workspace dependencies, but GitHub CI is fully green for the reviewed head.

@acul71

acul71 commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Tracked by PR #1321, closes via #1347 — closing as duplicate of #1347.

@acul71 acul71 closed this Jun 18, 2026
@acul71

acul71 commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

CI fix: type annotations for Generic DAG node API

Lint CI was failing on a179832d due to missing type annotations on encode_node, decode_node, add_encoded_block, add_node, and get_node in libp2p/bitswap/dag.py, plus an updated create_leaf_node docstring in dag_pb.py.

Fix commit: acul71@061cda19

Branch: acul71/py-libp2p-fork:improvement/bitswap

To apply locally:

git fetch https://github.com/acul71/py-libp2p-fork.git improvement/bitswap
git cherry-pick 061cda19
git push origin improvement/bitswap

Verified locally: make lint, make typecheck, and pytest tests/core/bitswap/test_dag.py::TestGenericDag all pass.

@acul71 acul71 reopened this Jun 18, 2026
@acul71 acul71 merged commit 5c89d21 into libp2p:main Jun 18, 2026
74 of 75 checks passed
@acul71 acul71 deleted the improvement/bitswap branch June 18, 2026 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT

4 participants