feat: Bitswap improvements for Kubo compatibility#1321
Conversation
…trieval process for compatibility wth kubo
…compatibility with ipfs kubo - Added support for signed records in the DHT by introducing `make_signed_put_record` function. - Updated `ValueStore` to create signed records when storing values. - Enhanced `Envelope` class to handle raw payload types for peer records. - Introduced utility functions for signing and verifying DHT records. - Updated protobuf definitions to include author and signature fields in records. - Improved logging and debug messages for better traceability.
…ce DAG-PB encoding Co-authored-by: Copilot <copilot@github.com>
… layout Co-authored-by: Copilot <copilot@github.com>
… in MerkleDag Co-authored-by: Copilot <copilot@github.com>
…s and implement add_stream method in MerkleDag for handling io.IOBase streams Co-authored-by: Copilot <copilot@github.com>
…improve chunk_stream documentation, and add Wantlist functionality Co-authored-by: Copilot <copilot@github.com>
- Introduced `test_block_service.py` to validate BlockService behavior including local hits, network fetches, auto-caching, and block storage. - Created `test_filesystem_blockstore.py` to manually test FilesystemBlockStore for basic operations, persistence, and directory structure. - Added `test_io_stream.py` to verify io.IOBase input support with chunk_stream and MerkleDag.add_stream functionalities. - Implemented `test_unixfs_encoding.py` to ensure add_file and add_bytes produce dag-pb leaf blocks and validate balanced layout tree structures. - Developed `test_wantlist.py` to test Wantlist and Message dataclasses, including backward compatibility and public API exports.
- Updated type hints in `make_service` function to allow for None. - Specified type hints for lists of bytes in block retrieval tests. - Added assertions to check for non-null `unixfs` in various tests to ensure proper decoding of DAG PB blocks. - Enhanced type hints for observer and subscriber peers in Gossipsub tests. - Improved type hints for candidate lists in opportunistic grafting tests. - Added type ignore comments for factory Meta classes to suppress type checker warnings. - Updated import statements for ID to include type ignore comments in interop utilities.
475086a to
6acceb2
Compare
…handling, and update tests for dag-pb leaf blocks Co-authored-by: Copilot <copilot@github.com>
…ndling in DAG fetching Co-authored-by: Copilot <copilot@github.com>
… and MerkleDag implementations
…leDag Co-authored-by: Copilot <copilot@github.com>
…aching in Bitswap Co-authored-by: Copilot <copilot@github.com>
…st files and factories
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
There was a problem hiding this comment.
@sumanjeet0012 Strong PR overall — the encoding root-causes are correct and well-tested (verified locally: 238/238 bitswap tests pass, wire format confirmed Links-before-Data). The PR description is exemplary.
Leaving a few comments inline. Three I'd flag as blocking before merge:
- Breaking API change in
add_filedefault (wrap_with_directory=True) — old callers get a directory CID instead of a file CID for the same call. Either revert the default or call this out loudly in the newsfragment. verify_recordonly handles Ed25519 — silently fails for RSA/Secp256k1 peers, breaking DHT interop with non-Ed25519 nodes.DEFAULT_CHUNK_SIZE = 63 KiB - 32doesn't match Kubo's 256 KiB default — files added by py-libp2p won't have the same root CID asipfs add file.binunless Kubo is told--chunker=size-65504. The PR header claims "Kubo CID compatibility"; this caveat needs to be documented.
Medium-priority items (perf and hygiene) inline. Everything else is comfortable post-merge cleanup.
Nice work on the manual DAG-PB outer envelope — limiting hand-rolling to 0x12 <varlen> <linkbytes> then 0x0a <varlen> <unixfs> while still using protobuf for the inner messages is the minimal, correct approach.
| @@ -187,7 +272,7 @@ async def add_file( | |||
|
|
|||
| dir_data = create_directory_node([(filename, cid, file_size)]) | |||
| dir_cid = compute_cid_v1(dir_data, codec=CODEC_DAG_PB) | |||
There was a problem hiding this comment.
Breaking API change (re: the wrap_with_directory=True default a few lines up at the parameter decl): defaulting this to True silently changes the behavior of add_file(path) for every existing caller — they'll now get a directory CID where they previously got a file CID, and fetch_file returns a (bytes, filename) tuple where it returned bytes before. This is invisible in the new tests because they all pass wrap_with_directory=False.
Suggest defaulting to False for back-compat, or version-bump and add a clear migration note to newsfragments/1321.feature.rst (currently doesn't flag this as breaking).
|
|
||
| """ | ||
| try: | ||
| public_key = Ed25519PublicKey.from_bytes(author_public_key) |
There was a problem hiding this comment.
Interop bug for non-Ed25519 peers. This hard-codes Ed25519 deserialization, so DHT records signed by RSA or Secp256k1 peers will silently fail verification (caught by the try/except and returned as False, indistinguishable from a genuinely invalid signature).
Compare with libp2p/peer/envelope.py:198-216 (pub_key_from_protobuf) which dispatches on KeyType for all three types. Either dispatch the same way here, or accept a PublicKey object and let the caller deserialize.
Kubo defaults to Ed25519 today, so this won't surface in basic interop tests, but it's a real correctness gap.
| DEFAULT_CHUNK_SIZE = 63 * 1024 | ||
| # 63 KB minus 32 bytes to leave room for the dag-pb leaf envelope overhead, | ||
| # ensuring wrapped blocks never exceed MAX_BLOCK_SIZE (63 * 1024). | ||
| DEFAULT_CHUNK_SIZE = 63 * 1024 - 32 |
There was a problem hiding this comment.
Mismatch with Kubo's default chunk size. Kubo's ipfs add uses size-262144 (256 KiB) by default. With this 63 KiB default, py-libp2p will produce a different root CID than ipfs add file.bin for the same content — contradicting the PR's "Kubo CID compatibility" headline.
Leaf CIDs match Kubo (because of RawLeaves=false + dag-pb wrapping), but the root over a multi-chunk file won't. Either:
- Document this clearly in
newsfragments/1321.feature.rst("CIDs matchipfs add --chunker=size-65504"), or - Match Kubo's 256 KiB default and split large messages at the wire layer instead of capping the chunk size.
| chunk = leaf_raw | ||
| logger.debug(f"[DAG] Leaf {idx + 1}: raw block {len(chunk)} bytes") | ||
|
|
||
| file_data += chunk |
There was a problem hiding this comment.
O(n²) bytes concat. file_data += chunk over potentially thousands of leaves means each += allocates a new bytes object and copies all previous data. For a 100 MB file at 63 KB chunks (~1626 leaves), this allocates roughly 80 GB of intermediate strings.
The fix is one line — accumulate into a bytearray (or list + b"".join(parts) at the end). This is the single biggest perf win available in the PR.
Same pattern in _read_message (client.py:1017-1047) and encode_dag_pb (dag_pb.py:125-149).
| await trio.to_thread.run_sync( | ||
| lambda: path.parent.mkdir(parents=True, exist_ok=True) | ||
| ) | ||
| await trio.to_thread.run_sync(path.write_bytes, data) |
There was a problem hiding this comment.
Non-atomic writes. path.write_bytes(data) writes in place. If the process crashes mid-write, the next startup finds a truncated file at a CID path. get_block will then return corrupted bytes that fail verification only if the caller checks.
Standard fix: write to path.with_suffix('.tmp'), then os.replace(tmp, path) — atomic on POSIX, durable on most filesystems.
| # Ensure the result is a plain dict (not a coroutine from a mock) | ||
| if isinstance(result, dict): | ||
| return result | ||
| except Exception: |
There was a problem hiding this comment.
Test infrastructure leaking into production code. The getattr probe + isinstance(result, dict) check + bare except Exception: pass is a workaround for MagicMock returning a coroutine object. This silently masks real failures in production.
Suggest defining a Protocol for the batch interface, typing self.bitswap with it, and removing the runtime probing entirely. Tests can then mock the protocol explicitly.
|
|
||
| # Send all CIDs in a single wantlist to the peer | ||
| if peer_id: | ||
| await self._send_wantlist_to_peer(peer_id, batch) |
There was a problem hiding this comment.
Swallowed exception in _send_wantlist_to_peer causes hangs. That helper (further down in this file) catches all exceptions, logs "Failed to send wantlist to peer", and returns. When called from this batch path, if host.new_stream or _write_message fails for one CID in the batch, the corresponding _pending_requests[cid].wait() below will block until trio.fail_after(timeout) cancels it — adding the full timeout (default 30s) to every per-batch failure.
Fix: propagate the failure from _send_wantlist_to_peer, or event.set() with a sentinel so the waiter can fail fast.
| # We manually construct the wire format to enforce the correct ordering. | ||
|
|
||
| # Add links | ||
| result = b"" |
There was a problem hiding this comment.
Minor: same O(n²) concat pattern as fetch_file reassembly. With 174 links on an internal node and large CIDs, each result += ... allocates a fresh bytes. Trivial fix: build into a bytearray and convert to bytes at return.
| yield chunk | ||
|
|
||
|
|
||
| def chunk_stream( |
There was a problem hiding this comment.
Nit: this new chunk_stream should also be added to the module's __all__ (further down in this file). Doesn't affect direct imports, but it breaks from chunker import * and confuses some IDE auto-import tools.
| ordered_leaf_cids: list[bytes] = [] | ||
|
|
||
| def _collect_leaves_local(cid_bytes: bytes, depth: int = 1) -> None: | ||
| """Traverse locally-fetched blocks to collect leaf CIDs.""" |
There was a problem hiding this comment.
Minor: _collect_leaves_local is unbounded recursion. For a balanced 174-fanout DAG over a 100 MB file the depth is ~2, so this never trips Python's default 1000-frame limit in practice. But a maliciously crafted DAG (e.g. a chain of single-link nodes) would crash with RecursionError. Convert to an explicit stack if you want to harden this.
PR #1321 AI Review1. Summary of Changes
2. Strengths
3. Issues FoundCritical
Major
Minor
4. Security Review
5. Documentation and Examples
6. Newsfragment Requirement
7. Tests and Validation
8. Recommendations for Improvement
9. Questions for the Author
10. Overall Assessment
|
… lookups and always send signed records. Co-authored-by: Copilot <copilot@github.com>
…rove unmarshal_public_key functionality
- Added PaymentGatedDecisionEngine to handle payment-required logic for block serving. - Introduced PaymentTerms, PaymentAuthorization, PaymentReceipt, and PaymentRejection messages in the new bitswap_1_3_0.proto. - Enhanced existing MerkleDag class to store internal nodes with a callback for payment gating. - Created BitswapPaymentClient_1_3 to manage client-side payment authorizations and receipts. - Updated balanced_layout function to support payment gating and internal node storage. - Added necessary protobuf definitions and generated Python files for Bitswap 1.3.0.
|
Thanks @sumanjeet0012, full review here: AI PR Review: #1321 — feat: Bitswap improvements for Kubo compatibilityPR: #1321 1. Summary of ChangesThis PR is a large feature change centered on Bitswap / Merkle DAG interoperability with Kubo (go-ipfs) and supporting infrastructure:
Related issue: #1321 (same title/body as the PR; tracks Kubo compatibility work). The PR body does not include an explicit Breaking / behavior changes (user-facing):
Modules touched: 2. Branch Sync Status and Merge ConflictsBranch sync status
Merge conflict analysis✅ No merge conflicts detected. Test merge of 3. Strengths
4. Issues FoundCriticalNone identified that would corrupt protocol state or introduce obvious remote code execution. CI docs job is failing (see §8) — treat as merge blocker until fixed. Major
async def add_file(
self,
file_path: str,
chunk_size: int | None = None,
progress_callback: Callable[[int, int, str], None] | None = None,
wrap_with_directory: bool = True,
) -> bytes:
# Default chunk size: 63 KB (py-libp2p accepts less than 64 KB)
# 63 KB minus 32 bytes to leave room for the dag-pb leaf envelope overhead,
# ensuring wrapped blocks never exceed MAX_BLOCK_SIZE (63 * 1024).
DEFAULT_CHUNK_SIZE = 63 * 1024 - 32
Minor
else:
chunk = leaf_raw
logger.debug(f"[DAG] Leaf {idx + 1}: raw block {len(chunk)} bytes")
file_data += chunk
bytes_fetched += len(chunk)
except Exception as e:
logger.error(f"Failed to send wantlist to peer {peer_id}: {e}")
Maintainer feedback status
5. Security Review
Overall security impact: Low (incremental hardening on DHT signing; no new high-risk surfaces identified). 6. Documentation and ExamplesPresent:
Gaps:
7. Newsfragment Requirement
Not a blocker on newsfragment presence alone; blocker if breaking default is shipped without 8. Tests and ValidationLinting (
|
| Check | Status |
|---|---|
tox (3.10–3.13, core) |
✅ pass |
tox (*, lint) |
✅ pass |
tox (3.10, docs) |
❌ fail |
| Read the Docs | ❌ fail |
| Windows core | ✅ pass |
9. Recommendations for Improvement
- Unblock CI: Fix
verify_recorddocstring RST formatting; re-run docs tox. - Resolve reviewer blockers: Revert
wrap_with_directorydefault toFalseor ship.breaking.rst+ migration text; document chunk-size vs Kubo in newsfragment andchunker.pymodule docstring. - Performance (quick wins):
bytearrayinfetch_fileandencode_dag_pb; atomic writes inFilesystemBlockStore. - Reliability: Propagate or surface errors from
_send_wantlist_to_peer; remove MagicMock-oriented fallback in_get_blocks_batch. - Tests: Add explicit canonical wire-order test (
0x12before0x0a); optional integration test with Kubo/ipfs addfor same chunker. - Process: Add
Fixes #1321to PR body; squash/clean 26 commits if maintainers require linear history.
10. Questions for the Author
- Is
wrap_with_directory=Trueas default intentional for IPFS parity, and are you willing to call it breaking in release notes? - Should the project standardize on 63 KiB chunks permanently, or align with Kubo’s 256 KiB default over time?
- For
ProviderQueryManager, isprovider_store.find_providers(network DHT walk) sufficient, or should callers useKadDHT.find_providersat the host API level for consistency? - Do you plan Kubo interop integration tests (dockerized
ipfs add/ipfs cat) in a follow-up PR? - Will you fix the Sphinx docstring in this PR or a fast follow-up?
11. Overall Assessment
| Metric | Rating |
|---|---|
| Quality | Good — strong technical core; remaining issues are mostly API contract, docs CI, and performance hygiene |
| Security impact | Low |
| Merge readiness | Needs fixes — docs CI failure + unresolved blocking review on wrap_with_directory default and CID/chunk-size messaging |
| Confidence | High (full local test run + code review on PR branch; CI docs failure reproduced locally) |
Summary: This PR materially fixes real Kubo Bitswap/DHT interoperability gaps and adds valuable infrastructure (BlockService, persistent store, batch wants). Earlier major review findings (signed PUT propagation, network provider lookup, multi-key verification) appear addressed. Before merge: fix docs build, resolve wrap_with_directory default / breaking disclosure, and correct newsfragment claims about chunk size and root CID parity with default Kubo ipfs add.
- Added `payment_client_1_3.py` for handling in-band payment messages, including: - Processing PaymentTerms and sending PaymentAuthorization. - Handling PaymentReceipts and PaymentRejections. - Validating payment amounts and signing EIP-3009 transactions. - Introduced `payment_ledger.py` for tracking payments at the root CID level: - Supports registration of DAG structures for child blocks. - Implements nonce deduplication to prevent replay attacks. - Provides methods to check payment status and record payments. - Updated protobuf definitions in `bitswap_1_3_0_pb2.pyi` to reflect new payment structures: - Added PaymentTerms, PaymentAuthorization, and PaymentReceipt messages. - Introduced TxReceipt for transaction details. - Created `pricing_engine.py` to compute block pricing based on configurable strategies: - Supports free, fixed, size-based, and custom pricing strategies. - Allows marking specific CIDs as free and setting per-CID prices.
- Added PaymentExtension class to handle payment-related protobuf fields and wantlists in Bitswap 1.3.0. - Integrated payment terms, receipts, and rejections processing for client-side and server-side. - Enhanced PaymentLedger to track root CID payments and manage payment records. - Updated pricing engine to support configurable pricing strategies. - Refactored tests to accommodate changes in block storage and encoding, ensuring raw blocks are used for leaves. - Improved type hints and documentation across the codebase for better clarity and maintainability.
f16afc1 to
be24050
Compare
|
Thanks @sumanjeet0012, for the fix. Overall Assessment
HEAD: What the PR Does WellThe Kubo interoperability fixes are sound on the current branch:
Earlier review findings (provider network lookup, signed Open Blockers / Major Issues
Validation Results (this run)
Recommended Before Merge
Post-merge hygiene (reviewer-accepted): FULL REVIEW AI PR Review: #1321 — feat: Bitswap improvements for Kubo compatibilityPR: #1321 1. Summary of ChangesThis PR is a large feature change centered on Bitswap / Merkle DAG interoperability with Kubo (go-ipfs) and supporting infrastructure:
Related issues:
The PR body does not include Breaking / behavior changes (user-facing):
PR description vs. implementation: The PR body still describes dag-pb-wrapped leaves ( Modules touched: 2. Branch Sync Status and Merge ConflictsBranch Sync Status
Merge Conflict Analysis✅ No merge conflicts detected. Test merge of 3. Strengths
4. Issues FoundCriticalNone identified on commit Major
async def add_file(
self,
file_path: str,
chunk_size: int | None = None,
progress_callback: Callable[[int, int, str], None] | None = None,
wrap_with_directory: bool = True,
) -> bytes:
Minor
Maintainer feedback status
5. Security Review
Overall security impact: Low–Medium (payment extension adds financial surface without test coverage). 6. Documentation and ExamplesPresent: docstrings on new APIs, updated Gaps:
7. Newsfragment Requirement
Not a blocker on newsfragment presence alone; blocker if breaking default is shipped without 8. Tests and ValidationLinting (
|
| Environment | Result |
|---|---|
| tox (3.10–3.13, core) | ✅ 2625 passed, 12 skipped |
| tox (3.10–3.13, interop) | ✅ pass |
| Windows (3.11–3.13, core) | ✅ pass |
| Bitswap subset (local) | ✅ 238 passed in ~8.9 s |
Local make test: 2875 passed, 16 skipped, 1 failed in 114 s.
| Test | Result | Notes |
|---|---|---|
tests/core/network/test_connection_management.py::TestExtractIpFromMultiaddr::test_invalid_dns |
❌ FAILED | Local DNS sinkhole resolves this-domain-does-not-exist-12345.com to 127.0.0.1. CI passes — environment artifact, not a PR regression. |
Log: downloads/AI-PR-REVIEWS/1321/test_output.log
Documentation (make linux-docs)
- Exit code: 0 ✅
- Log:
downloads/AI-PR-REVIEWS/1321/docs_output.log
CI summary (HEAD 50d3c26a)
| Check | Status |
|---|---|
| tox (*, lint) | ✅ pass |
| tox (3.10, docs) | ✅ pass |
| tox (3.10–3.13, core) | ✅ pass |
| tox (3.10, interop) | ✅ pass |
| Read the Docs | ✅ pass |
| Windows core | ✅ pass |
9. Recommendations for Improvement
- Resolve
wrap_with_directorydefault — revert toFalseor ship1321.breaking.rst. - Add payment extension tests or split payment into a follow-up PR.
- Remove
payment_client_1_3.py.backupfrom the branch. - Update PR description (raw leaves, 256 KiB, actual file list); add
Fixes #1321. - Consolidate newsfragments for feat: Bitswap improvements for Kubo compatibility #1321 vs Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT #1347; align wording with raw-leaf encoding.
- Performance:
bytearrayinfetch_file/encode_dag_pb; atomic writes inFilesystemBlockStore. - Reliability: propagate errors from
_send_wantlist_to_peer; remove mock fallback in_get_blocks_batch. - Tests: add explicit wire-order assertion (
0x12before0x0a). - Hygiene: remove
print(..., flush=True)fromfetch_file; split Yamux/QUIC fixes; add missing payment example or remove from PR description.
10. Questions for the Author
- Is
wrap_with_directory=Trueintentional for IPFS parity, and will you document it as breaking? - Should feat: Bitswap improvements for Kubo compatibility #1321 and Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT #1347 both keep newsfragments, or should one subsume the other?
- Is the payment extension ready to merge without tests, or should it be a follow-up?
- Was
payment_client_1_3.py.backupleft in intentionally? - Were Yamux/QUIC changes required for Bitswap interop, or can they be split out?
11. Overall Assessment
| Metric | Rating |
|---|---|
| Quality | Good — strong Kubo interoperability core; API-default, payment-test, and hygiene gaps remain |
| Security impact | Low–Medium |
| Merge readiness | Needs maintainer sign-off — CI is fully green; @yashksaini-coder's wrap_with_directory blocker and payment test gap are the main open items |
| Confidence | High (synced to 50d3c26a; CI + local lint/docs/bitswap tests verified) |
Summary: CI/CD is fully green on the current PR head (50d3c26a). The Kubo interoperability work is technically sound: raw leaves, 256 KiB chunks, balanced layout, canonical DAG-PB encoding, signed DHT records, and batch fetching address the real root causes. Substantive remaining concerns are the reviewer-flagged breaking default (wrap_with_directory=True), untested payment extension, committed backup file, and process/docs hygiene (stale PR body, duplicate newsfragments, missing files listed in description). None of these are CI failures.
…ry default to false and removed extra files
…pport, and improve payment client defaults
…atch fetching logic
|
Good work @sumanjeet0012 , thanks ! Recommendation Full Review HERE: AI PR Review: #1321 — feat: Bitswap improvements for Kubo compatibilityPR: #1321 1. Summary of ChangesThis PR is a large feature change centered on Bitswap / Merkle DAG interoperability with Kubo (go-ipfs) and supporting infrastructure.
Related issues:
Breaking / behavior changes (user-facing):
PR description vs. implementation: The PR body still describes dag-pb-wrapped leaves ( Modules touched: 2. Branch Sync Status and Merge ConflictsBranch Sync Status
Merge Conflict Analysis✅ No merge conflicts detected. Test merge of 3. Strengths
4. Issues FoundCriticalNone identified on commit Major
Minor
print(
f"\n📥 RECEIVED WANTLIST from peer {peer_id_str} with "
f"{len(msg.wantlist.entries)} entries",
flush=True,
)
Maintainer feedback status
5. Security Review
Overall security impact: Low–Medium (primarily from the committed binary and untested payment client path). 6. Documentation and ExamplesPresent: docstrings on new APIs, updated Gaps:
7. Newsfragment Requirement
Not a blocker on newsfragment presence; minor blocker on newsfragment accuracy. 8. Tests and ValidationLinting (
|
| Environment | Result |
|---|---|
| tox (3.10–3.13, core) | ✅ pass |
| tox (3.10–3.13, interop) | ✅ pass |
| Windows (3.11–3.13, core) | ✅ pass |
| Bitswap subset (local) | ✅ 245 passed in ~8.7 s |
Local make test: 2882 passed, 16 skipped, 1 failed in 127 s.
| Test | Result | Notes |
|---|---|---|
tests/core/network/test_connection_management.py::TestExtractIpFromMultiaddr::test_invalid_dns |
❌ FAILED | Local DNS sinkhole resolves this-domain-does-not-exist-12345.com to 127.0.0.1. CI passes — environment artifact, not a PR regression. |
Log: downloads/AI-PR-REVIEWS/1321/test_output.log
Documentation (make linux-docs)
- Exit code: 0 ✅
- Log:
downloads/AI-PR-REVIEWS/1321/docs_output.log
CI summary (HEAD 622e61f7)
| Check | Status |
|---|---|
| tox (*, lint) | ✅ pass |
| tox (3.10, docs) | ✅ pass |
| tox (3.10–3.13, core) | ✅ pass |
| tox (3.10, interop) | ✅ pass |
| Read the Docs | ✅ pass |
| Windows core | ✅ pass |
9. Recommendations for Improvement
- Remove
interop/bitswap/go_peer/go_peerbinary; add a build script or rely on interop tox infrastructure. - Update
newsfragments/1347.feature.rstto describe raw-leaf encoding accurately. - Rewrite the PR description to match current implementation and file list.
- Add payment client/extension tests or split payment into a follow-up PR.
- Add explicit wire-order regression test (
0x12before0x0a). - Replace
print()debug output inclient.pywith logger calls. - Clarify feat: Bitswap improvements for Kubo compatibility #1321 / Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT #1347 issue relationship for maintainers.
10. Questions for the Author
- What is the purpose of the committed
go_peerMach-O binary, and can it be removed or replaced with a build step? - Should feat: Bitswap improvements for Kubo compatibility #1321 be closed as duplicate of Enhancement: Implement Full Kubo Interoperability for Bitswap, UnixFS, and DHT #1347, or should both keep separate newsfragments?
- Is the payment client/extension ready to merge without dedicated tests, or planned as a fast follow-up?
- Were Yamux/QUIC changes required for Bitswap Kubo interop, or can they be split out?
- Will you update the PR description before maintainer sign-off?
11. Overall Assessment
| Metric | Rating |
|---|---|
| Quality | Good — Kubo interoperability core is solid; most prior blockers resolved |
| Security impact | Low–Medium (committed binary; untested payment client path) |
| Merge readiness | Needs minor fixes — CI fully green; remove binary, fix newsfragment/PR body, add payment client tests or defer |
| Confidence | High (synced to 622e61f7; CI + local lint/docs/bitswap tests verified) |
Summary: This PR materially fixes real Kubo Bitswap/DHT interoperability gaps. Since review v3, the author addressed the major reviewer blockers (wrap_with_directory default, chunk size, atomic writes, batch fail-fast, payment component tests for pricing/ledger/gating, backup file removal). CI is fully green. Remaining concerns are the 43 MB committed binary, stale PR description, inaccurate newsfragment wording, and missing tests for BitswapPaymentClient_1_3 / PaymentExtension. None are CI failures; the binary removal is the highest-priority hygiene item before merge.
AI PR Review: #1321PR: #1321 1. Summary of ChangesThis PR is a large Bitswap/Kubo compatibility update for issue #1347. It adds raw-leaf chunking, 256 KiB chunk defaults, balanced UnixFS/DAG-PB construction, canonical DAG-PB internal-node encoding, a filesystem-backed block store, a It also adds a new Bitswap 1.3.0 payment stack: payment protocol buffers, The PR touches core Bitswap behavior, DHT value storage, record signing, examples, and tests. It is not a narrow patch; it changes multiple cross-module contracts. 2. Strengths
3. Issues FoundCriticalNo critical correctness issue was confirmed in the non-payment Bitswap/UnixFS path during this review. Major1. Payment rejection responses can be dropped by an unbound local variableFile: Issue: Impact: Invalid signatures, insufficient payment, and other rejected authorizations may time out client-side instead of receiving a protocol-level rejection. That breaks the Bitswap 1.3.0 payment error path and makes failures hard to diagnose. Suggestion: Initialize 2. Payment block responses use invalid CID prefixesFile: Issue: Impact: Paid or payment-gated blocks can be reconstructed under the wrong CID, which can prevent pending requests from completing or store data under an unexpected key. This undermines the new payment retrieval path even when the block bytes are correct. Suggestion: Use the same CID-prefix helper as the standard Bitswap send path, for example 3. Payment-gated serving fails open when no transaction verifier is configuredFile: Issue: Impact: A payment-gated provider that is misconfigured or deployed without a verifier releases blocks for syntactically sufficient but unverified payment authorizations. For a public payment extension, fail-open behavior is a security-sensitive default. Suggestion: Fail closed when 4. The new payment protocol surface is under-tested at the integration boundaryFiles: Issue: Impact: The two functional bugs above both sit at the untested protocol boundary. CI can pass while the payment extension fails for real streams or reconstructs blocks under incorrect CIDs. Suggestion: Add focused tests for Minor1. Bitswap 1.3.0 enablement is not obvious from docs or examplesFile: Issue: Suggestion: Add a short example or docs note showing how to register 2. Some block-size comments are stale after the chunk-size changesFile: Issue: Comments still describe a roughly 63 KB maximum payload assumption, while Suggestion: Refresh the comments so future changes do not accidentally reintroduce the old 64 KiB assumption. 4. Security ReviewThe main security concern is the fail-open payment verifier behavior. If payment-gated serving is intended for production use, releasing blocks without verifier-backed transaction validation is unsafe. This should either be changed to fail closed or made an explicit testing/demo mode. The rejection-path bug is also relevant to security operations because rejected authorizations may not be communicated to clients. That can obscure invalid-payment behavior and make monitoring or debugging harder. No direct remote code execution, unsafe subprocess usage, obvious key leakage, or filesystem traversal issue was identified in this diff. The new 5. Documentation and ExamplesThe existing Bitswap example was updated and the newsfragment covers the main Kubo-compatibility work. The PR would benefit from documentation for the new payment path because it is a separate operational mode with non-obvious setup:
The comments around Bitswap block-size limits should also be updated to match the new 256 KiB chunk default and 512 KiB max block size. 6. Newsfragment RequirementStatus: Present and acceptable. File: The filename matches the linked issue number, the extension is valid, the file has a newline at EOF, and the content describes raw-leaf chunks, canonical DAG-PB internal nodes, balanced layout support, This is not a merge blocker. Optional follow-up: If maintainers consider the 7. Tests and ValidationLocal validation performed:
GitHub validation observed:
Coverage assessment:
8. Recommendations
9. Questions
10. Overall AssessmentOverall status: Needs work before merge. The Kubo compatibility work is substantial and many previously reported blockers appear fixed on the current head. The regular UnixFS/Bitswap changes look directionally sound, and CI is green. However, the newly introduced payment path has security and correctness issues at the protocol boundary: rejection-only responses can be dropped, payment payload CID prefixes are malformed, and missing transaction verification currently fails open. Recommended merge decision: do not merge until the payment-path issues are fixed or the payment feature is explicitly marked experimental and gated away from production use. Confidence: Medium-high for code-review findings. Local test confidence is limited by missing workspace dependencies, but GitHub CI is fully green for the reviewed head. |
CI fix: type annotations for Generic DAG node APILint CI was failing on Fix commit: acul71@061cda19 Branch: To apply locally: git fetch https://github.com/acul71/py-libp2p-fork.git improvement/bitswap
git cherry-pick 061cda19
git push origin improvement/bitswapVerified locally: |
What was wrong?
Fixes #1347
Issue: py-libp2p's Bitswap file operations were not compatible with Kubo (Go-IPFS)
and the broader IPFS network. Files added by py-libp2p could not be fetched by Kubo
nodes, and vice versa. Additionally, DHT records lacked cryptographic verification
required by Kubo peers.
Several root causes were identified:
1. Flat DAG structure (no balanced layout)
create_file_node()put all chunks as direct links on the root node — a flat 1-level tree. Kubo uses a balanced tree with a maximum of 174 links per node.2. Wrong DAG-PB wire encoding (field ordering)
encode_dag_pb()usedPBNode.SerializeToString()which emitsDatabeforeLinks. The DAG-PB spec requires Links before Data for canonical encoding.3. Wrong default chunk size
The previous default (~63 KiB) did not match Kubo's default
size-262144(256 KiB), producing different root CIDs fromipfs add.4. No persistent block storage
MemoryBlockStorewas the onlyBlockStoreimplementation, meaning all fetched blocks were lost when the process exited.5. No transparent caching layer
MerkleDagcalledbitswap.get_block()/bitswap.add_block()directly with no service layer, meaning fetched blocks were not auto-cached locally.6. No stream input support
add_file()only accepted a file path string.7. Magic integers in Bitswap message construction
create_wantlist_entry(cid, want_type=1)used raw integers with no type safety.8. DHT record signing/verification not compatible with Kubo
DHT value records lacked proper signing and verification, causing incompatibility with Kubo's DHT implementation.
9. Sequential block fetching caused Kubo disconnects
Fetching many blocks one-by-one opened hundreds of streams, causing Kubo to issue
GO_AWAYdisconnects.How was it fixed?
Bitswap: Kubo CID compatibility
Raw leaf blocks (
CODEC_RAW) indag.py:Multi-chunk files store raw leaf blocks, matching Kubo's
RawLeaves=truedefault foripfs add.Default chunk size
256 * 1024inchunker.py:Matches Kubo's default
size-262144.New
balanced_layout(leaves)indag_pb.py:Groups leaves into batches of 174, builds a tree level by level.
Fixed
encode_dag_pb()indag_pb.py:Manually constructs wire bytes with Links before Data (canonical DAG-PB ordering).
create_leaf_node()indag_pb.py:Utility for dag-pb-wrapped leaves (
RawLeaves=falsemode); used by tests and layout helpers, not the default file-ingestion path.Bitswap: batch fetching
Enhanced
get_blocks_batch()inclient.py:Sends all CIDs in a single wantlist message per batch, waits for all responses on the same stream.
New: FilesystemBlockStore
New
FilesystemBlockStoreinblock_store.py:Stores each block as a file at
<base>/<cid[:2]>/<cid[2:]>with atomic writes. Drop-in replacement forMemoryBlockStore.New: BlockService
New
block_service.py:Transparent local→network fallback layer between
MerkleDagandBitswapClient. Auto-caches network-fetched blocks locally.New:
add_stream()+chunk_stream()New
chunk_stream(stream: io.IOBase)inchunker.pyandadd_stream()inMerkleDag:Reads one chunk at a time from any
io.IOBasestream.New: Generic IPLD node API
New
add_node()/get_node()/remove_node()indag.py:Store and retrieve structured DAG-JSON, DAG-CBOR, and RAW nodes.
New: Wantlist / Message dataclasses
New
wantlist.pywith typed dataclasses (WantType,BlockPresenceType,WantlistEntry,Wantlist,BlockPresence,BitswapMessage).DHT: record signing and verification & ProviderQueryManager
Enhanced DHT record handling with proper signing and verification for compatibility with Kubo's DHT implementation. Added
ProviderQueryManagerfor robust DHT-based provider discovery and caching in Bitswap.Bitswap 1.3.0 Payment Extension (experimental)
Partial implementation:
BlockPricingEngine,PaymentLedger, andPaymentGatedDecisionEngineare implemented and tested.BitswapPaymentClientandPaymentExtensionprovide the wire protocol scaffolding; full client/extension test coverage is deferred to a follow-up PR.Breaking changes
MerkleDag.fetch_file()now returnstuple[bytes, str | None](data + optional filename) instead of barebytes.MerkleDag.add_file(..., wrap_with_directory=False)by default (was effectively file-root CID before).Files changed
libp2p/bitswap/dag_pb.pycreate_leaf_node(),balanced_layout(), fixedencode_dag_pb()canonical orderinglibp2p/bitswap/dag.pyadd_file(),add_bytes(), newadd_stream(),add_node()/get_node(),BlockServiceroutinglibp2p/bitswap/client.pyget_blocks_batch(), modularized withIBitswapExtensionlibp2p/bitswap/chunker.pychunk_stream(stream: io.IOBase),DEFAULT_CHUNK_SIZE = 256 * 1024libp2p/bitswap/block_store.pyFilesystemBlockStorewith atomic writeslibp2p/bitswap/block_service.pyBlockServicelibp2p/bitswap/wantlist.pylibp2p/bitswap/payment_extension.pylibp2p/bitswap/messages.pycreate_wantlist_entry()acceptsWantType | intlibp2p/bitswap/__init__.pylibp2p/kad_dht/ProviderQueryManagerlibp2p/records/libp2p/peer/envelope.py,peer_record.pyexamples/bitswap/bitswap.pyTest files added
test_unixfs_encoding.pycreate_leaf_node()test_filesystem_blockstore.pyFilesystemBlockStorepersistence + round-triptest_block_service.pyBlockServicelocal hit / miss / auto-cache / announcetest_io_stream.pychunk_stream()+add_stream()with BytesIO, GzipFile, file handlestest_wantlist.pyWantType,Wantlist,BitswapMessage,to_proto()/from_proto()test_dag_pb.pytest_encode_canonical_field_ordering(Links before Data on wire)test_provider_query.pyProviderQueryManagerDHT provider discoverytest_payment.pyBlockPricingEngine,PaymentLedger,PaymentGatedDecisionEnginetest_dag.pyadd_node/get_node/remove_node)To-Do