IPFS Kit Python is a comprehensive, production-ready Python toolkit for building distributed storage applications on IPFS. It provides high-level APIs, advanced cluster management, AI/ML integration, and seamless MCP (Model Context Protocol) server support for modern decentralized applications.
- Build Decentralized Apps: High-level Python API for IPFS without complexity
- Scale with Clusters: Multi-node cluster management with automatic replication
- Integrate AI Models: Store and retrieve ML models/datasets on IPFS
- Create Storage Services: Production-ready foundation for IPFS-based services
- Distributed Datasets: Store and share large datasets across IPFS network
- Model Versioning: Track and distribute ML models with content addressing
- Reproducible Research: Immutable data storage with cryptographic verification
- Collaborative Workflows: Share data and models via IPFS with team members
- High Availability: Multi-node clusters with leader election and failover
- Observability: Built-in metrics, logging, and monitoring
- Container Native: Docker and Kubernetes ready deployment
- Auto-Healing: Automatic error detection and recovery system
- 🌐 High-Level API: Simplified Python interface wrapping IPFS complexity
- 📦 Content Management: Add, get, pin, and manage content with ease
- 🔗 IPNS Support: Mutable pointers to immutable IPFS content
- 📊 Directory Operations: Work with IPFS directories and file structures
- 🔍 Content Discovery: Find and retrieve content across the IPFS network
- 🔄 Multi-Node Clusters: Deploy 3+ node clusters with role hierarchy
- 👑 Leader Election: Automatic leader selection and failover
- 🎭 Role-Based: Master, Worker, and Leecher role management
- 📈 Auto-Scaling: Automatically replicate content based on demand
- 🔗 Peer Management: Dynamic peer discovery and connection handling
- 💾 Distributed Storage: Spread content across multiple nodes
- 🤖 Model Registry: Store and version ML models on IPFS
- 📊 Dataset Management: Manage large datasets with IPFS chunking
- Framework Support: LangChain, LlamaIndex, Transformers integration
- 📉 Metrics Tracking: Model performance metrics and visualization
- 🧮 Distributed Training: Share training data across nodes
- 🎯 Vector Search: GraphRAG and knowledge graph integration
- 🌟 Production Ready: Full-featured MCP server implementation
- 🛠️ Tool Integration: Expose IPFS operations as MCP tools
- 🔌 Plugin System: Extensible architecture for custom tools
- 📡 Real-Time: WebSocket support for streaming operations
- 🎨 Dashboard: Web-based management and monitoring interface
- 🔐 Secure: Built-in authentication and authorization
- 📦 Tiered Storage: Multi-tier caching (memory, SSD, network)
- ⚡ High Performance: Async/await throughout for concurrency
- 🔄 Write-Ahead Log: Crash recovery and data consistency
- 🗜️ Compression: Automatic compression for large files
- 📊 Metadata Index: Fast content lookup and search
- 🚀 Prefetching: Predictive content loading for speed
- 🔍 Observability: Prometheus metrics, structured logging, tracing
- 🏥 Health Checks: Built-in health endpoints for monitoring
- 🔧 Auto-Healing: Detect and fix common errors automatically
- 📈 Performance Metrics: Real-time performance tracking
- 🎛️ Configuration: Flexible YAML/JSON configuration
- 🔔 Alerting: Integration with monitoring systems
- 🐳 Docker Ready: Multi-arch Docker images (AMD64, ARM64)
- ☸️ Kubernetes: Helm charts and operator support
- 🔄 CI/CD: GitHub Actions workflows included
- 🌐 Cloud Native: Deploy on any cloud provider
- 🔌 Extensible: Plugin system for custom functionality
- 📚 Well Documented: Comprehensive guides and examples
┌─────────────────────────────────────────────────────────────┐
│ Applications Layer │
│ (Your App, CLI, Web Dashboard, API Services) │
└───────────────────────────┬─────────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────────┐
│ High-Level API │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │
│ │ IPFS │ │ Cluster │ │ AI/ML │ │ MCP │ │
│ │ Ops │ │ Mgmt │ │ Tools │ │ Server │ │
│ └──────────┘ └──────────┘ └──────────┘ └────────────┘ │
└───────────────────────────┬─────────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────────┐
│ Core Services Layer │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │
│ │ Tiered │ │ WAL & │ │ Metadata │ │ Pin │ │
│ │ Cache │ │ Journal │ │ Index │ │ Manager │ │
│ └──────────┘ └──────────┘ └──────────┘ └────────────┘ │
└───────────────────────────┬─────────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────────┐
│ IPFS Daemon Layer │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │
│ │ Kubo │ │ Cluster │ │ Lotus │ │ Lassie │ │
│ │ (IPFS) │ │ Service │ │(Filecoin)│ │ (Retrieval)│ │
│ └──────────┘ └──────────┘ └──────────┘ └────────────┘ │
└─────────────────────────────────────────────────────────────┘
IPFS Kit supports 7 integrated storage backends for maximum flexibility and redundancy:
- IPFS/Kubo - Decentralized content-addressed storage
- Filecoin/Lotus - Long-term archival with economic incentives
- S3-Compatible - AWS S3, MinIO, and other S3-compatible services
- Storacha (Web3.Storage) - Web3 storage built on IPFS + Filecoin
- HuggingFace - ML model and dataset storage
- Lassie - High-performance IPFS retrieval client
- Walrus - fsspec-compatible blob storage with direct blob-id reads and local logical-path indexing
┌─────────────────────────────────────────────────────────────┐
│ Tier 1: Memory Cache (100MB default) │
│ • Fastest access (microseconds) │
│ • Hot content, recently accessed │
│ • ARC algorithm (Adaptive Replacement Cache) │
└────────────────────────┬────────────────────────────────────┘
│ Auto-promotion/demotion
┌────────────────────────▼────────────────────────────────────┐
│ Tier 2: Disk Cache (1GB+ default) │
│ • Fast persistent storage (milliseconds) │
│ • Warm content, frequently accessed │
│ • Heat-based eviction, zero-copy mmap │
└────────────────────────┬────────────────────────────────────┘
│ Overflow & long-term
┌────────────────────────▼────────────────────────────────────┐
│ Tier 3: IPFS Network │
│ • Distributed content-addressed storage │
│ • Peer discovery, automatic replication │
│ • DHT-based content routing │
└────────────────────────┬────────────────────────────────────┘
│ Backup & durability
┌────────────────────────▼────────────────────────────────────┐
│ Tier 4: Cloud Backends (S3, Storacha, Filecoin) │
│ • Long-term archival, geographical distribution │
│ • Economic persistence, compliance storage │
│ • Cross-region replication │
└─────────────────────────────────────────────────────────────┘
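The tier diagram above implies a read-through lookup: try the fastest tier first, fall through on a miss, and copy the content into the faster tiers on the way back. Below is a minimal sketch of that control flow, assuming each tier can be modeled as a plain dictionary. `TieredReader` and its method names are hypothetical and are not part of ipfs_kit_py.

```python
from typing import Optional


class TieredReader:
    """Toy read-through lookup across ordered tiers (fastest first).

    Hypothetical illustration only; not the ipfs_kit_py implementation.
    """

    def __init__(self, tier_names: list[str]) -> None:
        # One dict per tier, kept in fastest-to-slowest order.
        self.tiers: dict[str, dict[str, bytes]] = {name: {} for name in tier_names}

    def put(self, tier: str, key: str, value: bytes) -> None:
        self.tiers[tier][key] = value

    def get(self, key: str) -> tuple[Optional[bytes], Optional[str]]:
        """Return (value, name of the tier that served it), or (None, None)."""
        missed: list[str] = []
        for name, store in self.tiers.items():
            if key in store:
                value = store[key]
                # Promote: copy the value into every faster tier that missed.
                for faster in missed:
                    self.tiers[faster][key] = value
                return value, name
            missed.append(name)
        return None, None


reader = TieredReader(["memory", "disk", "network"])
reader.put("network", "example-cid", b"hello")
first = reader.get("example-cid")   # served by the slowest tier
second = reader.get("example-cid")  # served from memory after promotion
print(first[1], second[1])
```

The second read is served from memory because the first read copied the value upward. A real cache would also bound each tier's size and evict entries.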
from ipfs_kit_py.high_level_api import IPFSSimpleAPI
# Initialize with multiple backends
api = IPFSSimpleAPI(
storage_backends={
'ipfs': {'enabled': True},
'filecoin': {
'enabled': True,
'lotus_path': '/path/to/lotus'
},
's3': {
'enabled': True,
'bucket': 'my-ipfs-backup',
'region': 'us-west-2'
},
'storacha': {
'enabled': True,
'token': 'your_token',
'space': 'your_space_did'
}
}
)
# Content automatically distributed across backends
cid = api.add("important_data.txt", backends=['ipfs', 'filecoin', 's3'])

See Also: Storage Backends Documentation
The Walrus backend registers the walrus:// protocol with fsspec and supports
publisher writes, aggregator reads, direct blob-id reads, and index-backed
logical paths. ipfs_kit_py delegates the backend implementation to the
standalone walrus-fsspec package while preserving the historical
ipfs_kit_py.walrus_fsspec import path and Walrus environment variable aliases:
import fsspec
import ipfs_kit_py.walrus_fsspec # registers walrus://
fs = fsspec.filesystem("walrus")
entry = fs.pipe_file("walrus://examples/hello.txt", b"hello walrus\n")
with fsspec.open("walrus://examples/hello.txt", "rb") as handle:
    print(handle.read())
with fsspec.open(f"walrus://{entry['blob_id']}", "rb") as handle:
    print(handle.read())

Set WALRUS_PUBLISHER_URL for writes, WALRUS_AGGREGATOR_URL for reads,
and WALRUS_DELETE_URL for deletes. See the
Walrus fsspec integration guide for full
configuration, examples, and listing/deletion limitations.
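The difference between direct blob-id reads and index-backed logical paths can be pictured as a small local mapping from paths to blob ids. The sketch below assumes in-memory dictionaries stand in for both the blob store and the index, and that a blob id is simply a SHA-256 digest of the content. `LogicalPathIndex` and its methods are hypothetical and do not reflect the walrus-fsspec API or the real Walrus blob-id format.

```python
import hashlib


class LogicalPathIndex:
    """Toy model: blobs addressed by id, plus a path -> blob id index."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}  # stand-in for remote blob storage
        self.index: dict[str, str] = {}    # local logical-path index

    def write(self, path: str, data: bytes) -> str:
        # Assumption: the blob id is derived from the content (SHA-256 here).
        blob_id = hashlib.sha256(data).hexdigest()
        self.blobs[blob_id] = data
        self.index[path] = blob_id
        return blob_id

    def read(self, ref: str) -> bytes:
        # A ref found in the index is a logical path; otherwise treat it as a blob id.
        blob_id = self.index.get(ref, ref)
        return self.blobs[blob_id]


store = LogicalPathIndex()
blob_id = store.write("examples/hello.txt", b"hello walrus\n")
by_path = store.read("examples/hello.txt")  # resolved through the index
by_id = store.read(blob_id)                 # direct blob-id read
```

Both reads return the same bytes. The index only exists locally, which is one reason listing and deletion can behave differently from a regular filesystem.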
IPFS Kit provides sophisticated replica management for high availability and data durability:
Cluster-Based Replication:
# Set replication factor for automatic distribution
api = IPFSSimpleAPI(role="master")
# Add content with 3 replicas across cluster
result = api.cluster_add(
"dataset.tar.gz",
replication_factor=3, # Distribute to 3 nodes
replication_policy="distributed" # Strategy: distributed, local-first, geo-aware
)
# Check replication status
status = api.cluster_status(result['cid'])
print(f"Replicas: {len(status['peers'])} nodes")
print(f"Locations: {status['peer_locations']}")

Pin Management with Replication:
# Pin with min/max replica constraints
api.pin_add(
cid,
replication_min=2, # Minimum 2 copies
replication_max=5, # Maximum 5 copies
replication_priority="high" # Auto-repair if below min
)
# Monitor replica health
health = api.get_replication_health(cid)
# Returns: {'total': 3, 'healthy': 3, 'degraded': 0, 'locations': [...]}

Replication Policies:
- Distributed: Spread replicas across maximum geographic/network distance
- Local-First: Keep replicas in nearby nodes first, then expand
- Geo-Aware: Place replicas in specific regions or datacenters
- Cost-Optimized: Balance between redundancy and storage costs
- Latency-Optimized: Replicate to nodes with best access patterns
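To make the difference between two of these policies concrete, here is a simplified placement sketch. It assumes every node reports a region and a measured latency. `Node`, `place_replicas`, and the selection rules are illustrative assumptions, not the placement logic IPFS Kit actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    region: str
    latency_ms: float  # hypothetical latency measured from the writing node


def place_replicas(nodes: list[Node], count: int, policy: str) -> list[str]:
    """Pick up to `count` node names according to a simplified policy."""
    ranked = sorted(nodes, key=lambda n: n.latency_ms)
    if policy == "local-first":
        # Nearest nodes first.
        return [n.name for n in ranked[:count]]
    if policy == "distributed":
        # Take one node per region in turn so replicas span as many regions as possible.
        by_region: dict[str, list[Node]] = {}
        for node in ranked:
            by_region.setdefault(node.region, []).append(node)
        chosen: list[str] = []
        while len(chosen) < count and any(by_region.values()):
            for region_nodes in by_region.values():
                if region_nodes and len(chosen) < count:
                    chosen.append(region_nodes.pop(0).name)
        return chosen
    raise ValueError(f"unknown policy: {policy}")


nodes = [
    Node("a", "us-west", 5.0),
    Node("b", "us-west", 7.0),
    Node("c", "eu-central", 90.0),
    Node("d", "ap-south", 180.0),
]
local = place_replicas(nodes, 3, "local-first")   # ['a', 'b', 'c']
spread = place_replicas(nodes, 3, "distributed")  # ['a', 'c', 'd']
```

With three replicas, the local-first choice keeps two copies in the same region, while the distributed choice covers three regions at the cost of higher latency.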
Automatic Repair:
# Enable auto-repair for critical content
api.enable_auto_repair(
cid,
check_interval=3600, # Check every hour
repair_threshold=2, # Repair if below 2 replicas
target_replicas=3 # Maintain 3 replicas
)

See Also: Cluster Management, Pin Management
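The auto-repair parameters above suggest a simple decision rule: do nothing while the healthy replica count is at or above the threshold, otherwise create enough replicas to reach the target. The sketch below shows that rule in isolation. `plan_repair` is a hypothetical helper, and the real repair loop may weigh other factors.

```python
def plan_repair(healthy_replicas: int, repair_threshold: int, target_replicas: int) -> int:
    """Return how many new replicas to create; 0 means no repair is needed."""
    if target_replicas < repair_threshold:
        raise ValueError("target_replicas must be >= repair_threshold")
    if healthy_replicas >= repair_threshold:
        return 0
    return target_replicas - healthy_replicas


# Same numbers as the example above: repair below 2 replicas, restore to 3.
needed_when_degraded = plan_repair(healthy_replicas=1, repair_threshold=2, target_replicas=3)
needed_when_healthy = plan_repair(healthy_replicas=2, repair_threshold=2, target_replicas=3)
```

Keeping the threshold below the target avoids triggering a repair every time a single replica drops out briefly.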
IPFS Kit implements a sophisticated Adaptive Replacement Cache (ARC) with multiple tiers:
Cache Tiers:
- Memory Cache (T1/T2)
  - ARC algorithm balances recency vs frequency
  - Configurable size (default: 100MB)
  - Submillisecond access times
  - Automatic size-based decisions
- Disk Cache
  - Persistent across restarts
  - Heat-based eviction (access patterns + recency)
  - Memory-mapped for zero-copy access
  - Configurable size (default: 1GB+)
- Network Cache
  - IPFS network acts as distributed cache
  - Content-addressed retrieval
  - Peer caching benefits
from ipfs_kit_py.tiered_cache import TieredCacheManager
# Custom cache configuration
cache = TieredCacheManager(
config={
'memory_cache_size': 500 * 1024 * 1024, # 500MB
'disk_cache_size': 10 * 1024 * 1024 * 1024, # 10GB
'disk_cache_path': '/fast/ssd/cache',
'enable_mmap': True, # Zero-copy for large files
'eviction_policy': 'heat', # heat, lru, lfu
'promotion_threshold': 3, # Access count for promotion
}
)
# Cache operations (automatic tier selection)
cache.put(cid, content) # Intelligent tier placement
content = cache.get(cid) # Fastest available tier
# Cache statistics
stats = cache.get_stats()
print(f"Hit rate: {stats['hit_rate']:.2%}")
print(f"Memory: {stats['memory_usage']}, Disk: {stats['disk_usage']}")

Heat Scoring - Combines multiple factors:
- Access frequency (recent access count)
- Recency (time since last access)
- Content size (smaller = higher priority)
- Access pattern (sequential vs random)
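One way to combine these four factors into a single number is to multiply a frequency term, a recency decay, a size penalty, and a pattern bonus. The sketch below does that. The formula, the one-hour half-life, and the 1.2 bonus for sequential access are all assumptions chosen for illustration, not the weights IPFS Kit uses.

```python
import math


def heat_score(access_count: int, seconds_since_access: float, size_bytes: int,
               sequential: bool = False, half_life_s: float = 3600.0) -> float:
    """Higher score means the entry is a better candidate to keep cached."""
    frequency = math.log1p(access_count)                    # diminishing returns per access
    recency = 0.5 ** (seconds_since_access / half_life_s)   # halves every half_life_s
    size_factor = 1.0 / (1.0 + math.log2(1 + size_bytes / 1_048_576))  # smaller is hotter
    pattern = 1.2 if sequential else 1.0                    # assumed bonus for sequential reads
    return frequency * recency * size_factor * pattern


small_recent = heat_score(access_count=10, seconds_since_access=60, size_bytes=1024)
large_stale = heat_score(access_count=10, seconds_since_access=7200, size_bytes=100 * 1_048_576)
```

An eviction pass would then remove the entries with the lowest scores first; here `small_recent` scores higher than `large_stale`.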
Automatic Optimization:
- Content promoted from disk → memory on repeated access
- Large files use memory-mapped I/O (no duplication)
- Rarely accessed content demoted to network tier
- Cache pre-warming for predictable workloads
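Promotion on repeated access can be sketched with a hit counter and a threshold, mirroring the `promotion_threshold` setting shown earlier. The toy cache below has only a memory tier and a disk tier, and it demotes the least recently used memory entry back to disk when memory is full. `PromotingCache` is hypothetical and much simpler than a real ARC implementation.

```python
from collections import OrderedDict


class PromotingCache:
    """Two-tier toy cache: disk entries move to memory after repeated reads."""

    def __init__(self, memory_slots: int, promotion_threshold: int = 3) -> None:
        self.memory = OrderedDict()  # key -> bytes, least recently used first
        self.disk: dict[str, bytes] = {}
        self.hits: dict[str, int] = {}
        self.memory_slots = memory_slots
        self.promotion_threshold = promotion_threshold

    def put(self, key: str, value: bytes) -> None:
        self.disk[key] = value  # new content starts in the disk tier

    def get(self, key: str) -> bytes:
        if key in self.memory:
            self.memory.move_to_end(key)  # mark as most recently used
            return self.memory[key]
        value = self.disk[key]  # raises KeyError on a miss
        self.hits[key] = self.hits.get(key, 0) + 1
        if self.hits[key] >= self.promotion_threshold:
            self._promote(key, value)
        return value

    def _promote(self, key: str, value: bytes) -> None:
        if len(self.memory) >= self.memory_slots:
            # Demote the least recently used memory entry back to disk.
            old_key, old_value = self.memory.popitem(last=False)
            self.disk[old_key] = old_value
            self.hits[old_key] = 0
        self.memory[key] = value
        del self.disk[key]


cache = PromotingCache(memory_slots=1, promotion_threshold=3)
cache.put("cid-a", b"A")
for _ in range(3):
    cache.get("cid-a")
in_memory_after_three_reads = "cid-a" in cache.memory
```

After the third read the entry lives in memory only, so later reads skip the disk tier entirely.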
See Also: Tiered Cache Documentation
IPFS Kit provides a POSIX-like virtual filesystem on top of IPFS, enabling familiar file operations:
from ipfs_kit_py.vfs_manager import get_global_vfs_manager
vfs = get_global_vfs_manager()
# File operations (like regular filesystem)
vfs.mkdir("/data/projects")
vfs.write("/data/projects/notes.txt", "Project notes...")
content = vfs.read("/data/projects/notes.txt")
# Directory operations
files = vfs.ls("/data/projects")
vfs.mv("/data/projects/old", "/data/archive/old")
vfs.rm("/data/temp/cache.db")
# Batch operations
vfs.copy_recursive("/data/input", "/data/processed")

Buckets are isolated namespaces within the VFS for organizing content:
# Create and manage buckets
vfs.create_bucket("ml-models", quota="10GB", policy="hot")
vfs.create_bucket("datasets", quota="100GB", policy="warm")
vfs.create_bucket("archive", quota="1TB", policy="cold")
# Bucket operations
vfs.write("/ml-models/resnet50.h5", model_data)
vfs.set_bucket_policy("ml-models", {
'replication': 3,
'cache_priority': 'high',
'backup_schedule': 'daily'
})
# List buckets and usage
buckets = vfs.list_buckets()
for bucket in buckets:
    print(f"{bucket['name']}: {bucket['used']}/{bucket['quota']}")

Journaling & Change Tracking:
# Filesystem journal tracks all changes
journal = vfs.get_journal(since="2024-01-01")
for entry in journal:
    print(f"{entry['timestamp']}: {entry['operation']} {entry['path']}")
# Replicate changes to other nodes
vfs.replicate_journal(target_node="node2.example.com")

Metadata & Indexing:
# Automatic metadata extraction and indexing
vfs.write("/docs/paper.pdf", pdf_data,
metadata={'author': 'Smith', 'year': 2024})
# Enhanced pin index for fast lookup
results = vfs.search(query="machine learning", content_type="pdf")

See Also: VFS Management, Filesystem Journal
IPFS Kit integrates GraphRAG (Graph-based Retrieval Augmented Generation) for semantic search and knowledge management:
VFS GraphRAG indexing adds a dependency-light local index for virtual
filesystem metadata, text chunks, embedding metadata, graph entities,
relationships, snapshots, checkpoints, and portable export bundles. JSONL
storage works without live IPFS, vector database, LLM, or ipfs_datasets_py
services; optional adapters can provide richer chunking, embeddings, and
knowledge graph extraction.
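The idea of a dependency-light JSONL index can be illustrated with the standard library alone: append one JSON record per line, then scan the file and filter on text and metadata. The record layout and the `index_record` and `search` helpers below are hypothetical and do not describe the on-disk format ipfs_kit_py actually writes.

```python
import json
import tempfile
from pathlib import Path


def index_record(index_file: Path, path: str, text: str, metadata: dict) -> None:
    """Append one JSON object per line (JSONL)."""
    record = {"path": path, "text": text, "metadata": metadata}
    with index_file.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def search(index_file: Path, term: str, metadata_filters: dict) -> list[str]:
    """Return paths whose text contains `term` and whose metadata matches every filter."""
    matches = []
    with index_file.open(encoding="utf-8") as handle:
        for line in handle:
            record = json.loads(line)
            if term.lower() not in record["text"].lower():
                continue
            if all(record["metadata"].get(k) == v for k, v in metadata_filters.items()):
                matches.append(record["path"])
    return matches


with tempfile.TemporaryDirectory() as tmp:
    index_file = Path(tmp) / "research.jsonl"
    index_record(index_file, "/data/reports/policy.md",
                 "Data retention policy", {"classification": "public"})
    index_record(index_file, "/data/reports/internal.md",
                 "Internal policy notes", {"classification": "private"})
    hits = search(index_file, "policy", {"classification": "public"})
```

Because the file is append-only plain text, it needs no running service and is easy to copy into an export bundle. A linear scan like this only suits small indexes.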
python -m ipfs_kit_py.cli vfs index \
--index-root /tmp/vfs-graphrag \
--namespace research \
--path /data/reports/policy.md \
--backend local \
--protocol file \
--mime-type text/markdown \
--metadata-json '{"classification":"public"}'
python -m ipfs_kit_py.cli vfs search "policy" \
--index-root /tmp/vfs-graphrag \
--namespace research \
--type hybrid \
--filters-json '{"classification":"public"}'

from ipfs_kit_py.vfs_manager import VFSManager
vfs = VFSManager(storage_path="/srv/ipfs-kit-state")
vfs.enable_graphrag_indexing_sync(
index_path="/srv/ipfs-kit-state/.vfs_graphrag_index",
namespace="research",
)
vfs.index_namespace_sync("research", root_path="/data/reports", recursive=True)
results = vfs.search_sync(
"policy",
namespaces=["research"],
metadata_filters={"classification": "public"},
search_type="hybrid",
)

Export a searchable VFS snapshot with:
python -m ipfs_kit_py.cli vfs export-index \
--index-root /tmp/vfs-graphrag \
--namespace research \
--output /tmp/vfs-snapshot

See VFS GraphRAG Indexing for configuration, indexing workflows, metadata/vector/graph search examples, export and import bundles, privacy controls, dependency requirements, and backend limitations.
Automatic Content Indexing:
# All VFS operations auto-index content
vfs.write("/docs/research.md", markdown_content)
# → Automatic entity extraction, relationship mapping, graph building
# Search across indexed content
results = api.search_text("quantum computing applications")
results = api.search_graph("quantum computing", max_depth=2)
results = api.search_vector("semantic similarity query", threshold=0.7)

Entity Recognition:
- Automatic extraction of people, places, organizations, concepts
- Relationship mapping between entities
- RDF triple store for structured knowledge
- Graph analytics (centrality, importance scoring)
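As a small illustration of graph analytics over extracted relationships, the sketch below computes degree centrality from subject-predicate-object triples. The triples and the `degree_centrality` helper are made up for this example, and the source does not say which centrality measure IPFS Kit applies.

```python
Triple = tuple[str, str, str]  # (subject, predicate, object)


def degree_centrality(triples: list[Triple]) -> dict[str, float]:
    """Fraction of the other entities that each entity is directly connected to."""
    neighbours: dict[str, set[str]] = {}
    for subject, _predicate, obj in triples:
        neighbours.setdefault(subject, set())
        neighbours.setdefault(obj, set())
        if subject != obj:  # ignore self-loops
            neighbours[subject].add(obj)
            neighbours[obj].add(subject)
    n = len(neighbours)
    if n <= 1:
        return {name: 0.0 for name in neighbours}
    return {name: len(adjacent) / (n - 1) for name, adjacent in neighbours.items()}


triples = [
    ("ResNet50", "trainedOn", "ImageNet"),
    ("ResNet50", "authoredBy", "Smith"),
    ("Smith", "memberOf", "LabA"),
]
scores = degree_centrality(triples)
top_entity = max(sorted(scores), key=scores.get)
```

In this tiny graph `ResNet50` and `Smith` each touch two of the three other entities, so they share the highest score.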
Search Methods:
- Text Search - Full-text with relevance scoring
- Graph Search - Traverse knowledge graph connections
- Vector Search - Semantic similarity using embeddings
- SPARQL Queries - Structured RDF queries
- Hybrid Search - Combine multiple methods
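Combining several search methods requires merging ranked lists whose scores are not directly comparable. One common technique is reciprocal rank fusion (RRF), sketched below. The source does not state how IPFS Kit merges results, so treat this as an illustration of the general idea, with made-up document ids and the conventional constant `k = 60`.

```python
def reciprocal_rank_fusion(rankings: dict[str, list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge ranked result lists; each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for ranked_docs in rankings.values():
        for rank, doc in enumerate(ranked_docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


rankings = {
    "text": ["doc-a", "doc-b", "doc-c"],
    "graph": ["doc-b", "doc-a"],
    "vector": ["doc-b", "doc-c", "doc-a"],
}
fused = reciprocal_rank_fusion(rankings)
fused_order = [doc for doc, _score in fused]
```

RRF uses only rank positions, so it works even when text, graph, and vector scores are on different scales.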
# Hybrid search combines all methods
results = api.search_hybrid(
query="AI model deployment",
search_types=["text", "graph", "vector"],
limit=20,
min_score=0.6
)
# SPARQL for structured queries
results = api.search_sparql("""
SELECT ?model ?accuracy ?dataset
WHERE {
?model rdf:type :MLModel .
?model :accuracy ?accuracy .
?model :trainedOn ?dataset .
FILTER (?accuracy > 0.95)
}
""")

Graph Analytics:
# Analyze knowledge graph
stats = api.search_stats()
print(f"Entities: {stats['entity_count']}")
print(f"Relationships: {stats['relation_count']}")
print(f"Indexed documents: {stats['document_count']}")
# Find important entities
important = api.get_top_entities(limit=10, metric="centrality")

See Also: GraphRAG Documentation, Knowledge Graph
IPFS Kit provides a unified credential manager for securely storing API keys, tokens, and credentials:
from ipfs_kit_py.credential_manager import CredentialManager
cred_manager = CredentialManager()
# Add credentials for different services
cred_manager.add_s3_credentials(
name="production",
aws_access_key_id="AKIA...",
aws_secret_access_key="secret...",
region_name="us-west-2"
)
cred_manager.add_storacha_credentials(
name="default",
api_token="your_token",
space_did="did:web:..."
)
cred_manager.add_filecoin_credentials(
name="mainnet",
api_key="fil_api_key"
)
# Retrieve credentials securely
s3_creds = cred_manager.get_s3_credentials("production")
storacha_token = cred_manager.get_storacha_credentials()

YAML Configuration:
# ~/.ipfs_kit/config.yaml
storage:
  backends:
    ipfs:
      enabled: true
      api_addr: "/ip4/127.0.0.1/tcp/5001"
    filecoin:
      enabled: true
      lotus_path: "/path/to/lotus"
    s3:
      enabled: true
      credential_name: "production"
      bucket: "ipfs-backup"
      region: "us-west-2"
    storacha:
      enabled: true
      credential_name: "default"
cache:
  memory_size: 500MB
  disk_size: 10GB
  disk_path: "/fast/ssd/cache"
cluster:
  role: "master"
  replication_factor: 3
  peers:
    - "/ip4/10.0.0.2/tcp/9096"
    - "/ip4/10.0.0.3/tcp/9096"
vfs:
  buckets:
    ml-models:
      quota: 10GB
      policy: hot
      replication: 3
    datasets:
      quota: 100GB
      policy: warm
      replication: 2

# Credentials
export IPFS_KIT_S3_ACCESS_KEY="AKIA..."
export IPFS_KIT_S3_SECRET_KEY="secret..."
export W3_STORE_TOKEN="storacha_token"
export FILECOIN_API_KEY="fil_api_key"
# Configuration
export IPFS_PATH="/custom/ipfs/path"
export IPFS_KIT_CONFIG="/custom/config.yaml"
export IPFS_KIT_CACHE_DIR="/fast/ssd/cache"
# Feature flags
export IPFS_KIT_ENABLE_GRAPHRAG="true"
export IPFS_KIT_ENABLE_AUTO_HEALING="true"
# Optional: auto-install external daemon binaries (IPFS/Lotus) when missing
# Note: this may download platform-specific binaries.
export IPFS_KIT_AUTO_INSTALL_BINARIES="true"
# Optional: where downloaded binaries are placed
export IPFS_KIT_BIN_DIR="$HOME/.local/share/ipfs_kit_py/bin"

Credential Storage:
- Store credentials in ~/.ipfs_kit/credentials.json with chmod 600
- Never commit credentials to version control
- Use environment variables in CI/CD
- Consider system keyring integration for production
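The chmod 600 recommendation can be applied at file creation time, so the credentials file is never briefly world-readable. The sketch below uses only the standard library and writes to a temporary directory. `write_credentials` is a hypothetical helper, not part of `CredentialManager`, and the permission bits are only meaningful on POSIX systems.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_credentials(path: Path, credentials: dict) -> None:
    """Write JSON credentials to a file readable and writable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with mode 0o600 from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(credentials, handle)
    # Tighten permissions in case the file already existed with a looser mode.
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as tmp:
    cred_path = Path(tmp) / ".ipfs_kit" / "credentials.json"
    write_credentials(cred_path, {"s3": {"name": "production"}})
    mode = stat.S_IMODE(cred_path.stat().st_mode)
    loaded = json.loads(cred_path.read_text(encoding="utf-8"))
```

Passing the mode to `os.open` matters because a plain `open()` followed by `chmod` leaves a short window in which other users could read the file.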
Configuration Security:
- Separate configs for dev/staging/prod
- Use secrets management services (AWS Secrets Manager, Vault)
- Rotate credentials regularly
- Audit access logs
See Also: Credential Management, Secure Credentials Guide
# Install core features
pip install ipfs_kit_py
# Walrus and fsspec backends are included in the core dependency set.
# The lazy loader can also install declared feature dependencies at first use
# unless IPFS_KIT_AUTO_INSTALL_LAZY_DEPS=0 is set.
# Install with AI/ML support
pip install ipfs_kit_py[ai_ml]
# Install with all features
pip install ipfs_kit_py[full]
# Development installation
git clone https://github.com/endomorphosis/ipfs_kit_py.git
cd ipfs_kit_py
pip install -e .[dev]

The supervised implementation run completed the three active VFS/fsspec task
boards tracked under data/agent_supervisor/ipfs_kit_todo/state/:
| Track | Task board | Completed |
|---|---|---|
| Walrus fsspec backend | TODO_WALRUS_FSSPEC.md | 7 / 7 |
| fsspec backend improvements | TODO_FSSPEC_BACKENDS.md | 8 / 8 |
| VFS GraphRAG indexing | TODO_VFS_GRAPHRAG_INDEXING.md | 12 / 12 |
The state JSON files are the authoritative progress ledger. Some markdown checkboxes may still appear unchecked because the daemon could not rewrite the source boards after completing tasks, but the implementation state records all 27 tasks as completed with no ready, waiting, or blocked work remaining.
The Walrus, fsspec, and VFS GraphRAG work is available across the package, CLI, MCP server, dashboard, and browser SDK surfaces:
ipfs-kit walrus status
ipfs-kit walrus ls
ipfs-kit fsspec protocols
ipfs-kit graphrag status
ipfs-kit graphrag search "example query"

Python imports are available lazily from the package root:
from ipfs_kit_py import (
VFSGraphRAGIndex,
WalrusFileSystem,
WalrusStorageClient,
create_walrus_filesystem,
register_fsspec_implementations,
)

MCP clients can call walrus_status, walrus_list, walrus_get,
walrus_put, walrus_delete, fsspec_list_protocols,
fsspec_backend_status, fsspec_read, fsspec_write,
vfs_graphrag_status, vfs_graphrag_search,
vfs_graphrag_metadata_search, vfs_graphrag_vector_search,
vfs_graphrag_hybrid_search, vfs_graphrag_graph_search,
vfs_graphrag_graph_hybrid_search, and vfs_graphrag_export. The dashboard
JavaScript SDK also exposes these through MCP.Walrus, MCP.FSSpec, and
MCP.VFSGraphRAG.
from ipfs_kit_py.high_level_api import IPFSSimpleAPI
# Initialize
api = IPFSSimpleAPI()
# Add content
result = api.add("Hello, IPFS!")
cid = result['cid']
print(f"Content added: {cid}")
# Retrieve content
content = api.get(cid)
print(f"Retrieved: {content}")
# Pin content for persistence
api.pin(cid)
# List all pins
pins = api.list_pins()

from ipfs_kit_py.high_level_api import IPFSSimpleAPI
# Initialize as cluster master
api = IPFSSimpleAPI(role="master")
# Add content to cluster (distributed across nodes)
result = api.cluster_add("large_file.dat", replication_factor=3)
# Check replication status
status = api.cluster_status(result['cid'])
print(f"Replicated on {len(status['peers'])} nodes")
# List cluster peers
peers = api.cluster_peers()

from ipfs_kit_py.high_level_api import IPFSSimpleAPI
import pandas as pd
api = IPFSSimpleAPI()
# Store dataset
df = pd.read_csv("training_data.csv")
result = api.ai_dataset_add(
dataset=df,
metadata={
"name": "customer_data_v1",
"version": "1.0",
"description": "Customer behavior dataset"
}
)
# Retrieve dataset later
dataset_cid = result['cid']
loaded_df = api.ai_dataset_get(dataset_cid)

# Start MCP server with dashboard
ipfs-kit mcp start --port 8004
# Check server status
ipfs-kit mcp status
# View deprecation warnings
ipfs-kit mcp deprecations
# Start 3-node cluster
python tools/start_3_node_cluster.py

Comprehensive documentation is available in docs/:
- Installation Guide - Setup and requirements
- Quick Reference - Common operations
- API Reference - Complete API docs
- Cluster Guide - Cluster setup
- AI/ML Integration - Machine learning features
- MCP Server - MCP server documentation
- Examples - Code examples and tutorials
# Store application data immutably
import json
from ipfs_kit_py.high_level_api import IPFSSimpleAPI

api = IPFSSimpleAPI()
user_data = {"user_id": 123, "preferences": {...}}
cid = api.add(json.dumps(user_data))['cid']
# Share the CID with users; the data stays retrievable as long as it is pinned
share_url = f"ipfs://{cid}"

# Publish trained model
model_path = "model.h5"
result = api.ai_model_add(
model=load_model(model_path),
metadata={"architecture": "ResNet50", "accuracy": 0.95}
)
# Others can load your model
model = api.ai_model_get(result['cid'])

# Deploy content across cluster
api = IPFSSimpleAPI(role="master")
for file in website_files:
    api.cluster_add(file, replication_factor=5)
# Content automatically available on all nodes

# Backup with verification
result = api.add("important_data.zip", pin=True)
cid = result['cid']
# Later verification
assert api.exists(cid), "Backup lost!"
restored_data = api.get(cid)

from ipfs_kit_py.high_level_api import IPFSSimpleAPI
api = IPFSSimpleAPI(
role="master", # master, worker, or leecher
resources={
"max_memory": "2GB",
"max_storage": "100GB"
},
cache={
"memory_size": "500MB",
"disk_size": "5GB"
},
timeouts={
"api": 60,
"gateway": 120
}
)

# IPFS configuration
export IPFS_PATH=/path/to/.ipfs
export IPFS_KIT_CLUSTER_MODE=true
# MCP server
export IPFS_KIT_MCP_PORT=8004
export IPFS_KIT_DATA_DIR=~/.ipfs_kit
# Performance tuning
export IPFS_KIT_CACHE_SIZE=1GB
export IPFS_KIT_MAX_CONNECTIONS=50

# Run all tests
pytest
# Run specific test suite
pytest tests/unit/
pytest tests/integration/
# Run with coverage
pytest --cov=ipfs_kit_py --cov-report=html
# Run cluster tests
pytest tests/test_cluster_startup.py -v

We welcome contributions! See CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Python: 3.12+ required
- System: Linux (primary), macOS (supported), Windows (experimental)
- Memory: 4GB minimum, 8GB recommended for clusters
- Storage: 10GB minimum, 50GB+ recommended for production
- Network: Internet access for IPFS network connectivity
- Enhanced GraphRAG integration
- S3-compatible gateway
- WebAssembly support
- Mobile SDK (iOS/Android)
- Enhanced analytics dashboard
- Multi-region cluster support
This project is licensed under the AGPL-3.0 License - see the LICENSE file for details.
Built with:
- IPFS/Kubo - InterPlanetary File System
- IPFS Cluster - Cluster orchestration
- py-libp2p - LibP2P networking
- FastAPI - Modern web framework
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- ✅ Core IPFS operations - Production ready
- ✅ Cluster management - Production ready
- ✅ MCP server - Production ready
- ✅ AI/ML integration - Beta
- ✅ Auto-healing - Beta
- 🚧 GraphRAG - In development
- 📋 S3 Gateway - Planned
Version: 0.3.0
Status: Production Ready
Maintained by: Benjamin Barber (@endomorphosis)