Distributed File System (DFS)

A high-performance distributed file system implemented in modern C++.

System Architecture

The DFS is composed of three primary components: the NameNode, multiple DataNodes, and a Client Library.

NameNode (Metadata Master)

The NameNode manages the global namespace and file system state.

  • Metadata Registry: In-memory file tree backed by a write-ahead log (WAL) to ensure durability and crash recovery. Each FileData entry stores block IDs and an optional opaque encryption-key blob (set only for encrypted files).
  • Heartbeat Receiver: Tracks DataNode liveness and which blocks each node holds, based on periodic heartbeats and block reports.
  • Replication Manager: Listens to heartbeat deltas and schedules replication work for under‑ or over‑replicated blocks.
  • Authentication and Access Control: Verifies principal credentials and issues HMAC-signed, time-limited tokens that gate all RPCs via middleware.
  • RPC Frontend: A secure RPC server exposes operations such as Login, CreateFile, AllocateBlock, GetFileBlocks, Stat, StoreEncryptionMeta, GetEncryptionMeta, and heartbeat.

DataNode (Storage Worker)

DataNodes provide the storage backbone of the system.

  • Local Storage Engine: Manages 64MB data blocks. Block files are mapped directly into the process address space using memory-mapped I/O (mmap) for high-performance, zero-copy reads.
  • Concurrency Control: Uses a striped ConcurrentBlockStore with per-stripe reader–writer locks to allow high-concurrency access while serializing writes per block.
  • Health Monitoring: A background HeartbeatSender thread periodically sends heartbeats and incremental/full block reports to the NameNode, including disk usage and listen port.
  • Zero-Knowledge Storage: DataNodes store only raw bytes. When a file is uploaded via EncryptedDfsClient, those bytes are AES-256-GCM ciphertext — DataNodes cannot decrypt or inspect file contents.

Client Library

The client library provides a simplified interface to the distributed system.

  • Session Management: DfsSession manages authentication, NameNode RPCs, and DataNode connections, including connection pooling and token handling.
  • High-level API: DfsClient and ResilientDfsClient expose operations like mkdir, stat, put_file, and get_file.
  • Encrypted API: EncryptedDfsClient is a drop-in replacement for DfsClient that transparently encrypts all file data with AES-256-GCM before upload and decrypts on download. The plaintext FileKey never leaves the client process.
  • Fault Tolerance: A resilience layer implements exponential backoff, circuit breaking, and idempotent writes so transient NameNode/DataNode failures are retried safely.
  • Transport Security: All client↔NameNode and client↔DataNode traffic uses a SecureChannel (ECDH + AES-256-GCM) so data and metadata are encrypted on the wire.

Technical Features

1. Networking and RPC Protocol

  • Thread-Pooled Secure RPC Server: Built on POSIX networking primitives (send, recv, accept) with a fixed-size thread pool and a connection semaphore to bound concurrency.
  • Synchronous RPC: RPC calls use a request–response model over length-prefixed message channels (MessageChannel) layered on TcpStream.
  • Binary Frame Format: Messages follow a compact binary format (header + payload) using custom serialization to minimize overhead.

2. Authentication & Authorization

  • HMAC-Based Identity: Clients authenticate using a symmetric shared secret. The initial handshake uses HMAC-SHA256 to verify identity without transmitting secrets in plaintext.
  • Session-Based Tokens: Upon login, the NameNode issues an HMAC-signed AuthToken containing a permissions bitmask and expiry timestamp.
  • Granular Permissions: Supports POSIX-like permissions (Read, Write, Delete, List, Admin) enforced via middleware in the RPC layer.
  • Revocation System: The NameNode maintains an in-memory revocation list for immediate invalidation of compromised tokens.

3. Security & Integrity

  • Per-Session Transport Encryption: All traffic between Client, NameNode, and DataNodes is encrypted using AES-256-GCM. Symmetric keys are derived per-session via ECDH P-256 key exchange during the initial handshake.
  • Client-Side File Encryption: EncryptedDfsClient encrypts each file with a unique 256-bit FileKey before upload. The FileKey is PBKDF2-SHA256 wrapped (100,000 iterations) with the user's master password and stored as an opaque 76-byte blob on the NameNode. DataNodes store only ciphertext and GCM authentication tags.
  • Block Integrity: Every 64MB data block is protected by a SHA-256 hash stored in a local sidecar file on the DataNode. Bit-rot is detected automatically upon every read operation.
  • Secure Zeroing: Sensitive key material is stored in SecureBuffer containers that use platform-specific primitives (e.g., memset_s) to zero memory on destruction.

4. Client-Side Encryption (AES-256-GCM)

The EncryptedDfsClient provides transparent at-rest encryption:

Key Hierarchy

Master Password ──PBKDF2-SHA256 (100k iters, 16B random salt)──▶ Wrap Key (32B)
                                                                         │
Random FileKey (32B) ──AES-256-GCM (12B random nonce)──▶ EncryptedFileKey (76B)
                                                                         │
NameNode stores: salt(16) ‖ wrap_nonce(12) ‖ ciphertext+tag(48) = 76 bytes

Per-Block Encryption

Each 64MB block at index i uses a deterministic nonce derived from its position:

nonce = BE64(i) ‖ 0x00000000   (12 bytes, prevents nonce reuse across blocks)
AAD   = BE64(i)                 (8 bytes,  prevents block reordering attacks)

Upload Flow (put_file):

  1. Generate random FileKey (32B) and wrap it with PBKDF2 + AES-GCM → store 76B blob on NameNode
  2. Split plaintext into 64MB chunks; encrypt each chunk with the FileKey using its block index
  3. Upload ciphertext+tag to DataNodes — zero plaintext ever written to disk

Download Flow (get_file):

  1. Fetch the 76B blob from NameNode; re-derive wrap key via PBKDF2; unwrap FileKey
  2. Fetch each ciphertext block and decrypt in order; GCM authentication detects any tampering

5. Replication & Availability

  • Durable Single-Master Registry: The NameNode's namespace is backed by a WAL so metadata survives crashes and restarts.
  • Automated Replication: ReplicationManager consumes heartbeat deltas and maintains target replication by scheduling copy/delete commands.
  • Disk-Aware Placement: Replication targets are chosen based on reported disk usage to keep data reasonably balanced.
  • Failure Detection: Nodes are declared dead after missing heartbeats beyond a configurable threshold, which triggers re-replication of blocks they held.

Usage Guide

Prerequisites

  • CMake 3.20+
  • Clang or GCC with C++23 support
  • OpenSSL 3.0+

Building the Project

mkdir build
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)

Running the Services

From the repository root after building:

  1. Start the NameNode (metadata master):

    ./build/namenode
    • Listens on port 9000
    • Uses namenode.wal in the working directory for crash recovery
    • Registers a default admin principal (id=1) with full permissions
  2. Start one or more DataNodes:

    ./build/datanode --node-id 1 --data-dir /tmp/dn1
    ./build/datanode --node-id 2 --data-dir /tmp/dn2
    ./build/datanode --node-id 3 --data-dir /tmp/dn3 --port 9010

    Each DataNode:

    • Connects to the NameNode and sends periodic heartbeats with block reports
    • Serves WriteBlock, ReadBlock, and WriteBlockIdempotent RPCs
    • Stores blocks in the specified --data-dir
    • Listens on an OS-assigned ephemeral port by default (override with --port)

    Full option list:

    Flag        Default             Description
    --node-id   1                   Unique numeric identifier for this DataNode
    --nn-host   127.0.0.1           NameNode hostname
    --nn-port   9000                NameNode port
    --port      0 (ephemeral)       DataNode RPC listen port
    --data-dir  ./datanode_storage  Block storage directory
    --workers   4                   Number of RPC worker threads
  3. Shutdown: Press Ctrl+C (sends SIGINT) on any service for a graceful shutdown.

Using the Encrypted Client

#include "client/encrypted_dfs_client.hpp"
#include <cstring>  // std::memcpy

// Populate password bytes (e.g. from user input)
dfs::crypto::SecureBuffer password(my_password.size());
std::memcpy(password.data(), my_password.data(), my_password.size());

dfs::client::SessionConfig cfg{ .namenode_host = "127.0.0.1", .namenode_port = 9000, ... };
dfs::client::EncryptedDfsClient client(cfg, std::move(password));

client.connect();  // authenticates with NameNode

// Upload: plaintext is encrypted on the client before any network I/O
client.put_file("/data/secret.txt", plaintext_bytes);

// Download: ciphertext fetched from DataNodes, decrypted locally
auto plaintext = client.get_file("/data/secret.txt");

Testing and Verification

Automated Test Suite

The project uses ctest to run all 19 module and integration tests defined in CMakeLists.txt.

cmake --build build
cd build
ctest --output-on-failure

Running a Specific Test

ctest --test-dir build -R <TestName> -V
# e.g.:
ctest --test-dir build -R EncryptionTest -V

Or run the test binary directly:

./build/dfs_encryption_test

Module Test Reference

Binary CTest Name Coverage
dfs_block_manager_test BlockManagerTest Block file creation, mmap I/O, integrity checks
dfs_serialization_test SerializationTest Binary encode/decode round-trips
dfs_hash_test HashTest SHA-256 block hashing and sidecar files
dfs_block_store_test BlockStoreTest Block store read/write with integrity
dfs_network_test NetworkTest TCP stream send/recv
dfs_rpc_test RpcTest RPC request/response framing
dfs_crypto_test CryptoTest AES-256-GCM, ECDH, PBKDF2, random bytes
dfs_secure_channel_test SecureChannelTest ECDH handshake + encrypted channel
dfs_thread_pool_test ThreadPoolTest Thread pool task dispatch and shutdown
dfs_concurrent_block_store_test ConcurrentBlockStoreTest Concurrent read/write with striped locking
dfs_rpc_server_test RpcServerTest DataNode RPC server handler dispatch
dfs_heartbeat_test HeartbeatTest Heartbeat sender/receiver and node registry
dfs_metadata_registry_test MetadataRegistryTest Namespace operations and WAL persistence
dfs_replication_manager_test ReplicationManagerTest Under/over-replication detection and scheduling
dfs_auth_manager_test AuthManagerTest HMAC login, token issuance, revocation
dfs_client_test DfsClientTest Client session, namespace ops, block put/get
dfs_namenode_server_test NameNodeServerTest End-to-end NameNode + DataNode + client integration
dfs_fault_tolerance_test FaultToleranceTest Retry, backoff, and circuit-breaker logic
dfs_encryption_test EncryptionTest PBKDF2, key wrap/unwrap, block encryption, encrypted put/get (23 tests)

Project Structure

  • include/ — Module-based header definitions.
    • include/common/ — Shared types, crypto primitives, serialization, RPC, networking.
    • include/client/ — DFS client library (DfsClient, ResilientDfsClient, EncryptedDfsClient, session, key management).
    • include/namenode/ — NameNode server, metadata registry, auth, replication.
    • include/datanode/ — DataNode block storage and RPC server.
  • src/ — Core implementation logic.
  • tests/ — Unit and integration tests for all architectural layers.
