GitClone++ is a standalone implementation of a Git-like version control system. Built from scratch as a systems engineering project, it follows the architectural concepts and content-addressable storage model underlying real Git. It supports a full staging index, dynamic branch resolution, working tree restoration, and a 3-way reconciliation status engine.
GitClone++ enforces a strict separation between the user-facing Porcelain layer and the low-level Plumbing layer, providing robust component isolation.
```mermaid
graph TD
    CLI[CLI Porcelain] --> Cmd[Commands Layer]
    Cmd --> Repo[Repository Layer]
    Repo --> Index[.git/index Staging Area]
    Repo --> Storage[Object Storage Plumbing]
    Storage -.-> Blobs[Blob Objects]
    Storage -.-> Trees[Tree Objects]
    Storage -.-> Commits[Commit Objects]
```
This project demonstrates strong systems programming patterns, performance optimization, and modern C++ fundamentals:
- Content-Addressable Storage: Files are indexed and stored entirely by the `SHA-1` hashes of their binary contents, bypassing standard filesystem hierarchies.
- The Git Object Model: Implements two-way serialization and deserialization for Git's core primitives: `blobs`, `trees`, and `commits`. Standard object headers (e.g., `blob <size>\0`) are injected on write and stripped on read during disk I/O.
- Staging / Index Design: Features a dedicated staging index (`.git/index`) operating as a caching layer between the physical working directory and the permanent repository snapshot.
- Branch References & Detached HEAD: Dynamically resolves symbolic branch pointers (`ref: refs/heads/main`) and handles explicit detached HEAD states.
- Working Tree Restoration (`checkout`): Performs multi-file operations. It compares existing index metadata to disk, identifies untracked conflicts to prevent data loss, removes obsolete tracked files, and restores the target blobs.
- 3-Way Reconciliation Status Engine: Features a status engine built on $O(1)$ lookups that triangulates state between the physical Working Directory, the Staging Index, and the HEAD Tree to categorize files into `staged`, `modified`, `deleted`, and `untracked`.
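The header handling described under the object model can be sketched as follows. This is a minimal illustration, not the project's actual code: `frame_object`, `parse_object`, and `ParsedObject` are hypothetical names, and the only format detail taken from the text above is the `<type> <size>\0` prefix.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper: inject the "<type> <size>\0" header in front of a raw
// payload, producing the byte sequence that would be hashed and written.
std::string frame_object(std::string_view type, std::string_view payload) {
    std::string out;
    out.append(type);
    out.push_back(' ');
    out.append(std::to_string(payload.size()));
    out.push_back('\0');
    out.append(payload);
    return out;
}

// Hypothetical result type for a parsed object.
struct ParsedObject {
    std::string type;
    std::string payload;
};

// Hypothetical helper: strip the header again and reject input whose declared
// size does not match the number of payload bytes actually present.
std::optional<ParsedObject> parse_object(std::string_view raw) {
    const auto nul = raw.find('\0');
    if (nul == std::string_view::npos) return std::nullopt;

    const std::string_view header = raw.substr(0, nul);
    const auto space = header.find(' ');
    if (space == std::string_view::npos) return std::nullopt;

    const std::string_view size_text = header.substr(space + 1);
    const char* const first = size_text.data();
    const char* const last = first + size_text.size();
    std::size_t declared = 0;
    const auto [ptr, ec] = std::from_chars(first, last, declared);
    if (ec != std::errc{} || ptr != last) return std::nullopt;

    const std::string_view payload = raw.substr(nul + 1);
    if (payload.size() != declared) return std::nullopt;

    return ParsedObject{std::string(header.substr(0, space)), std::string(payload)};
}
```

Under these assumptions, `frame_object("blob", "hello")` yields the 12 bytes `blob 5`, a NUL byte, then `hello`, and `parse_object` recovers the type and payload from them. Validating the declared size on read is a design choice of this sketch; it catches truncated object files early.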
| Command | Description |
|---|---|
| `mygit init` | Initializes a new empty repository in the current directory. |
| `mygit hash-object <file>` | Low-level plumbing command to calculate the SHA-1 of a physical file. |
| `mygit add <file\|.>` | Stages a file or directory for commit. Recursively skips ignored directories. |
| `mygit commit -m "<msg>"` | Snapshots the staging area into a permanent, traversable commit object. |
| `mygit log` | Traverses and formats the commit lineage backward from the active HEAD. |
| `mygit branch` | Lists all local branches, highlighting the active checkout context. |
| `mygit branch <name>` | Creates a new branch pointing to the current commit. |
| `mygit checkout <branch>` | Safely restores the working tree and index to the target branch state. |
| `mygit status` | Resolves and displays the exact state of the staging area and working directory. |
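To illustrate how `mygit status` could triangulate its three inputs, here is a minimal sketch. It assumes each side (HEAD tree, index, working directory) has already been reduced to a map from path to object id; `Snapshot`, `StatusReport`, and `classify` are hypothetical names, and reporting a staged removal under `staged` is an assumption of this sketch rather than documented behavior.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical snapshot type: repository-relative path -> hex object id.
using Snapshot = std::unordered_map<std::string, std::string>;

struct StatusReport {
    std::vector<std::string> staged;     // index differs from HEAD
    std::vector<std::string> modified;   // working file differs from index
    std::vector<std::string> deleted;    // in index, missing on disk
    std::vector<std::string> untracked;  // on disk, not in index
};

StatusReport classify(const Snapshot& head, const Snapshot& index,
                      const Snapshot& worktree) {
    StatusReport report;

    // Each index entry is compared against HEAD and against the disk state
    // using average-case O(1) hash-map lookups.
    for (const auto& [path, index_id] : index) {
        const auto in_head = head.find(path);
        if (in_head == head.end() || in_head->second != index_id) {
            report.staged.push_back(path);
        }
        const auto on_disk = worktree.find(path);
        if (on_disk == worktree.end()) {
            report.deleted.push_back(path);
        } else if (on_disk->second != index_id) {
            report.modified.push_back(path);
        }
    }

    // Assumption: a path present in HEAD but absent from the index is a
    // staged removal and is listed under "staged".
    for (const auto& entry : head) {
        if (index.count(entry.first) == 0) report.staged.push_back(entry.first);
    }

    // Anything on disk that the index does not know about is untracked.
    for (const auto& entry : worktree) {
        if (index.count(entry.first) == 0) report.untracked.push_back(entry.first);
    }

    // Hash maps iterate in unspecified order, so sort for stable output.
    for (auto* bucket : {&report.staged, &report.modified,
                         &report.deleted, &report.untracked}) {
        std::sort(bucket->begin(), bucket->end());
    }
    return report;
}
```

The whole pass is linear in the number of paths, with each cross-check being a single hash lookup. A file can legitimately appear in two buckets, for example staged and then modified again on disk.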
```bash
$ mygit init
Initialized empty Git repository in .git/
$ echo "int main() { return 0; }" > main.cpp
$ mygit status
On branch main
Untracked files:
  main.cpp
$ mygit add .
$ mygit commit -m "Initial commit"
[refs/heads/main 7e6b541] Initial commit
$ mygit status
On branch main
Nothing to commit, working tree clean
```

This project has zero external dependencies and relies entirely on standard C++20 and STL features.
Generate and Build via CMake:

```bash
cmake -B build
cmake --build build --config Release
```

While structurally accurate to real Git, this MVP diverges in a few specific ways for project scoping:
- Flattened Tree Snapshot: Instead of a deeply recursive hierarchy of `Tree` objects wrapping other nested directories, our `Tree` object stores a completely flattened list of paths (e.g., `src/utils/file.cpp`).
- Line-based Diffing: The system detects file modifications by comparing sizes and SHA-1 hashes, but it does not yet output inline LCS diff patches to the console.
- Packfiles: Objects are stored strictly loose inside `.git/objects/`. Packfile compression and delta encoding are not currently implemented.
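As a sketch of what loose storage can look like on disk, the helper below maps a 40-character hex object id to a file path. The fan-out into a two-character subdirectory mirrors real Git's loose-object layout; whether GitClone++ uses the same layout is an assumption here, and `loose_object_path` is a hypothetical name.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper. Assumed layout (as in real Git):
//   <git_dir>/objects/<first 2 hex chars>/<remaining 38 hex chars>
// Returns std::nullopt when the id is not exactly 40 hex characters.
std::optional<std::filesystem::path> loose_object_path(
    const std::filesystem::path& git_dir, std::string_view hex_id) {
    const bool all_hex = std::all_of(
        hex_id.begin(), hex_id.end(),
        [](unsigned char c) { return std::isxdigit(c) != 0; });
    if (hex_id.size() != 40 || !all_hex) return std::nullopt;

    return git_dir / "objects" / std::string(hex_id.substr(0, 2)) /
           std::string(hex_id.substr(2));
}
```

Splitting on the first two characters keeps any single directory from accumulating every object in the repository, which matters on filesystems that slow down with very large directories.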