DistGrep is a distributed grep system implemented in Go. It demonstrates key distributed-systems concepts, including leader election (using Raft) and distributed processing (using MapReduce).
The system consists of a fixed 4-node cluster. Each node runs:
- HTTP Server: Handles client requests for job submission and status checks.
- Raft Consensus Module: Manages leader election and cluster coordination.
- MapReduce Engine: Executes the distributed grep search logic.
By default, the 4 nodes are configured as follows:
| Node ID | HTTP Port | Raft Port | Data Dir |
|---|---|---|---|
| node-1 | 8081 | 9001 | /tmp/dgrep-node-1 |
| node-2 | 8082 | 9002 | /tmp/dgrep-node-2 |
| node-3 | 8083 | 9003 | /tmp/dgrep-node-3 |
| node-4 | 8084 | 9004 | /tmp/dgrep-node-4 |
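
The default layout in the table follows a simple pattern (HTTP port 8080+N, Raft port 9000+N, one data directory per node), so it could be generated rather than written out by hand. The sketch below is illustrative only: the `NodeConfig` type, its field names, and `defaultCluster` are assumptions made for this example and are not taken from the DistGrep source.

```go
package main

import "fmt"

// NodeConfig is a hypothetical type for this example; the real project
// may represent node settings differently.
type NodeConfig struct {
	ID       string
	HTTPPort int
	RaftPort int
	DataDir  string
}

// defaultCluster builds the four default entries listed in the table:
// node-N listens on HTTP port 8080+N and Raft port 9000+N, and stores
// its data under /tmp/dgrep-node-N.
func defaultCluster() []NodeConfig {
	nodes := make([]NodeConfig, 0, 4)
	for i := 1; i <= 4; i++ {
		id := fmt.Sprintf("node-%d", i)
		nodes = append(nodes, NodeConfig{
			ID:       id,
			HTTPPort: 8080 + i,
			RaftPort: 9000 + i,
			DataDir:  "/tmp/dgrep-" + id,
		})
	}
	return nodes
}
```

Deriving the ports from the node index keeps the four entries consistent with each other, which matters because the cluster size is fixed.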
- Prerequisite: Go 1.18 or higher
To start the 4-node cluster in a single process (simulation only):

    make run

This will spin up all 4 nodes. You will see logs indicating leader election and server startup.
Once the cluster is running and a leader has been elected, you can submit a grep job with an HTTP POST to the leader. Other nodes also accept the request, but forwarding to the leader might not be fully implemented, so sending the job directly to the leader is the safer choice.
Endpoint: POST /submit

Body:

    {
      "pattern": "pattern",
      "files": ["file1.txt", "/home/dir/"]
    }

Example: Search for "error" in two files:

    curl -X POST http://localhost:8081/submit \
      -H "Content-Type: application/json" \
      -d '{
        "pattern": "error",
        "files": ["/home/file1.txt", "/home/file2.txt"]
      }'

You can check the status of any node to see if it is the leader.
Endpoint: GET /status

Example:

    curl http://localhost:8081/status

Features:

- Distributed Search: Uses MapReduce to parallelize the search across available workers (simulated in this single-process version).
- Fault Tolerance: Uses Raft for coordination (Leader/Follower states).
- Match Highlighting: Returns matches with the pattern highlighted (ANSI colors).
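
To make the MapReduce and highlighting features concrete, here is a minimal sketch of grep expressed as a map step (scan one file, emit matching lines) and a reduce step (merge and sort the per-file results). It is not the project's actual code: the names `Match`, `highlight`, `mapGrep`, `reduceGrep`, and `grepAll` are hypothetical, the pattern is assumed to be a Go regular expression, the highlight color (red) is an arbitrary choice, and file contents are passed in memory instead of being read from paths.

```go
package main

import (
	"regexp"
	"sort"
	"strings"
	"sync"
)

const (
	ansiRed   = "\x1b[31m" // assumed highlight color
	ansiReset = "\x1b[0m"
)

// Match is a hypothetical result record for one matching line.
type Match struct {
	File string
	Line int    // 1-based line number
	Text string // the line, with each match wrapped in ANSI color codes
}

// highlight wraps every match of re in the line with ANSI red.
func highlight(re *regexp.Regexp, line string) string {
	return re.ReplaceAllStringFunc(line, func(m string) string {
		return ansiRed + m + ansiReset
	})
}

// mapGrep is the map step: scan one file's contents and emit its matching lines.
func mapGrep(re *regexp.Regexp, file, contents string) []Match {
	var out []Match
	for i, line := range strings.Split(contents, "\n") {
		if re.MatchString(line) {
			out = append(out, Match{File: file, Line: i + 1, Text: highlight(re, line)})
		}
	}
	return out
}

// reduceGrep is the reduce step: merge per-file results and sort them
// by file name, then by line number.
func reduceGrep(parts ...[]Match) []Match {
	var all []Match
	for _, p := range parts {
		all = append(all, p...)
	}
	sort.Slice(all, func(i, j int) bool {
		if all[i].File != all[j].File {
			return all[i].File < all[j].File
		}
		return all[i].Line < all[j].Line
	})
	return all
}

// grepAll runs one map task per file in its own goroutine (standing in
// for workers) and then reduces the results. files maps name to contents.
func grepAll(pattern string, files map[string]string) ([]Match, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		parts [][]Match
	)
	for name, contents := range files {
		wg.Add(1)
		go func(name, contents string) {
			defer wg.Done()
			part := mapGrep(re, name, contents)
			mu.Lock()
			parts = append(parts, part)
			mu.Unlock()
		}(name, contents)
	}
	wg.Wait()
	return reduceGrep(parts...), nil
}
```

Because the map tasks finish in no particular order, the reduce step sorts the merged results so that the output is deterministic.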

Project structure:

- main.go: Entry point, sets up the cluster.
- internal/:
  - coordinator/: Master logic for job distribution.
  - grep/: The specific Grep implementation (Map/Reduce functions).
  - http/: HTTP API server.
  - raft/: Raft consensus implementation.
  - mapreduce/: General MapReduce.
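
For programmatic use, the POST /submit and GET /status calls described earlier could be wrapped in a small Go client. The sketch below is an assumption-laden example, not part of DistGrep: `SubmitRequest`, `newSubmitRequest`, `newStatusRequest`, and `send` are hypothetical names. Only the request body fields (`pattern`, `files`) come from this README; the response formats are not documented here, so responses are returned as raw text and not decoded.

```go
package main

import (
	"bytes"
	"encoding/json"
	"io"
	"net/http"
)

// SubmitRequest mirrors the JSON body documented for POST /submit.
type SubmitRequest struct {
	Pattern string   `json:"pattern"`
	Files   []string `json:"files"`
}

// newSubmitRequest builds a POST /submit request against a node's base
// URL, for example "http://localhost:8081".
func newSubmitRequest(baseURL string, job SubmitRequest) (*http.Request, error) {
	body, err := json.Marshal(job)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/submit", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// newStatusRequest builds a GET /status request against a node's base URL.
func newStatusRequest(baseURL string) (*http.Request, error) {
	return http.NewRequest(http.MethodGet, baseURL+"/status", nil)
}

// send executes the request and returns the HTTP status code and the raw
// response body. The body is left undecoded because its schema is not
// documented in this README.
func send(client *http.Client, req *http.Request) (int, string, error) {
	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(raw), nil
}
```

With the cluster running, passing the result of `newSubmitRequest` to `send(http.DefaultClient, req)` would be the equivalent of the curl example for job submission.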