ZipZap is a CLI tool that compresses text files using Huffman coding. It is built from scratch with custom data structures.
- Lossless compression using canonical Huffman coding
- Custom data structures implemented from scratch (no external DS libraries)
- Rich CLI interface with progress indicators and detailed output
- Compression analysis with file size reduction and time stats
- Debug tools to inspect codebooks, Huffman trees, and bitstreams
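To illustrate what canonical Huffman coding involves, here is a minimal sketch, not ZipZap's actual implementation. It uses the standard-library `heapq` and `Counter` for brevity, whereas ZipZap relies on its own data structures. The function names `code_lengths` and `canonical_codes` are hypothetical, and the ordering rule (shorter codes first, then by character) is an assumption about how ties are broken.

```python
import heapq
from collections import Counter
from itertools import count


def code_lengths(text: str) -> dict[str, int]:
    """Return the Huffman code length (in bits) for each character in text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # Assumption: a lone symbol still gets a 1-bit code.
        return {next(iter(freq)): 1}
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(weight, next(tiebreak), {ch: 0}) for ch, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {ch: depth + 1 for ch, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def canonical_codes(lengths: dict[str, int]) -> dict[str, str]:
    """Assign canonical codes: sort by (length, character), then count upward."""
    codes = {}
    code = 0
    prev_len = 0
    for ch, length in sorted(lengths.items(), key=lambda item: (item[1], item[0])):
        code <<= length - prev_len  # append zeros when the code length grows
        codes[ch] = format(code, f"0{length}b")
        code += 1
        prev_len = length
    return codes
```

For example, `canonical_codes(code_lengths("aaaabbc"))` gives `a` a 1-bit code and `b` and `c` 2-bit codes. Because canonical codes are fully determined by the code lengths, a decoder only needs the lengths to rebuild the same codebook.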
Requirements:
- Python 3.10+
- Poetry (recommended) or pip
# Clone the repository
git clone https://github.com/andrianllmm/zipzap.git
cd zipzap
# Install with Poetry
poetry install
# Or install with pip
pip install -e .
Compress a text file to .zz format:
zipzap zip input.txt -o output.zz
If no output file is specified, ZipZap creates <filename>.zz.
Decompress a .zz file back to text:
zipzap zap compressed.zz -o output.txt
If no output file is specified, ZipZap creates <filename>_decoded.txt.
zipzap zip input.txt --time
# Shows encoding and writing times
zipzap zip input.txt --codebook
# Displays character-to-code mappings with frequencies
zipzap zip input.txt --tree
# Shows the complete Huffman tree structure
zipzap zip input.txt --contents
# Displays the first few lines of the input and output files
The .zz file format consists of:
[Header]
- Number of unique characters (2 bytes)
- For each character:
  - UTF-8 byte length (2 bytes)
  - UTF-8 character bytes (variable)
  - Code length in bits (2 bytes)
[Body]
- Total bit length (4 bytes)
- Packed bitstream (variable)
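The layout above can be sketched as a pair of Python functions. This is an illustration under stated assumptions, not ZipZap's actual reader or writer: the names `pack_zz` and `unpack_zz` are hypothetical, and the byte order (big-endian), bit order (most significant bit first), and zero padding of the final byte are guesses that the format description does not specify.

```python
import struct


def pack_zz(lengths: dict[str, int], bits: str) -> bytes:
    """Serialize code lengths and a '0'/'1' bit string into the layout above."""
    out = bytearray(struct.pack(">H", len(lengths)))  # number of unique characters
    for ch, length in lengths.items():
        raw = ch.encode("utf-8")
        out += struct.pack(">H", len(raw))  # UTF-8 byte length
        out += raw                          # UTF-8 character bytes
        out += struct.pack(">H", length)    # code length in bits
    out += struct.pack(">I", len(bits))     # total bit length
    padded = bits + "0" * (-len(bits) % 8)  # assumption: zero-pad the last byte
    out += bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes(out)


def unpack_zz(data: bytes) -> tuple[dict[str, int], str]:
    """Inverse of pack_zz: recover the code lengths and the bit string."""
    (n_chars,) = struct.unpack_from(">H", data, 0)
    pos = 2
    lengths = {}
    for _ in range(n_chars):
        (n_bytes,) = struct.unpack_from(">H", data, pos)
        pos += 2
        ch = data[pos:pos + n_bytes].decode("utf-8")
        pos += n_bytes
        (code_len,) = struct.unpack_from(">H", data, pos)
        pos += 2
        lengths[ch] = code_len
    (n_bits,) = struct.unpack_from(">I", data, pos)
    pos += 4
    # Expand every body byte to 8 bits, then drop the padding bits at the end.
    bits = "".join(f"{byte:08b}" for byte in data[pos:])[:n_bits]
    return lengths, bits
```

Storing the total bit length lets the reader discard padding bits in the last byte, and storing only code lengths (rather than the codes themselves) is enough when the codes are canonical.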
Run tests with pytest:
poetry run pytest
Contributions are welcome! To get started:
- Fork the project
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a pull request
Found a bug or issue? Report it on the issues page.