Overview

ByteForge is an experimental C++ project exploring whether quantized .gguf model files contain enough byte-level structure to support meaningful custom compression.

The project grew out of work with local Small Language Models (SLMs), where even aggressively quantized models often remain hundreds of megabytes to several gigabytes in size.

The goal is not to compete with mature compressors, but to understand how model files behave at the binary level and whether custom formats can exploit recurring patterns within metadata, tokenizer data, and quantized tensor blocks.

All experiments are lossless. Every compressed output must rebuild the original source bytes exactly.

Research Goals

Understand the byte structure of .gguf files.
Measure how compressible quantized models actually are.
Explore custom binary formats.
Experiment with nibble-based encoding schemes.
Test dictionary-based approaches.
Compare metadata regions versus tensor regions.
Build a reliable compression → decompression pipeline.
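
One rough way to approach the "how compressible" question is to compute the order-0 Shannon entropy of a region, once over whole bytes and once over 4-bit nibbles. The sketch below is illustrative only: the function names (entropy_bits, byte_entropy, nibble_entropy) are hypothetical and not part of ByteForge, and order-0 entropy ignores context, so it only bounds coders that treat each symbol independently. A compressor that models neighboring symbols may do better.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Order-0 Shannon entropy, in bits per symbol, of a histogram of counts.
// Hypothetical helper for illustration; not taken from the project.
template <std::size_t N>
double entropy_bits(const std::array<std::uint64_t, N>& counts) {
    std::uint64_t total = 0;
    for (auto c : counts) total += c;
    if (total == 0) return 0.0;  // empty input carries no information

    double h = 0.0;
    for (auto c : counts) {
        if (c == 0) continue;  // a zero-probability symbol contributes nothing
        const double p = static_cast<double>(c) / static_cast<double>(total);
        h -= p * std::log2(p);
    }
    return h;
}

// Entropy over whole bytes: ranges from 0.0 (constant data) to 8.0 (uniform).
double byte_entropy(const std::vector<std::uint8_t>& data) {
    std::array<std::uint64_t, 256> counts{};
    for (auto b : data) ++counts[b];
    return entropy_bits(counts);
}

// Entropy over 4-bit nibbles, with high and low halves pooled into one
// histogram: ranges from 0.0 to 4.0.
double nibble_entropy(const std::vector<std::uint8_t>& data) {
    std::array<std::uint64_t, 16> counts{};
    for (auto b : data) {
        ++counts[b >> 4];
        ++counts[b & 0x0F];
    }
    return entropy_bits(counts);
}
```

Running both functions separately on a metadata region and a tensor region gives a first, coarse comparison. Values close to 8.0 bits per byte suggest that a simple symbol-by-symbol coder has little to gain on that region.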

Core Workflow

GGUF File
    ↓
Read Raw Bytes
    ↓
Compress
    ↓
Write Custom Format
    ↓
Decompress
    ↓
Rebuild GGUF
    ↓
Byte-for-Byte Validation

A run counts as a success only when the rebuilt file matches the source byte for byte.
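
The workflow above can be sketched in memory as follows. This is a minimal illustration under stated assumptions: the run-length coder is a toy stand-in rather than ByteForge's actual format, all names (Bytes, rle_compress, rle_decompress, round_trips) are hypothetical, and file I/O is left out so the validation step stays in focus.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Toy run-length encoder: emits (run length 1..255, value) pairs.
// A placeholder for whatever custom format is under test.
Bytes rle_compress(const Bytes& in) {
    Bytes out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t run = 1;
        while (i + run < in.size() && in[i + run] == in[i] && run < 255) ++run;
        out.push_back(static_cast<std::uint8_t>(run));
        out.push_back(in[i]);
        i += run;
    }
    return out;
}

// Inverse of rle_compress. Returns false on malformed input
// (odd length or a zero-length run) instead of guessing.
bool rle_decompress(const Bytes& in, Bytes& out) {
    out.clear();
    if (in.size() % 2 != 0) return false;
    for (std::size_t i = 0; i < in.size(); i += 2) {
        if (in[i] == 0) return false;
        out.insert(out.end(), static_cast<std::size_t>(in[i]), in[i + 1]);
    }
    return true;
}

// Compress, decompress, then compare against the source byte for byte.
bool round_trips(const Bytes& original) {
    const Bytes packed = rle_compress(original);
    Bytes rebuilt;
    if (!rle_decompress(packed, rebuilt)) return false;
    return rebuilt == original;  // compares size and every element
}
```

On quantized tensor data a coder like this may well produce output larger than its input. The point of the sketch is the validation step: any candidate format can be slotted in behind the same round_trips check.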