Audit trail

Cryptographic integrity verification: merkle proofs, root progression, proof stability, and tamper detection.

Topics: Merkle proofs, SHA-256, verification, root progression, tamper detection

What this example covers

This example focuses exclusively on merkql's integrity layer—the merkle tree that backs every partition. No consumers are used; instead, the example works directly with the partition API to generate and verify proofs.

Merkle inclusion proofs

For any record at a given offset, generate a proof consisting of the leaf hash and the sibling hashes along the path to the root. A verifier can recompute the root from the leaf and confirm it matches—proving the record is unmodified.

Root hash progression

The merkle root changes with every append. Capturing the root at a point in time creates a cryptographic commitment to the entire log up to that point. If any record is later modified, the recomputed root no longer matches the captured one.

Proof stability

merkql uses an append-only binary carry chain for its merkle tree. Appending new records never modifies existing tree nodes, so proofs generated earlier remain valid after growth.

Content-addressed tamper detection

Every object in the pack file is stored under its SHA-256 hash. If the underlying bytes are modified on disk, re-reading and re-hashing produces a different hash—an immediate integrity violation.

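Figure: Merkle proof verification, shown as Prove and Verify panels. Prove walks from the leaf up to the root, collecting the sibling hashes s₁, s₂, s₃ into {leaf, siblings[], root}. Verify starts from the leaf hash, computes H(leaf+s₁), then H(..+s₂), then H(..+s₃), and checks that the resulting root' equals root.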

Step by step

Steps 1–2

Produce audit events

Open a broker with defaults (no compression, no retention) to preserve the full history—critical for audit scenarios. Produce 50 events with structured JSON payloads containing user, action, resource, and timestamp fields.

let broker = Broker::open(BrokerConfig::new(dir.path())).unwrap();
let producer = Broker::producer(&broker);

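let users = ["alice", "bob", "charlie", "diana", "eve"];
let actions = ["CREATE", "READ", "UPDATE", "DELETE", "LOGIN"];
// Hypothetical resource names: this listing uses `resources` below
// but does not show its definition.
let resources = ["/orders", "/invoices", "/users", "/reports"];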

for i in 0..50 {
    let event = AuditEvent {
        user: users[i % users.len()].to_string(),
        action: actions[i % actions.len()].to_string(),
        resource: resources[(i * 3) % resources.len()].to_string(),
        ts: 1700000000 + i as u64,
    };
    producer.send(&ProducerRecord::new(
        "audit", Some(event.user.clone()),
        serde_json::to_string(&event).unwrap(),
    )).unwrap();
}

Step 3

Generate and verify merkle proofs

Access the partition directly through broker.topic().partition(). For each of 5 sampled offsets, call partition.proof(offset) to generate a Proof struct containing the leaf hash, sibling hashes, and the root. Then verify with MerkleTree::verify_proof().

let topic = broker.topic("audit").unwrap();
let part_arc = topic.partition(0).unwrap();
let partition = part_arc.read().unwrap();

let sample_offsets = [0, 10, 25, 37, 49];
for &offset in &sample_offsets {
    let proof = partition.proof(offset).unwrap().unwrap();
    let valid = MerkleTree::verify_proof(
        &proof, partition.store()
    ).unwrap();
    assert!(valid);
}

The proof depth is the length of the path from the leaf to the root. Offset 49 (the last of 50 leaves) has depth 3 rather than 6 because it sits on the tree's right edge, which is shorter than the interior.

Offset  0: leaf=ed0b5b61ff5a084a... depth=6 valid=true
Offset 10: leaf=cf2a783ab91ecc1b... depth=6 valid=true
Offset 25: leaf=f0597ca3b071b2b3... depth=6 valid=true
Offset 37: leaf=75a8f5069a0171cf... depth=6 valid=true
Offset 49: leaf=7e26e538033f0c75... depth=3 valid=true
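
The depths printed above can be reproduced with a small calculation. The sketch below is not merkql code. It assumes the tree keeps one perfect subtree per set bit of the leaf count (50 = 32 + 16 + 2) and folds those subtrees into a single root from right to left. The function name `proof_depth` is hypothetical. Under that assumption, offsets 0 through 47 get depth 6 and offsets 48 and 49 get depth 3, which matches the output.

```rust
/// Number of sibling hashes in the inclusion proof for `offset` in a tree
/// of `n` leaves. ASSUMPTION: one perfect subtree per set bit of `n`,
/// folded right to left into the root. This layout is inferred from the
/// sample output, not taken from merkql's source.
fn proof_depth(n: u64, offset: u64) -> Option<u32> {
    if offset >= n {
        return None;
    }
    // Heights of the perfect subtrees, largest (leftmost) first.
    let heights: Vec<u32> = (0u32..64)
        .rev()
        .filter(|&b| (n & (1u64 << b)) != 0)
        .collect();
    let last = heights.len() - 1;
    let mut start = 0u64; // first offset covered by the current subtree
    for (i, &h) in heights.iter().enumerate() {
        let size = 1u64 << h;
        if offset < start + size {
            // `h` siblings inside the perfect subtree, plus one sibling
            // per fold step on the way up. The rightmost subtree takes
            // part in `i` folds, every other subtree in `i + 1`.
            let fold_steps = if i == last { i } else { i + 1 };
            return Some(h + fold_steps as u32);
        }
        start += size;
    }
    None
}
```

For example, `proof_depth(50, 49)` returns `Some(3)` and `proof_depth(50, 0)` returns `Some(6)`.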

Steps 4–6

Root hash progression

Capture the merkle root, append 10 more events, and capture it again. The root changes because each append creates new branch nodes that propagate up to a new root.

let root_before = partition.merkle_root().unwrap().unwrap();

// Produce 10 more events...

let root_after = partition.merkle_root().unwrap().unwrap();
assert_ne!(root_before, root_after);

In practice, you would publish or escrow the root hash at regular intervals. Any future claim that “the log contained X at time T” can be verified against the escrowed root.

Root before: a67c8df3b074b586...
Root after:  398fdbbf5afde7fe...
Root changed as expected.

Step 7

Proof stability after appending

The 5 proofs generated before appending are re-verified after the 10 new records are added. All still pass. This is a key property of merkql's merkle tree: new leaves are added to the right, and the existing tree nodes are immutable in the content-addressed object store.

for proof in &earlier_proofs {
    let valid = MerkleTree::verify_proof(
        proof, partition.store()
    ).unwrap();
    assert!(valid);
}

Note: each proof was captured with the root hash at the time of generation. The proof verifies against that root, not the current one. Since the underlying tree nodes haven't changed, the proof remains valid even though the tree has grown.

Proof for offset 0: still valid = true
Proof for offset 10: still valid = true
Proof for offset 25: still valid = true
Proof for offset 37: still valid = true
Proof for offset 49: still valid = true

Step 8

Tamper detection

This step demonstrates what happens when data is modified on disk. The example reads a record at offset 10, computes its SHA-256 hash, then directly corrupts the pack file by flipping bytes at the entry's storage location. When the data is re-read through the object store and re-hashed, the hash no longer matches.

// Read the record and compute its expected hash
let record = partition.read(tamper_offset).unwrap().unwrap();
let serialized = record.serialize();
let expected_hash = Hash::digest(&serialized);

// Corrupt the pack file on disk (flip bytes at the entry)...
tamper_pack_file(&pack_path, &expected_hash);

// Re-read through the store. The lookup by hash still finds the entry,
// but the bytes it returns are now different
let corrupted_data = partition.store().get(&expected_hash).unwrap();
let corrupted_hash = Hash::digest(&corrupted_data);
assert_ne!(expected_hash, corrupted_hash);

The tamper function walks the pack file's entry format ([4B length][32B hash][data]) to find and corrupt the target entry. In production, you wouldn't need this—the hash comparison alone detects any modification.

Record at offset 10: user=alice, action=CREATE
Expected hash: 36f0a3094a59beb0...
Tampered with pack file data for offset 10.
Re-read hash:  c24733d31c33d848...
Integrity check: hashes match = false (tamper DETECTED)
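
The entry walk used by the tamper function can be sketched as follows. This is an illustration, not the example's actual `tamper_pack_file`. It assumes the 4-byte length is a little-endian `u32` that counts only the data bytes; the text above gives the field order but not the byte order or what the length covers. The names `append_entry` and `find_entry_data` are hypothetical.

```rust
use std::ops::Range;

const HEADER_LEN: usize = 4 + 32; // [4B length][32B hash]

/// Append one entry in the assumed layout: length (LE u32), hash, data.
fn append_entry(pack: &mut Vec<u8>, hash: &[u8; 32], data: &[u8]) {
    pack.extend_from_slice(&(data.len() as u32).to_le_bytes());
    pack.extend_from_slice(hash);
    pack.extend_from_slice(data);
}

/// Walk the entries and return the byte range of the data that belongs
/// to `target`, or None if it is absent or the buffer is truncated.
fn find_entry_data(pack: &[u8], target: &[u8; 32]) -> Option<Range<usize>> {
    let mut pos = 0usize;
    while pos + HEADER_LEN <= pack.len() {
        // ASSUMPTION: little-endian length covering the data only.
        let len = u32::from_le_bytes([
            pack[pos],
            pack[pos + 1],
            pack[pos + 2],
            pack[pos + 3],
        ]) as usize;
        let hash = &pack[pos + 4..pos + HEADER_LEN];
        let data_start = pos + HEADER_LEN;
        let data_end = data_start.checked_add(len)?;
        if data_end > pack.len() {
            return None; // truncated entry
        }
        if hash == &target[..] {
            return Some(data_start..data_end);
        }
        pos = data_end; // skip to the next entry
    }
    None
}
```

Flipping any byte inside the returned range, for example with `pack[range.start] ^= 0xFF`, changes the data without touching the stored hash, which is the mismatch the integrity check reports.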

How the merkle tree works

Binary carry chain

merkql uses an incremental binary carry chain (similar to binary addition) for tree construction. When two entries at the same height exist, they merge into a branch one level up. This makes appends O(log n) and avoids rebuilding the tree from scratch.
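
A minimal sketch of such a carry chain follows; it is not merkql's implementation. To stay dependency-free it uses the standard library's `DefaultHasher` as a stand-in for SHA-256, so `Digest` is a `u64` here. The type `CarryChain`, the right-to-left fold in `root`, and treating a single leaf as its own root are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Digest = u64; // stand-in for a 32-byte SHA-256 digest

fn digest(bytes: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    h.write(bytes);
    h.finish()
}

fn hash_pair(left: Digest, right: Digest) -> Digest {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_be_bytes());
    buf[8..].copy_from_slice(&right.to_be_bytes());
    digest(&buf)
}

struct CarryChain {
    /// (height, subtree root) pairs with strictly decreasing heights,
    /// like the set bits of a binary counter.
    pending: Vec<(u32, Digest)>,
    count: u64,
}

impl CarryChain {
    fn new() -> Self {
        CarryChain { pending: Vec::new(), count: 0 }
    }

    /// Add a leaf, then "carry": while the rightmost pending entry has
    /// the same height as the new node, merge them one level up.
    fn append(&mut self, leaf: Digest) {
        let mut height = 0u32;
        let mut node = leaf;
        while let Some(&(h, left)) = self.pending.last() {
            if h != height {
                break;
            }
            self.pending.pop();
            node = hash_pair(left, node);
            height += 1;
        }
        self.pending.push((height, node));
        self.count += 1;
    }

    /// Fold the pending subtrees right to left (assumed order).
    fn root(&self) -> Option<Digest> {
        let mut peaks = self.pending.iter().rev();
        let mut acc = peaks.next()?.1;
        for &(_, left) in peaks {
            acc = hash_pair(left, acc);
        }
        Some(acc)
    }
}
```

After 50 appends the pending list holds subtrees of heights 5, 4, and 1, mirroring the binary representation of 50 (110010). Each append touches at most one entry per height, which is where the logarithmic bound comes from.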

Content-addressed storage

Every object—records, leaf nodes, branch nodes—is stored in a pack file keyed by its SHA-256 hash. This means identical content is never stored twice, and any modification to stored bytes is detectable by re-hashing.
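
Both properties, deduplication and detection by re-hashing, can be shown with an in-memory sketch. This is not merkql's pack file store. `SketchStore` and `get_verified` are hypothetical names, and the standard library's `DefaultHasher` stands in for SHA-256 so the block has no dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

type Digest = u64; // stand-in for a 32-byte SHA-256 digest

fn digest(bytes: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    h.write(bytes);
    h.finish()
}

struct SketchStore {
    objects: HashMap<Digest, Vec<u8>>,
}

impl SketchStore {
    fn new() -> Self {
        SketchStore { objects: HashMap::new() }
    }

    /// Store bytes under their own hash. Identical content maps to the
    /// same key, so it is kept only once.
    fn put(&mut self, bytes: &[u8]) -> Digest {
        let key = digest(bytes);
        self.objects.entry(key).or_insert_with(|| bytes.to_vec());
        key
    }

    /// Fetch bytes and re-hash them. A mismatch with the key means the
    /// stored bytes were modified after they were written.
    fn get_verified(&self, key: Digest) -> Result<&[u8], String> {
        let bytes = self
            .objects
            .get(&key)
            .ok_or_else(|| format!("object {:016x} not found", key))?;
        if digest(bytes) != key {
            return Err(format!("integrity violation for {:016x}", key));
        }
        Ok(bytes)
    }
}
```

Calling `put` twice with the same bytes returns the same key and leaves one stored object. If the stored bytes are altered behind the store's back, `get_verified` returns an error instead of the data.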

Proof structure

A Proof contains the leaf hash (the record's hash), a list of sibling hashes with their side (Left or Right), and the root hash. Verification walks from leaf to root, hashing pairs at each level.
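
The leaf-to-root walk can be sketched as below. This is a simplified illustration and not `MerkleTree::verify_proof`, which also takes the object store as an argument. `SketchProof`, `Side`, and `verify` are hypothetical names, `DefaultHasher` stands in for SHA-256, and the way two child hashes are combined is an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Digest = u64; // stand-in for a 32-byte SHA-256 digest

fn digest(bytes: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    h.write(bytes);
    h.finish()
}

fn hash_pair(left: Digest, right: Digest) -> Digest {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_be_bytes());
    buf[8..].copy_from_slice(&right.to_be_bytes());
    digest(&buf)
}

/// Which side of the running hash the sibling sits on.
#[derive(Clone, Copy)]
enum Side {
    Left,
    Right,
}

struct SketchProof {
    leaf: Digest,
    siblings: Vec<(Side, Digest)>, // ordered from leaf level to root
    root: Digest,
}

/// Recompute the root from the leaf and siblings, then compare it with
/// the root captured in the proof.
fn verify(proof: &SketchProof) -> bool {
    let mut acc = proof.leaf;
    for &(side, sibling) in &proof.siblings {
        acc = match side {
            Side::Left => hash_pair(sibling, acc),
            Side::Right => hash_pair(acc, sibling),
        };
    }
    acc == proof.root
}
```

In a four-leaf tree, the proof for the third leaf carries the fourth leaf as a Right sibling and the hash of the first two leaves as a Left sibling. Recording the side matters because hashing a pair is order-sensitive.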

Tree snapshots

The tree's state (pending entries and count) is persisted to tree.snapshot after every write using an atomic sequence: write to a temp file, fsync, then rename over the old snapshot. On reopen, the tree resumes without replaying the log.
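
The temp, fsync, rename pattern looks roughly like this using only the standard library. It is a generic sketch of the technique, not merkql's snapshot code. The function name `write_atomic` and the `.tmp` suffix are assumptions.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Replace `path` with `bytes` so that readers see either the old
/// contents or the new contents, never a partial write.
fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Write next to the target so the rename stays on one filesystem.
    let tmp = path.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(bytes)?;
        // Flush file contents and metadata to disk before the rename.
        file.sync_all()?;
    }
    // On POSIX systems rename replaces the destination atomically.
    fs::rename(&tmp, path)?;
    Ok(())
}
```

A crash before the rename leaves the previous snapshot untouched, and a crash after it leaves the complete new one. Stricter durability would also fsync the parent directory, which is platform-specific and omitted here.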

Figure: Root hash progression and tamper detection across three phases. Phase 1: 50 records with their root hash. Phase 2: 60 records after appending 10, with a new root hash; earlier proofs are still valid. Phase 3: a byte flip produces a hash mismatch. Append-only growth preserves history because new records never modify existing tree nodes. Any mutation breaks the hash chain, so tampering is detected immediately on re-read.

Regulatory compliance

For SOX, HIPAA, PCI-DSS, or GDPR audit requirements: periodically capture and escrow the merkle root, then hand auditors individual proofs. They can independently verify any record's inclusion without access to the full dataset.


APIs used

API                           Purpose
broker.topic(name)            Get a topic by name
topic.partition(id)           Get a partition (returns Arc<RwLock>)
partition.proof(offset)       Generate a merkle inclusion proof
MerkleTree::verify_proof()    Verify a proof against the object store
partition.merkle_root()       Get the current root hash
partition.read(offset)        Read a single record by offset
partition.store()             Access the content-addressed pack file store
store.get(hash)               Retrieve object bytes by SHA-256 hash
Hash::digest(data)            Compute SHA-256 hash
hash.to_hex()                 Convert hash to hex string for display
record.serialize()            Serialize record to bytes (for hashing)

cargo run -p merkql-audit-trail