merkql

Kafka semantics. Merkle integrity. Zero infrastructure.
An embedded event log for Rust.

101 tests · Crash-safe · LZ4 · Jepsen-verified · Fuzz-tested

Why merkql

Zero infrastructure

No JVM, no ZooKeeper, no network, no containers. Add a dependency, open a directory. Your event log starts when your process starts.

Tamper-evident

Every partition is a merkle tree. Every record gets a SHA-256 inclusion proof. Prove any record hasn't been modified—the math is the trust model.

Crash-safe

Atomic writes (temp+fsync+rename) for all metadata. Index fsynced on every write. Jepsen-style fault injection proves recovery from crashes and truncation.
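
The temp+fsync+rename pattern can be sketched with the standard library alone. This is an illustration of the technique, not merkql's actual code; the directory fsync at the end is what makes the rename itself durable on POSIX filesystems.

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

/// Write `data` to `path` atomically: write a temp file in the same
/// directory, fsync it, then rename over the target. Readers observe
/// either the old contents or the new contents, never a partial write.
fn atomic_write(path: &Path, data: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut f = File::create(&tmp)?;
    f.write_all(data)?;
    f.sync_all()?;           // fsync: data reaches disk before the rename
    fs::rename(&tmp, path)?; // atomic replacement on POSIX filesystems
    // fsync the parent directory so the rename itself survives a crash
    if let Some(dir) = path.parent() {
        File::open(dir)?.sync_all()?;
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("atomic-write-demo");
    fs::create_dir_all(&dir)?;
    let target = dir.join("meta.bin");
    atomic_write(&target, b"v1")?;
    atomic_write(&target, b"v2")?; // replaces v1 atomically
    assert_eq!(fs::read(&target)?, b"v2");
    Ok(())
}
```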

Kafka-compatible

Topics, partitions, consumer groups, offset management. Subscribe, poll, commit, close—the lifecycle you already know.

Concurrent

RwLock per partition, RwLock on topic map, Mutex per consumer group. Readers never block readers. Writers only block their own partition.
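
A toy mirror of that locking layout, built from std types (the real types and fields are merkql internals): the topic map sits behind one RwLock, each partition behind its own, so a writer holds only its partition's write lock while any number of readers proceed elsewhere.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Illustrative stand-ins for merkql's internal structures.
type Partition = Vec<String>;
type Topics = RwLock<HashMap<String, Arc<RwLock<Partition>>>>;

fn main() {
    let topics: Arc<Topics> = Arc::new(RwLock::new(HashMap::new()));
    topics
        .write()
        .unwrap()
        .insert("events".to_string(), Arc::new(RwLock::new(Vec::new())));

    // A writer takes the topic map's read lock only long enough to clone
    // the partition handle, then holds just that partition's write lock.
    let t = Arc::clone(&topics);
    let writer = thread::spawn(move || {
        let map = t.read().unwrap();
        let part = Arc::clone(map.get("events").unwrap());
        drop(map); // release the map lock before the (possibly long) write
        part.write().unwrap().push("record-0".to_string());
    });

    // Readers take read locks, which never block other readers.
    let readers: Vec<_> = (0..4)
        .map(|_| {
            let t = Arc::clone(&topics);
            thread::spawn(move || t.read().unwrap().len())
        })
        .collect();

    writer.join().unwrap();
    for r in readers {
        assert_eq!(r.join().unwrap(), 1);
    }
    let map = topics.read().unwrap();
    assert_eq!(map.get("events").unwrap().read().unwrap().len(), 1);
}
```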

Persistent and compact

All state survives restarts. Optional LZ4 compression. Configurable retention with max_records per topic. Batch API amortises fsync.
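
The batch API's win comes from moving the fsync to the batch boundary: one fsync per batch instead of one per record, while every record in an acknowledged batch is still on disk. In merkql this is producer.send_batch (see the API mapping); the underlying idea can be sketched with std alone, using a hypothetical length-prefixed framing:

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

/// Append a batch of records to a log file with a single fsync at the end.
/// The framing here (4-byte little-endian length prefix) is illustrative,
/// not merkql's on-disk format.
fn append_batch(path: &Path, records: &[&[u8]]) -> std::io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    for r in records {
        file.write_all(&(r.len() as u32).to_le_bytes())?;
        file.write_all(r)?;
    }
    file.sync_all() // one fsync covers the whole batch
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("batch-demo.log");
    let _ = std::fs::remove_file(&path);
    let batch: Vec<&[u8]> = vec![b"order-1", b"order-2", b"order-3"];
    append_batch(&path, &batch)?;
    // 3 records, each a 4-byte length prefix + 7-byte payload
    assert_eq!(std::fs::read(&path)?.len(), 3 * (4 + 7));
    Ok(())
}
```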


Performance

Single-threaded, single-partition, uncompressed. Batch writes amortise fsync.

31 µs     append (256B)
179K/s    sequential read
4.8 µs    random access
150 µs    proof gen+verify

See BENCHMARKS.md for full results.


Use cases

Tamper-evident audit logs

Regulatory environments (SOX, HIPAA, PCI-DSS, GDPR) require demonstrating that audit records haven't been modified after the fact. merkql provides a cryptographic inclusion proof for every record—hand an auditor a proof and a root hash and they can independently verify authenticity without access to your systems.

Produce and verify an audit event
// Write an audit event
let producer = Broker::producer(&broker);
producer.send(&ProducerRecord::new("audit",
    Some("user-1001".into()),
    r#"{"action":"access_patient_record","id":"R-2847"}"#
)).unwrap();

// Later: generate a proof for any record
let topic = broker.topic("audit").unwrap();
let partition = topic.partition(0).unwrap().read().unwrap();
let proof = partition.proof(0).unwrap().unwrap();

// Anyone can verify independently
assert!(MerkleTree::verify_proof(&proof, partition.store()).unwrap());

Event sourcing without infrastructure

Event-sourced applications need an append-only log with multiple consumer groups, offset tracking, retention, and durability. merkql provides all of these as a library—no Docker, no cluster management, no network configuration.

Domain events with compression and retention
let config = BrokerConfig {
    compression: Compression::Lz4,
    default_retention: RetentionConfig { max_records: Some(100_000) },
    ..BrokerConfig::new("./data/events")
};
let broker = Broker::open(config).unwrap();

// Each projection gets its own consumer group
let mut projections = Broker::consumer(&broker, ConsumerConfig {
    group_id: "order-summary-projection".into(),
    auto_commit: false,
    offset_reset: OffsetReset::Earliest,
});

Integration testing

Replace a real Kafka cluster in your test suite. Same produce/subscribe/poll/commit lifecycle, starts in microseconds from a temp directory, provides the same consumer group semantics—without Docker, port conflicts, or test isolation problems.

Test with real consumer group semantics
#[test]
fn order_processing_pipeline() {
    let dir = tempfile::tempdir().unwrap();
    let broker = Broker::open(BrokerConfig::new(dir.path())).unwrap();

    // Simulate upstream events
    let producer = Broker::producer(&broker);
    for order in test_orders() {
        producer.send(&ProducerRecord::new("orders", None, order)).unwrap();
    }

    // Run your real consumer logic
    let mut consumer = Broker::consumer(&broker, config);
    consumer.subscribe(&["orders"]).unwrap();
    let records = consumer.poll(Duration::from_millis(100)).unwrap();
    assert_eq!(records.len(), test_orders().len());
}

Edge and embedded systems

IoT gateways, POS terminals, medical devices, vehicle telemetry—anywhere you need local event buffering with integrity guarantees and no network dependencies. LZ4 compression keeps storage manageable. Retention policies prevent unbounded growth.

Local development

Run your Kafka-based microservices locally without Docker Compose. Same topic/partition/consumer-group semantics, zero startup time.


Quick start

cargo add merkql
Produce and consume
use merkql::broker::{Broker, BrokerConfig};
use merkql::consumer::{ConsumerConfig, OffsetReset};
use merkql::record::ProducerRecord;
use std::time::Duration;

let broker = Broker::open(BrokerConfig::new("/tmp/my-log")).unwrap();

// Produce
let producer = Broker::producer(&broker);
producer.send(&ProducerRecord::new(
    "events", Some("user-1".into()), r#"{"action":"login"}"#
)).unwrap();

// Consume
let mut consumer = Broker::consumer(&broker, ConsumerConfig {
    group_id: "my-service".into(),
    auto_commit: false,
    offset_reset: OffsetReset::Earliest,
});
consumer.subscribe(&["events"]).unwrap();
let records = consumer.poll(Duration::from_millis(100)).unwrap();

for record in &records {
    println!("{}: {}", record.topic, record.value);
}

consumer.commit_sync().unwrap();
consumer.close().unwrap();
Verify integrity
let topic = broker.topic("events").unwrap();
let partition = topic.partition(0).unwrap().read().unwrap();
let proof = partition.proof(0).unwrap().unwrap();

use merkql::tree::MerkleTree;
assert!(MerkleTree::verify_proof(&proof, partition.store()).unwrap());

Kafka API mapping

If you know Kafka, you already know merkql.

merkql                              Kafka
Broker::open(config)                bootstrap.servers
Broker::producer(broker)            new KafkaProducer<>(props)
producer.send(record)               producer.send(record)
producer.send_batch(records)        merkql-specific
Broker::consumer(broker, config)    new KafkaConsumer<>(props)
consumer.subscribe(&["topic"])      consumer.subscribe(List.of(...))
consumer.poll(timeout)              consumer.poll(Duration)
consumer.commit_sync()              consumer.commitSync()
consumer.close()                    consumer.close()

Correctness

Jepsen-style test suite: fault injection with real assertions, property-based testing, and data-backed correctness claims at scale.

Merkle tree structure: an 8-leaf binary tree (leaves h₀…h₇ under internal nodes h₀₋₁ … h₆₋₇, h₀₋₃, h₄₋₇, and the root). The inclusion proof for record h₁ is 3 sibling hashes, one per level, so verification is O(log n): hash from the leaf up to the root, combining with the sibling at each level, and compare against the known root.
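
That verification walk can be sketched structurally. This sketch substitutes std's non-cryptographic DefaultHasher for the SHA-256 merkql actually uses (std ships no SHA-256), so it shows only the shape of proof verification, not a secure implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for SHA-256: combine two child hashes into a parent hash.
fn h(left: u64, right: u64) -> u64 {
    let mut s = DefaultHasher::new();
    left.hash(&mut s);
    right.hash(&mut s);
    s.finish()
}

// Stand-in for hashing a record into a leaf.
fn leaf(data: &[u8]) -> u64 {
    let mut s = DefaultHasher::new();
    data.hash(&mut s);
    s.finish()
}

/// Verify an inclusion proof: re-hash from the leaf upward, combining with
/// the sibling at each level. `path` holds (sibling_hash, sibling_is_left)
/// per level, from leaf level to just below the root.
fn verify(leaf_hash: u64, path: &[(u64, bool)], root: u64) -> bool {
    let mut acc = leaf_hash;
    for &(sibling, sibling_is_left) in path {
        acc = if sibling_is_left { h(sibling, acc) } else { h(acc, sibling) };
    }
    acc == root
}

fn main() {
    // Four-leaf tree: root = h(h(l0, l1), h(l2, l3))
    let l: Vec<u64> = [b"a", b"b", b"c", b"d"].iter().map(|d| leaf(*d)).collect();
    let root = h(h(l[0], l[1]), h(l[2], l[3]));
    // Proof for leaf 1: sibling l0 (left of us), then h(l2, l3) (right of us)
    let path = [(l[0], true), (h(l[2], l[3]), false)];
    assert!(verify(l[1], &path, root));
    // A different leaf with the same path fails to reproduce the root.
    assert!(!verify(l[0], &path, root));
}
```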
101 tests passing · total order verified across 10K records · 10K merkle proofs verified · 5 fault scenarios
Property            Evidence
Total order         Offsets monotonically increasing, gap-free across 10K records and 4 partitions
Durability          5,000 records survive 3 broker close/reopen cycles
Exactly-once        Consumer groups deliver every record exactly once across 4 commit/restart phases
Merkle integrity    100% valid inclusion proofs for 10,000 records
Byte fidelity       500 edge-case payloads preserved exactly (unicode, CJK, RTL, up to 64KB)
Crash recovery      1,000 records recovered after 10 ungraceful drop cycles

Fault injection

Every fault test asserts correctness—not just survival.

Fault                         Assertion
Crash (drop without close)    All committed records recovered
Truncated snapshot            Safe failure or graceful reopen
Truncated index               Reopens with N-1 records, zero read errors
Missing snapshot              All records readable, new appends succeed
Index ahead of snapshot       Records readable, proofs valid

Comparison

                    merkql            Kafka             SQLite WAL    Custom file
Consumer groups     ✓                 ✓                 ✗             ✗
Offset tracking     ✓                 ✓                 ✗             ✗
Partitions          ✓                 ✓                 ✗             ✗
Tamper detection    Merkle proofs     ✗                 ✗             ✗
Deployment          Library           Cluster           Library       Library
Compression         LZ4               Multiple          ✗             Manual
Retention           ✓                 ✓                 Manual        Manual
Crash safety        Atomic + fsync    Replicated log    WAL           Manual
Distributed         ✗                 ✓                 ✗             ✗

Architecture

Content-addressed storage with a git-style object layout.

Architecture overview: Broker → Topic → Partition → {PackFileStore, MerkleTree, OffsetTracker}. The Broker holds the topic map behind a RwLock<HashMap>; each Topic holds a RwLock<Partition> per partition; a Partition owns its store, tree, and offsets and serves append, read, and proof. PackFileStore is content-addressed SHA-256 storage with [len: 4B][hash: 32B][data] framing; MerkleTree is an incremental, snapshot-safe binary carry chain; OffsetTracker is a fixed-width index. The RwLock boundaries mark ownership.
.merkql/
  topics/
    {topic-name}/
      meta.bin                    # Topic config (partition count)
      partitions/
        {id}/
          objects/ab/cdef...      # Content-addressed merkle nodes + records
          offsets.idx             # Fixed-width index: offset → record hash
          tree.snapshot           # Incremental tree state (atomic write)
          retention.bin           # Retention marker (atomic write)
  groups/
    {group-id}/
      offsets.bin                 # Committed offsets per topic-partition
  config.bin                      # Broker config
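
Because offsets.idx is fixed-width, random access costs one seek plus one read, independent of log size: offset n lives at byte n × entry_width. A sketch of the lookup, assuming a hypothetical 32-byte entry (one record hash per offset; merkql's actual entry layout is not specified here):

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom, Write};

// Hypothetical entry width: one 32-byte hash per offset. Illustrative only.
const ENTRY: u64 = 32;

/// O(1) lookup: seek directly to the entry for `offset` and read it.
fn hash_at(idx: &mut File, offset: u64) -> std::io::Result<[u8; 32]> {
    idx.seek(SeekFrom::Start(offset * ENTRY))?;
    let mut buf = [0u8; 32];
    idx.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> std::io::Result<()> {
    // Build a fake index: entry i is 32 bytes of the value i.
    let path = std::env::temp_dir().join("offsets-demo.idx");
    let mut f = File::create(&path)?;
    for i in 0u8..4 {
        f.write_all(&[i; 32])?;
    }
    let mut f = File::open(&path)?;
    assert_eq!(hash_at(&mut f, 2)?, [2u8; 32]);
    Ok(())
}
```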

Every object is keyed by its SHA-256 hash—identical data is never stored twice. The merkle tree uses an incremental binary carry chain: appends are O(log n) and the tree state is captured in a compact snapshot that survives restarts without replaying the log.
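
The carry-chain idea: keep a stack of perfect-subtree roots, one per set bit of the leaf count, and merge equal-height subtrees on append exactly like carries in a binary counter. A sketch under stated assumptions (DefaultHasher stands in for SHA-256, and the real node format and snapshot encoding differ):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for SHA-256: combine two subtree roots into a parent root.
fn h2(l: u64, r: u64) -> u64 {
    let mut s = DefaultHasher::new();
    l.hash(&mut s);
    r.hash(&mut s);
    s.finish()
}

/// Incremental merkle accumulator as a binary carry chain: `peaks` holds
/// (height, root) for each perfect subtree, tallest first. Appending does
/// O(log n) hashes, and the peaks vector *is* the whole restartable state,
/// which is why a compact snapshot can avoid replaying the log.
struct CarryChain {
    peaks: Vec<(u32, u64)>,
}

impl CarryChain {
    fn new() -> Self {
        CarryChain { peaks: Vec::new() }
    }

    fn append(&mut self, leaf: u64) {
        let mut node = (0u32, leaf);
        // Like binary increment: while the top peak has the same height,
        // merge it with the new node and carry upward.
        while let Some(&(hgt, root)) = self.peaks.last() {
            if hgt != node.0 {
                break;
            }
            self.peaks.pop();
            node = (hgt + 1, h2(root, node.1));
        }
        self.peaks.push(node);
    }

    /// Fold the peaks (right to left) into a single root.
    fn root(&self) -> Option<u64> {
        let mut it = self.peaks.iter().rev();
        let first = it.next()?.1;
        Some(it.fold(first, |acc, &(_, r)| h2(r, acc)))
    }
}

fn main() {
    let mut t = CarryChain::new();
    for i in 0u64..10 {
        t.append(i);
    }
    // 10 leaves = 0b1010: two perfect subtrees remain (heights 3 and 1),
    // mirroring the set bits of the leaf count.
    assert_eq!(t.peaks.len(), 2);
    assert!(t.root().is_some());
}
```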


What merkql does not do

No distributed replication

merkql runs on one machine. If you need fault tolerance across nodes, use Kafka.

No network protocol

merkql is a library, not a server. For remote producers/consumers, put an API in front of it.

No cross-partition transactions

Each partition is independent.

No schema registry

Records are opaque byte strings.

merkql is for workloads where the Kafka programming model is right but the Kafka deployment model is wrong.