Replacing Kafka in docker-compose

If your Kafka cluster is a single docker-compose service, you're paying for distribution you don't use. merkql gives you the same produce/consume lifecycle with the footprint of an ordinary library dependency.

Single-node · Zero infrastructure · Same API · Sub-ms startup · Merkle integrity

What docker-compose Kafka actually costs

Running Kafka on a single node gives you all the operational complexity of a distributed system with none of the distributed benefits.

Container count

2–5 containers: Kafka, ZooKeeper (or KRaft controller), Schema Registry, a UI tool, an init container for topic creation. Each one needs health checks, restart policies, and volume mounts.

Memory

1–2 GB JVM heap minimum just for Kafka. The official recommendation is 6 GB+. ZooKeeper adds another 512 MB–1 GB. On a laptop, that's your IDE's budget.

Startup time

30–60 seconds before your application can produce its first message. JVM class loading, log recovery, controller election, topic creation—all sequential.

YAML complexity

50+ lines of docker-compose configuration. 10+ environment variables. KAFKA_ADVERTISED_LISTENERS alone has caused more debugging sessions than most application bugs.

Port management

9092, 29092, 2181—three ports minimum, often conflicting with other services. Internal vs external listener configuration is a perennial source of "connection refused" errors.

CI tax

Docker-in-Docker or service containers in CI. 1–3 minute pipeline penalty just to spin up Kafka before tests can run. Flaky healthcheck timeouts on shared runners.
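Concretely, that tax often looks like this in a pipeline. A sketch assuming GitHub Actions (image tags, health options, and step names are illustrative):

```yaml
# .github/workflows/ci.yml (illustrative)
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:7.5.0
        env:
          ZOOKEEPER_CLIENT_PORT: 2181
      kafka:
        image: confluentinc/cp-kafka:7.5.0
        ports:
          - 9092:9092
        env:
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
        options: >-
          --health-cmd "kafka-broker-api-versions --bootstrap-server localhost:9092"
          --health-interval 10s
          --health-retries 10
    steps:
      - uses: actions/checkout@v4
      # Blocks until the healthcheck above goes green: the 1-3 minute penalty.
      - run: cargo test
```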

[Architecture diagram. Before (docker-compose): your app talks over TCP via rdkafka to a Dockerized Kafka JVM (1–2 GB) and a ZooKeeper JVM (512 MB); 30–60s startup. After (merkql): your app makes a library call. No network, no containers; ~14µs startup.]
A typical docker-compose Kafka setup
# docker-compose.yml
version: "3.8"
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - "2181:2181"

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_HEAP_OPTS: "-Xmx1G -Xms1G"
    healthcheck:
      test: kafka-broker-api-versions --bootstrap-server localhost:9092
      interval: 10s
      timeout: 10s
      retries: 5
At a glance: 2–5 containers · 1–2 GB heap minimum · 30–60s startup · 50+ lines of YAML

Before and after

Same verbs, same mental model. subscribe, poll, commit—the lifecycle you already know.

Setup
docker-compose.yml + Cargo.toml
# docker-compose.yml — just the service definitions
zookeeper:
  image: confluentinc/cp-zookeeper:7.5.0
  ports: ["2181:2181"]
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181

kafka:
  image: confluentinc/cp-kafka:7.5.0
  depends_on: [zookeeper]
  ports: ["9092:9092"]
  environment:
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

# Cargo.toml
[dependencies]
rdkafka = { version = "0.36", features = ["cmake-build"] }
Cargo.toml
# That's it.
[dependencies]
merkql = "0.1"
Producer
rdkafka
let producer: FutureProducer = ClientConfig::new()
    .set("bootstrap.servers", "localhost:9092")
    .set("message.timeout.ms", "5000")
    .create()
    .expect("Producer creation failed");

let record = FutureRecord::to("events")
    .key("user-1")
    .payload(r#"{"action":"login"}"#);
producer.send(record, Duration::from_secs(5)).await.unwrap();
merkql
let broker = Broker::open(BrokerConfig::new("./data")).unwrap();
let producer = Broker::producer(&broker);

producer.send(&ProducerRecord::new(
    "events", Some("user-1".into()), r#"{"action":"login"}"#
)).unwrap();
Consumer
rdkafka
let consumer: StreamConsumer = ClientConfig::new()
    .set("bootstrap.servers", "localhost:9092")
    .set("group.id", "my-service")
    .set("auto.offset.reset", "earliest")
    .set("enable.auto.commit", "false")
    .create()
    .expect("Consumer creation failed");

consumer.subscribe(&["events"]).unwrap();
loop {
    let msg = consumer.recv().await.unwrap();
    let payload = msg.payload_view::<str>().unwrap().unwrap();
    // process...
    consumer.commit_message(&msg, CommitMode::Sync).unwrap();
}
merkql
let mut consumer = Broker::consumer(&broker, ConsumerConfig {
    group_id: "my-service".into(),
    auto_commit: false,
    offset_reset: OffsetReset::Earliest,
});
consumer.subscribe(&["events"]).unwrap();
let records = consumer.poll(Duration::from_millis(100)).unwrap();
// process records...
consumer.commit_sync().unwrap();
API mapping: Kafka verbs map directly to merkql equivalents.

rdkafka → merkql
ClientConfig → BrokerConfig
create() → open()
subscribe() → subscribe()
recv() → poll()
commit_message() → commit_sync()
close() → close()

Same verbs, same order, same mental model. No async runtime required.

What you keep

Everything you use from Kafka on a single node is available in merkql.

Feature: how merkql provides it
Topics: auto-created on first send
Partitions: default_partitions, key-based routing
Consumer groups: independent offset tracking per group
Offset tracking: commit_sync() persists to disk
Offset reset: Earliest / Latest
Retention: max_records per topic
Compression: LZ4, transparent
Crash safety: atomic writes + fsync
Batch API: send_batch(), single fsync

What you gain

Merkle proofs

SHA-256 inclusion proofs for every record. Prove any record hasn't been modified—the math is the trust model. Kafka doesn't offer this at any price.

Sub-millisecond startup

Broker reopen takes ~14µs. No JVM boot, no controller election, no log recovery. Your tests start instantly.

Zero-config testing

tempdir(), produce, consume, assert, drop. No Docker, no port conflicts, no test isolation problems, no healthcheck waits.

Smaller footprint

Rust library, 5 dependencies, hundreds of KB. No JVM, no GC pauses, no heap tuning.
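The zero-config testing flow above, end to end: a sketch assembled from the merkql calls shown earlier, using the `tempfile` crate for the throwaway data directory. Exact signatures may differ between versions; treat this as the shape of such a test, not a verbatim implementation.

```rust
use merkql::{Broker, BrokerConfig, ConsumerConfig, OffsetReset, ProducerRecord};
use std::time::Duration;

#[test]
fn round_trip() {
    // Throwaway directory: the broker's entire state lives here and is
    // deleted when `dir` drops at the end of the test. No ports, no Docker.
    let dir = tempfile::tempdir().unwrap();
    let broker = Broker::open(BrokerConfig::new(dir.path())).unwrap();

    let producer = Broker::producer(&broker);
    producer.send(&ProducerRecord::new(
        "events", Some("user-1".into()), r#"{"action":"login"}"#,
    )).unwrap();

    let mut consumer = Broker::consumer(&broker, ConsumerConfig {
        group_id: "test".into(),
        auto_commit: false,
        offset_reset: OffsetReset::Earliest,
    });
    consumer.subscribe(&["events"]).unwrap();

    let records = consumer.poll(Duration::from_millis(100)).unwrap();
    assert_eq!(records.len(), 1);
    consumer.commit_sync().unwrap();
}
```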


When to stay with Kafka

merkql is not a Kafka replacement for every workload. Stay with Kafka if you need:

Multi-node replication

If you need fault tolerance across machines, Kafka's replication protocol is battle-tested. merkql runs on one machine.

Network consumers on different machines

merkql is a library, not a server. All producers and consumers must be in the same process. For remote consumers, put an API in front of it.

Kafka ecosystem

Kafka Connect, Schema Registry, ksqlDB, Debezium CDC—if your architecture depends on these, Kafka is the right choice.

Extreme distributed throughput

Kafka's multi-broker, multi-partition architecture can sustain millions of messages per second across a cluster. merkql targets single-node workloads.

Cross-partition transactions

Each merkql partition is independent. If you need atomic writes across partitions, Kafka's transaction protocol handles this.

If your docker-compose has one Kafka broker and all producers/consumers are on the same machine, merkql is likely a better fit.


Migration checklist

Seven steps from docker-compose Kafka to merkql.

Step 1

Add merkql, remove rdkafka

Replace the rdkafka dependency in your Cargo.toml with merkql. No C library build, no cmake-build feature flag, no librdkafka system dependency.

# Before
[dependencies]
rdkafka = { version = "0.36", features = ["cmake-build"] }

# After
[dependencies]
merkql = "0.1"
Step 2

Replace connection setup

Replace ClientConfig::new().set("bootstrap.servers", ...) with Broker::open(BrokerConfig::new(path)). The path is a directory on the local filesystem.

// Before
let config = ClientConfig::new()
    .set("bootstrap.servers", "localhost:9092");

// After
let broker = Broker::open(BrokerConfig::new("./data/events")).unwrap();
Step 3

Replace producer

Replace FutureProducer creation and FutureRecord sends with Broker::producer() and ProducerRecord. No async runtime required.

Step 4

Replace consumer

Replace StreamConsumer with Broker::consumer(). The config maps directly: group.id → group_id, auto.offset.reset → offset_reset, enable.auto.commit → auto_commit.

Step 5

Remove docker-compose Kafka services

Delete the kafka, zookeeper, and any related services (schema-registry, kafka-ui, init containers) from your docker-compose.yml.

Step 6

Update CI

Remove Kafka service containers from your CI pipeline. No more Docker-in-Docker, no more healthcheck waits, no more 1–3 minute startup penalty.
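The corresponding pipeline change is mostly deletion. Assuming a GitHub Actions setup, the job collapses to the test step itself:

```yaml
# .github/workflows/ci.yml (after)
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo test  # merkql opens in-process; nothing to wait for
```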

Step 7

Add integrity verification (optional)

This is a new capability that Kafka doesn't offer. Generate and verify merkle proofs for any record. Use this for audit trails, compliance, or tamper detection.

let topic = broker.topic("events").unwrap();
let partition = topic.partition(0).unwrap().read().unwrap();
let proof = partition.proof(0).unwrap().unwrap();

assert!(MerkleTree::verify_proof(&proof, partition.store()).unwrap());
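Under the hood, verifying an inclusion proof is just rehashing: recompute the path from the record's leaf hash up to the root and compare. merkql's actual proof format isn't documented here, so the following is a generic, self-contained sketch of the technique, with the standard-library hasher standing in for SHA-256 (Rust's std ships no SHA-256):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for SHA-256, used purely for illustration.
fn h(bytes: &[u8]) -> u64 {
    let mut s = DefaultHasher::new();
    bytes.hash(&mut s);
    s.finish()
}

// Hash of an interior node from its two children.
fn combine(l: u64, r: u64) -> u64 {
    let mut s = DefaultHasher::new();
    l.hash(&mut s);
    r.hash(&mut s);
    s.finish()
}

/// Recompute the root from a record and its sibling path.
/// `path` holds (sibling_hash, sibling_is_left) pairs, leaf to root.
fn verify(record: &[u8], path: &[(u64, bool)], root: u64) -> bool {
    let mut acc = h(record);
    for &(sib, sib_is_left) in path {
        acc = if sib_is_left { combine(sib, acc) } else { combine(acc, sib) };
    }
    acc == root
}

fn main() {
    // Four records -> a two-level tree.
    let recs: [&[u8]; 4] = [b"r0", b"r1", b"r2", b"r3"];
    let leaves: Vec<u64> = recs.iter().map(|r| h(*r)).collect();
    let n01 = combine(leaves[0], leaves[1]);
    let n23 = combine(leaves[2], leaves[3]);
    let root = combine(n01, n23);

    // Inclusion proof for r2: sibling r3 (on the right), then n01 (on the left).
    let proof_for_r2 = [(leaves[3], false), (n01, true)];
    assert!(verify(b"r2", &proof_for_r2, root));
    // Any modification to the record changes the recomputed root.
    assert!(!verify(b"r2-tampered", &proof_for_r2, root));
    println!("inclusion proof verified");
}
```

The proof is logarithmic in partition size: for n records, only ~log2(n) sibling hashes are needed to tie any one record to the root.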