Data Engineering

  • 7th June 2026

Building a Bounded-Memory Pipeline for 944 Million Zeek Events in Rust

This article explores the design and implementation of a high-performance Rust pipeline that processed nearly one billion Zeek connection records while maintaining bounded memory usage. Starting from a simple NDJSON-to-CSV conversion task, the project evolved into a practical study of streaming architectures, backpressure, and large-scale log processing. Along the way, it highlights lessons learned about thread pools, bounded queues, and why scalability is often more about system design than raw speed. The resulting solution processed 944 million records with zero errors using only 16 worker threads and a small bounded queue. The techniques discussed form a strong foundation for building larger SOC, SIEM, and network telemetry ingestion systems in Rust.
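To make the bounded-queue idea concrete, here is a minimal sketch using only the Rust standard library. It is not the article's actual implementation: `run_pipeline` and `convert` are hypothetical names, `convert` is a trivial stand-in for the real NDJSON-to-CSV step, and the sketch assumes at least one worker thread. The key point is that `sync_channel` blocks the producer when the queue is full, so memory stays bounded no matter how large the input is.

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the real per-record conversion:
/// trims the line and turns tabs into commas.
pub fn convert(line: &str) -> String {
    line.trim().replace('\t', ",")
}

/// Streams `lines` through `workers` threads over a queue that holds at most
/// `capacity` records. Returns the number of records processed.
/// Assumes `workers >= 1`; with zero workers the producer would block forever.
pub fn run_pipeline<I>(lines: I, workers: usize, capacity: usize) -> usize
where
    I: Iterator<Item = String>,
{
    // Bounded channel: `send` blocks while the queue is full (backpressure).
    let (tx, rx) = sync_channel::<String>(capacity);
    // The std receiver is single-consumer, so workers share it behind a mutex.
    let rx = Arc::new(Mutex::new(rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut done = 0usize;
                loop {
                    // The lock guard is dropped at the end of this statement,
                    // so the conversion below runs without holding the lock.
                    let msg = rx.lock().unwrap().recv();
                    match msg {
                        Ok(line) => {
                            let _row = convert(&line); // a real pipeline would write this out
                            done += 1;
                        }
                        Err(_) => break, // sender dropped and queue drained
                    }
                }
                done
            })
        })
        .collect();

    // Producer: blocks whenever the workers fall behind.
    for line in lines {
        tx.send(line).expect("all workers exited early");
    }
    drop(tx); // close the channel so workers can finish

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Calling `run_pipeline(reader_lines, 16, 1024)` would mirror the 16-worker setup described above; the queue capacity of 1024 is an illustrative guess, not a figure from the article.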
