Mail Logs Grow Fast

A modest production proxy handling a few thousand messages per hour generates hundreds of megabytes of log data per day. Across multiple log streams (main log, forward error log, debug log when enabled) and multiple weeks of retention, logs alone can easily consume tens of gigabytes of disk.

This is fine on a server with terabytes of storage, but mail proxies often run on small VMs or appliances where disk space is more constrained. And even on big servers, log volume tends to grow over time as traffic scales — what was 30GB of retention a year ago is 80GB today, and nobody notices until /var/log fills up and the proxy starts failing writes.

Why Compression Helps So Much

Mail logs are nearly ideal compression input. They consist of repetitive structured text — timestamps in the same format, similar sender/recipient patterns, repeated server hostnames, status codes from a small enumerated set. Real-world compression ratios for SMTP logs typically range from 8:1 to 20:1 with gzip, and even higher with zstd or xz.
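To get a feel for why this kind of text compresses well, the following minimal sketch gzips a batch of synthetic log lines and reports the ratio. The line format, hostnames, and the synthetic_log helper are invented for illustration and are not Spam Killer's actual log format, so the ratio it prints will differ from what real logs achieve.

```python
import gzip
import random

def synthetic_log(n_lines: int, seed: int = 0) -> bytes:
    """Build fake SMTP-style log lines (invented format, illustration only)."""
    rng = random.Random(seed)
    statuses = ["250 OK", "451 Temporary failure", "550 Rejected", "421 Timeout"]
    hosts = ["mx1.example.com", "mx2.example.com"]
    lines = []
    for i in range(n_lines):
        lines.append(
            f"2024-01-15 10:{i // 60 % 60:02d}:{i % 60:02d} INFO "
            f"relay={rng.choice(hosts)} "
            f"from=user{rng.randrange(500)}@example.org "
            f"to=user{rng.randrange(500)}@example.net "
            f"status={rng.choice(statuses)}"
        )
    return ("\n".join(lines) + "\n").encode("utf-8")

raw = synthetic_log(20_000)
packed = gzip.compress(raw)          # default compression level
ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes ({ratio:.1f}:1)")
```

Running the same measurement against a copy of one of your own rotated logs gives a more honest estimate than any synthetic input.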

At those ratios, a 1GB rotated log file becomes 50-125MB compressed. Across a 30-rotation retention window, that's the difference between consuming 30GB of disk and consuming roughly 1.5-3.75GB. The compression cost is paid once at rotation time and adds nothing to normal logging.
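The retention arithmetic can be checked with a few lines of Python. The retained_gb helper is a hypothetical name used only for this example, and it treats 1GB as 1000MB.

```python
def retained_gb(file_gb: float, rotations: int, ratio: float) -> float:
    """Disk used by `rotations` compressed files of `file_gb` each at `ratio`:1."""
    return file_gb * rotations / ratio

for ratio in (8, 20):
    print(f"{ratio}:1 -> {retained_gb(1.0, 30, ratio):.2f} GB")
# 8:1 -> 3.75 GB
# 20:1 -> 1.50 GB
```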

Configuration

Compression is opt-in via the compress_rotated setting in the logging block:

logging:
  file: "/var/log/spam-filter/spam-filter.log"
  level: "INFO"
  max_size: 52428800       # 50MB before rotation
  backup_count: 30         # keep 30 rotated files
  compress_rotated: true   # gzip rotated files
  forward_error_log: "/var/log/spam-filter/forward-errors.log"

When enabled, the rotation process is:

  1. Active log file reaches max_size
  2. Existing .gz files shift up by one (.1.gz → .2.gz, .2.gz → .3.gz, etc.)
  3. Files past backup_count are deleted
  4. Active file is renamed to spam-filter.log.1
  5. A new empty active file is created
  6. The newly rotated file is gzipped to spam-filter.log.1.gz

The compression step happens in a background task to avoid blocking incoming mail processing during rotation.
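A rotation of this kind could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not Spam Killer's actual implementation: the rotate function and its parameters are hypothetical, and a background thread stands in for the background task. It shifts existing archives before creating the new one so that the fresh .1.gz never overwrites an older file.

```python
import gzip
import shutil
import threading
from pathlib import Path

def rotate(active: Path, backup_count: int) -> threading.Thread:
    """Rotate `active`, then gzip the rotated copy in a background thread."""
    # Drop the oldest archive, then shift the rest up by one (.1.gz -> .2.gz).
    oldest = Path(f"{active}.{backup_count}.gz")
    if oldest.exists():
        oldest.unlink()
    for n in range(backup_count - 1, 0, -1):
        src = Path(f"{active}.{n}.gz")
        if src.exists():
            src.rename(f"{active}.{n + 1}.gz")

    rotated = Path(f"{active}.1")
    active.rename(rotated)   # a rename within one filesystem is atomic on POSIX
    active.touch()           # fresh, empty active file

    def compress() -> None:
        # Stream the rotated file into a .gz archive, then remove the original.
        with open(rotated, "rb") as src, gzip.open(f"{rotated}.gz", "wb") as dst:
            shutil.copyfileobj(src, dst)
        rotated.unlink()

    worker = threading.Thread(target=compress, daemon=True)
    worker.start()
    return worker
```

The caller can keep writing to the new active file immediately; calling join() on the returned thread is only needed when something must wait for the archive to exist. A production version would also have to handle a second rotation starting before the previous compression finishes, which this sketch ignores.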

Querying Compressed Logs

Compressed logs are fully queryable without manual decompression. The z-prefixed utilities that ship with gzip (zgrep, zcat, zless) read .gz files directly:

# grep across all rotated logs (compressed and uncompressed)
zgrep "Connection refused" /var/log/spam-filter/spam-filter.log*

# View a specific compressed log
zcat /var/log/spam-filter/spam-filter.log.5.gz | less

# Tail a compressed log
zcat /var/log/spam-filter/spam-filter.log.5.gz | tail -100

# Combined query across compressed forward-error log
zcat /var/log/spam-filter/forward-errors.log.*.gz | jq 'select(.error | contains("timeout"))'

The bundled spam-filter-stats CLI also handles compressed logs transparently. When you pass --log-file pointing to a directory, it reads all rotated logs (compressed and uncompressed) automatically.
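If you write your own analysis scripts, the same transparent handling is easy to reproduce. The sketch below is not how spam-filter-stats works internally; read_rotated is a hypothetical helper that assumes the file naming described above (name, name.N, name.N.gz) and yields lines from oldest to newest.

```python
import gzip
from pathlib import Path
from typing import Iterator

def read_rotated(directory: str, stem: str) -> Iterator[str]:
    """Yield lines from every `stem*` file, oldest rotation first."""
    def rotation_index(path: Path) -> int:
        # "x.log" -> 0, "x.log.3" or "x.log.3.gz" -> 3
        parts = path.name[len(stem):].strip(".").split(".")
        return int(parts[0]) if parts[0].isdigit() else 0

    files = sorted(Path(directory).glob(stem + "*"), key=rotation_index, reverse=True)
    for path in files:
        # gzip.open and open share the same text-mode interface.
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
            yield from fh
```

For example, sum(1 for line in read_rotated("/var/log/spam-filter", "spam-filter.log") if "Connection refused" in line) counts matches across the whole retention window.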

Why Not Just Use logrotate?

Linux's logrotate can compress rotated files too, so you might wonder why Spam Killer rotates and compresses internally rather than relying on the system tool.

Two reasons:

Atomic rotation. Spam Killer's internal rotation reopens the log file handle as part of the rotation itself. With logrotate + copytruncate, lines written between the copy and the truncate can be lost. With logrotate + create + a postrotate hook, the daemon needs a SIGHUP to reopen its log file, which adds moving pieces. Internal rotation is simpler and safer.

No external dependency. Spam Killer ships with everything it needs to manage logs. You don't have to remember to install or configure logrotate on a new deployment, and the default config does the right thing out of the box.

If you prefer to manage rotation externally with logrotate, just set backup_count: 0 in the Spam Killer config and configure logrotate as you normally would. The two systems don't conflict; you just want to use one or the other, not both.

The CPU Cost

Compression isn't free, but it's cheap enough that you won't notice. Gzipping a 50MB log file takes around 1-2 seconds on a modern CPU and uses one core. Since rotation happens once per max_size bytes (50MB by default), a gigabyte of log output triggers about 20 rotations, or roughly 20-40 seconds of background CPU. At a few hundred megabytes of logs per day, that is well under a minute of CPU time per day. On a server already busy processing mail, it's invisible.
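To check the cost on your own hardware, a rough measurement like the following can help. It is a sketch only: gzip_seconds_per_gb is a hypothetical helper, the sample line is invented, and such a repetitive sample compresses faster than real logs, so substitute the contents of a real rotated log for a meaningful number.

```python
import gzip
import time

def gzip_seconds_per_gb(sample: bytes, level: int = 6) -> float:
    """Extrapolate CPU seconds needed to gzip 1 GiB from one timed sample."""
    start = time.process_time()                  # CPU time, not wall-clock time
    gzip.compress(sample, compresslevel=level)
    elapsed = time.process_time() - start
    return elapsed * (1024**3 / len(sample))

line = b"2024-01-15 10:00:00 INFO relay=mx1.example.com status=250 OK\n"
sample = line * 100_000                          # about 6 MB of invented log text
print(f"about {gzip_seconds_per_gb(sample):.1f} CPU seconds per GiB at level 6")
```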

Future versions may add zstd support for deployments that want a better compression ratio. For now, gzip provides the right balance of compatibility (every Linux distro has it), speed, and ratio.