The One-Way Mail Asymmetry

A real business sending email expects to receive replies. Customer responses, bounce notifications, abuse reports, even just out-of-office auto-replies — all of these depend on the sending domain having a working MX record that points to a server willing to accept inbound mail.

A spammer doesn't care about replies. They may care about looking legitimate enough to pass SPF and DKIM checks, which are cheap to set up on throwaway domains, but configuring an MX record that actually accepts mail back is extra work they have no incentive to do. If they're not planning to read responses, why pay for inbound mail infrastructure?

This asymmetry is what makes MX presence such a useful spam signal. A legitimate sender pays nothing extra for an MX record, because they were going to have one anyway. A spammer pays a real cost for no benefit. So spam infrastructure tends to skip MX setup, and "no MX record" becomes a strong signal that something is off.

How Spam Killer Uses MX Validation

Starting in v1.9.0, Spam Killer's promotional classifier requires a valid MX record on the sender domain (specifically, the domain in the envelope sender, falling back to the From header domain). Without a valid MX, a message cannot be classified as [PROMOTION] regardless of how many other promotional indicators it has.
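The domain selection described above can be sketched roughly as follows. This is an illustration, not Spam Killer's actual code: the function name sender_domain and its parameters are hypothetical, and it assumes the envelope sender and From header are already available as strings.

```python
from email.utils import parseaddr
from typing import Optional


def sender_domain(envelope_from: str, from_header: str) -> Optional[str]:
    """Pick the domain to MX-check: envelope sender first, then From header.

    Hypothetical helper for illustration; the name is not from Spam Killer.
    """
    for candidate in (envelope_from, from_header):
        # parseaddr handles both "a@b.com" and "Name <a@b.com>" forms.
        _, addr = parseaddr(candidate or "")
        _, sep, domain = addr.rpartition("@")
        if sep and domain:
            # Normalize case and drop any trailing root dot.
            return domain.lower().rstrip(".")
    # Neither field contained a usable address.
    return None
```

An empty envelope sender falls through to the From header, and a message with no usable address in either field yields None, which a caller would presumably treat the same as "no valid MX".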

This is intentional. Real marketing senders (Mailchimp, SendGrid, other legitimate ESPs, and the businesses that use them) almost always have proper MX setup on their sender domains because they need bounce handling, abuse routing, and the ability to receive replies to their newsletters. A "newsletter" coming from a domain with no MX record is almost certainly spam pretending to be promotional.

The check happens after authentication checks pass, so it adds a DNS lookup only for messages that are already candidates for promotional classification. For messages that fail SPF or DKIM, the MX check is never performed.
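The ordering matters because it keeps DNS traffic off the failure path. A minimal sketch of that ordering, assuming the SPF and DKIM results are already known as booleans and the MX lookup is passed in as a callable (promotional_gate and has_mx are hypothetical names, not Spam Killer's API):

```python
from typing import Callable


def promotional_gate(spf_pass: bool, dkim_pass: bool,
                     domain: str, has_mx: Callable[[str], bool]) -> bool:
    """Return True if the message may still be classified as promotional.

    `has_mx` is only called when both auth checks passed, so messages that
    fail SPF or DKIM never trigger a DNS lookup.
    """
    if not (spf_pass and dkim_pass):
        return False
    # Only now pay for the (possibly cached) MX lookup.
    return has_mx(domain)
```

Passing the lookup as a callable rather than a precomputed boolean is what makes the check lazy: the caller cannot accidentally resolve MX records for mail that was never a promotional candidate.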

Why a Cache Is Essential

A naive implementation would issue a DNS query for every message that reaches the MX check. Under load, this would put heavy pressure on your local resolver and add latency to each of those messages: an uncached DNS round-trip can take 50-200 ms even on a healthy network.

Spam Killer maintains a thread-safe MX cache with a configurable TTL. The default settings are:

classification:
  promotional:
    mx_cache_ttl: 86400   # 24 hours
    mx_timeout: 3         # 3 seconds for the DNS query

The cache holds up to 10,000 entries (a fixed-size open-addressed hash table in the C version, a Python dict with size limits and lock protection in the Python version). After the first lookup for a domain, subsequent lookups for that domain are O(1) hash lookups with no network I/O until the entry expires or is evicted.

The 24-hour TTL is a balance between freshness and load reduction. MX records change rarely, and when a domain migrates email providers the change is usually planned days in advance. A 24-hour cache can serve a stale answer for up to a day after such a change, which is generally acceptable for spam classification purposes.

Thread Safety in High-Concurrency Servers

Both the Python and C versions of the MX cache are designed for high-concurrency mail processing. The Python version uses a module-level dict with a threading.Lock. Any worker thread can read or write the cache, and the lock prevents race conditions during reads, writes, and eviction.
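A minimal sketch of a cache along these lines is shown below. It is not the shipped implementation: the names cached_has_mx, lookup, and now are hypothetical, the DNS query is injected as a callable so the sketch stays independent of any resolver library, and evicting the entry with the oldest timestamp is an assumption about what "size limits" means.

```python
import threading
import time
from typing import Callable, Dict, Tuple

MX_CACHE_TTL = 86400      # seconds; mirrors the documented mx_cache_ttl default
MX_CACHE_MAX = 10_000     # documented entry limit

_cache: Dict[str, Tuple[bool, float]] = {}   # domain -> (has_mx, stored_at)
_lock = threading.Lock()


def cached_has_mx(domain: str, lookup: Callable[[str], bool],
                  now: Callable[[], float] = time.monotonic) -> bool:
    """Return the cached MX result for `domain`, calling `lookup` on a miss."""
    key = domain.lower()
    with _lock:
        entry = _cache.get(key)
        if entry is not None and now() - entry[1] < MX_CACHE_TTL:
            return entry[0]

    # The DNS query runs outside the lock so a slow lookup does not block
    # other threads. Two threads may both query the same domain; the second
    # write simply overwrites the first.
    result = lookup(key)

    with _lock:
        if key not in _cache and len(_cache) >= MX_CACHE_MAX:
            # Assumed policy: evict the entry with the oldest timestamp.
            oldest = min(_cache, key=lambda k: _cache[k][1])
            del _cache[oldest]
        _cache[key] = (result, now())
    return result
```

The lock is held only for the dict operations, which is consistent with the microsecond hold times described below. The linear scan in the eviction branch is the one exception; a production version would more likely keep entries in insertion order to avoid it.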

The C version uses a fixed-size hash table protected by a pthread mutex. Open addressing with linear probing keeps the table compact in memory and predictable in lookup time. When the table approaches capacity, the oldest entries are evicted to make room.
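To illustrate the technique, here is open addressing with linear probing sketched in Python rather than C. It is a simplified model, not a port of the C code: the class name ProbingTable is hypothetical, locking and TTL handling are left out, and eviction happens only when the table is completely full, whereas the C version is described as evicting as it approaches capacity.

```python
from typing import Iterator, List, Optional, Tuple

Slot = Tuple[str, bool, float]   # (domain, has_mx, stored_at)


class ProbingTable:
    """Fixed-size table using open addressing with linear probing.

    Entries are never deleted, only overwritten, so a probe can stop at the
    first empty slot. Locking and TTL checks are omitted for brevity.
    """

    def __init__(self, size: int) -> None:
        self.slots: List[Optional[Slot]] = [None] * size

    def _probe(self, domain: str) -> Iterator[int]:
        # Visit every slot once, starting at the hashed position.
        start = hash(domain) % len(self.slots)
        for i in range(len(self.slots)):
            yield (start + i) % len(self.slots)

    def get(self, domain: str) -> Optional[bool]:
        for idx in self._probe(domain):
            slot = self.slots[idx]
            if slot is None:
                return None          # empty slot ends the probe chain
            if slot[0] == domain:
                return slot[1]
        return None                  # table full and domain not present

    def put(self, domain: str, has_mx: bool, now: float) -> None:
        oldest_idx: Optional[int] = None
        for idx in self._probe(domain):
            slot = self.slots[idx]
            if slot is None or slot[0] == domain:
                self.slots[idx] = (domain, has_mx, now)
                return
            if oldest_idx is None or slot[2] < self.slots[oldest_idx][2]:
                oldest_idx = idx
        # No empty or matching slot: overwrite the oldest entry.
        self.slots[oldest_idx] = (domain, has_mx, now)
```

Because slots are overwritten rather than cleared, no tombstones are needed and memory use never grows. The trade-off in this sketch is that a miss on a full table scans every slot; bounding the probe length would cap that cost.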

Lock contention in practice is minimal. Each lookup holds the lock for microseconds, and most lookups hit the cache rather than going to DNS. We've measured the overhead at less than 1% of total message processing time even on busy mail servers.

Why MX Validation Isn't Enough Alone

A missing MX record is a useful signal, but it's not by itself a spam indicator. Many legitimate domains exist purely for outbound mail: transactional senders, ESP-managed sender domains, ephemeral campaign domains. Some of these have MX records that point to a "no-mailbox" server that returns 5xx for everything. Others legitimately route inbound to a different domain.

This is why MX validation is one of three required signals for promotional classification, alongside SPF and DKIM. A domain might lack MX but pass auth perfectly — that doesn't make it spam, just unusual. We require all three signals together to classify as promotional, which catches the actual pattern of legitimate marketing infrastructure: real domain, real auth, real reply path.

For pure spam detection (not promotional classification), MX absence is just one of many signals fed into the heuristic scorer. It contributes to the score but isn't dispositive on its own. A new domain with no MX, scoring high on content heuristics, will get tagged as spam. A new domain with no MX but otherwise clean content will not.
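The difference between a contributing signal and a dispositive one can be shown with a toy scorer. Everything here is hypothetical: the weight, the threshold, and the function names are made up for illustration and do not reflect Spam Killer's actual scoring values.

```python
MX_ABSENT_WEIGHT = 1.5    # hypothetical weight, deliberately below the cutoff
SPAM_THRESHOLD = 5.0      # hypothetical cutoff


def spam_score(content_score: float, has_mx: bool) -> float:
    """Add the MX-absence weight to the content score when no MX exists."""
    return content_score + (0.0 if has_mx else MX_ABSENT_WEIGHT)


def is_spam(content_score: float, has_mx: bool) -> bool:
    """Tag as spam only when the combined score reaches the cutoff."""
    return spam_score(content_score, has_mx) >= SPAM_THRESHOLD
```

With these numbers, a message scoring 4.0 on content is tagged only if the domain also lacks MX, while a clean message scoring 1.0 is never tagged on MX absence alone, because the weight is smaller than the threshold.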

Configuration

The MX validation requirement for promotional classification is on by default. To disable it (allowing promotional classification to ignore MX presence):

classification:
  promotional:
    require_mx: false

You might do this if you're seeing legitimate marketing mail being missed by promotional classification because the sender domain has unusual DNS setup. In practice, this is rare with major ESPs but more common with smaller or self-hosted marketing platforms.