The Binary Verdict Problem

Open your inbox right now. Of the last fifty messages, count how many are direct human correspondence. For most people, the answer is somewhere between three and ten. The rest are notifications from LinkedIn or Facebook, marketing emails from companies you actually do business with, automated alerts, password resets, shipping confirmations, and yes — some genuine spam.

Traditional spam filters treat all of that mail the same way. Either it scores above the spam threshold and gets quarantined, or it lands in your inbox and competes for your attention with mail that actually requires a response. The result is one of two failure modes: either marketing email floods your inbox and buries the things that matter, or your spam filter gets aggressive and starts catching legitimate newsletters and notifications you actually wanted to see.

Neither outcome is good. The problem is that sorting mail into "spam" and "not spam" is a category mistake: it lumps together fundamentally different types of mail and leaves the user to sort them out manually after the fact.

Five Classifications, Not Two

Starting in v1.9.0, Spam Killer classifies every inbound message into one of five categories before it reaches your mail server:

  • Ham — Personal correspondence, transactional mail, and anything that doesn't match other categories. Forwarded unmodified.
  • Social — Notifications from social platforms (LinkedIn, Facebook, Instagram, Twitter, etc.) that pass SPF and DKIM checks. Tagged [SOCIAL] in the subject.
  • Promotional — Marketing email from legitimate senders: newsletters, product announcements, retail promotions. Tagged [PROMOTION].
  • Spam + Promotional — Marketing email that also exceeds the spam threshold. Tagged [SPAM][PROMOTION] so you can sort it as either.
  • Spam — High-scoring messages that are not classified as social or promotional. Tagged [SPAM].

Each message also gets an X-Spam-Classification header containing one of: ham, social, promotional, spam-promotional, or spam. This header is what mail rules in Outlook, Gmail, Postfix, and other mail clients and servers can match on to route mail into separate folders.
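To make the relationship between the subject tag and the header concrete, here is a minimal Python sketch. It is not Spam Killer's actual code: the helper name apply_classification is hypothetical, and placing the tag at the front of the subject is an assumption. The tag strings and header values are the ones listed above.

```python
from email.message import EmailMessage

# Subject tags per classification, taken from the list above.
# Ham gets the header but no subject tag.
SUBJECT_TAGS = {
    "ham": "",
    "social": "[SOCIAL]",
    "promotional": "[PROMOTION]",
    "spam-promotional": "[SPAM][PROMOTION]",
    "spam": "[SPAM]",
}


def apply_classification(msg: EmailMessage, classification: str) -> EmailMessage:
    """Set X-Spam-Classification and tag the subject (hypothetical helper)."""
    if classification not in SUBJECT_TAGS:
        raise ValueError(f"unknown classification: {classification!r}")

    # Drop any existing header so the final value is unambiguous.
    del msg["X-Spam-Classification"]
    msg["X-Spam-Classification"] = classification

    tag = SUBJECT_TAGS[classification]
    if tag:
        subject = str(msg.get("Subject", ""))
        del msg["Subject"]
        # Assumption: the tag is prepended to the existing subject.
        msg["Subject"] = f"{tag} {subject}".strip()
    return msg
```

A mail rule can then match on either signal: the header for clients that support header conditions, or the subject tag for those that do not.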

How Classification Decisions Are Made

Classification runs in priority order, with whitelist and blacklist checks happening first. If a message matches a whitelist, it bypasses classification entirely. If it matches a blacklist, it is rejected before classification even runs.

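Otherwise, the order is: Social → Promotional → Spam → Ham. A message that qualifies as promotional and also exceeds the spam threshold is classified as spam-promotional rather than plain promotional.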

Social classification requires three things: the sender domain matches the configured social domain list (LinkedIn, Facebook, Instagram, Twitter/X, Discord, Reddit, etc.), SPF passes, and DKIM passes. Both authentication checks are required by default because social platforms always sign their outgoing mail — a message claiming to be from linkedin.com without a valid DKIM signature is almost certainly a phishing attempt impersonating LinkedIn.

Promotional classification is more nuanced. It requires an SPF pass, a DKIM pass, a valid MX record on the sender domain, and at least one promotional indicator. Indicators include the presence of a List-Unsubscribe header (defined in RFC 2369; the one-click variant in RFC 8058 is now standard for legitimate marketers), a Precedence: bulk header, a List-Id header, sender addresses matching marketing patterns (newsletter@, marketing@, noreply@), X-Mailer headers from known marketing platforms (Mailchimp, SendGrid, Constant Contact, etc.), or sender domains belonging to ESP infrastructure.
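The indicator check can be pictured as a simple count over message headers. The following Python sketch is illustrative only and assumes each indicator contributes at most one point; the function name count_promotional_indicators, the pattern lists, and the esp.example domain are hypothetical placeholders, not Spam Killer's real data.

```python
import re
from email.message import Message
from email.utils import parseaddr

# Assumptions for illustration only; the real lists are not shown in this article.
MARKETING_LOCAL_PARTS = re.compile(r"^(newsletter|marketing|noreply)$", re.IGNORECASE)
KNOWN_MAILERS = ("mailchimp", "sendgrid", "constant contact")
ESP_DOMAINS = {"esp.example"}  # hypothetical placeholder domain


def count_promotional_indicators(msg: Message) -> int:
    """Count how many of the promotional indicators a message shows."""
    count = 0
    if msg.get("List-Unsubscribe"):
        count += 1
    if str(msg.get("Precedence") or "").strip().lower() == "bulk":
        count += 1
    if msg.get("List-Id"):
        count += 1

    # Split the From address into local part and domain.
    _, addr = parseaddr(str(msg.get("From") or ""))
    local, _, domain = addr.partition("@")
    if MARKETING_LOCAL_PARTS.match(local):
        count += 1

    mailer = str(msg.get("X-Mailer") or "").lower()
    if any(name in mailer for name in KNOWN_MAILERS):
        count += 1
    if domain.lower() in ESP_DOMAINS:
        count += 1
    return count
```

The count on its own decides nothing: the message still has to pass SPF, DKIM, and the MX check before the indicators are taken into account.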

A message that fails any of the authentication checks cannot be classified as social or promotional, no matter how many other indicators it has. This is what stops a phishing email from impersonating Mailchimp's sender format and sliding into your promotional folder — the missing DKIM signature gives it away.
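Put together, the decision order described in this section might look like the Python sketch below. It is a simplified model under stated assumptions, not the shipped implementation: the Signals structure and classify function are hypothetical, the "bypass" and "reject" return values are made-up labels for the whitelist and blacklist outcomes, checking the whitelist before the blacklist is a guess, and a single 0.6 threshold is used for both the spam and spam-promotional decisions.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Pre-computed facts about one message (hypothetical structure)."""
    whitelisted: bool = False
    blacklisted: bool = False
    social_domain: bool = False  # sender domain is on the social domain list
    spf_pass: bool = False
    dkim_pass: bool = False
    mx_valid: bool = False
    indicators: int = 0          # number of promotional indicators found
    spam_score: float = 0.0


def classify(s: Signals, spam_threshold: float = 0.6, min_indicators: int = 1) -> str:
    # List checks run before classification (whitelist-first is an assumption).
    if s.whitelisted:
        return "bypass"
    if s.blacklisted:
        return "reject"

    # Both authentication checks gate the social and promotional categories.
    authenticated = s.spf_pass and s.dkim_pass

    if authenticated and s.social_domain:
        return "social"
    if authenticated and s.mx_valid and s.indicators >= min_indicators:
        if s.spam_score > spam_threshold:
            return "spam-promotional"
        return "promotional"
    if s.spam_score > spam_threshold:
        return "spam"
    return "ham"
```

In this model, a message from a social domain that fails DKIM never reaches the "social" branch, whatever else it contains. It falls through to the spam or ham decision based on its score alone.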

Why Five Categories Beat Two

The point of granular classification is to give your mail rules something useful to act on. Consider how a typical knowledge worker can route the five categories:

  • Ham → Inbox. This is the mail that actually requires attention.
  • Social → Social folder, optionally with notifications muted. You can check it once a day.
  • Promotional → Promotions folder. Reviewed weekly when you have time, never interrupting your day.
  • Spam + Promotional → Spam folder, but with the [PROMOTION] tag so a quick visual scan tells you it's "probably just aggressive marketing" rather than "potentially malicious."
  • Spam → Spam folder. Check occasionally for false positives.

The result is an inbox that contains only mail you might actually need to respond to today. Newsletters and platform notifications are still received, archived, and searchable — they just don't compete for your attention with real correspondence.
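For readers who script their own delivery, the routing above reduces to a lookup on the X-Spam-Classification header. Here is a minimal Python sketch; the folder names and the folder_for function are assumptions chosen for illustration, and sending unlabeled mail to the inbox is a design choice of this sketch rather than documented behavior.

```python
from email import message_from_string

# Folder names are illustrative; use whatever your mail store calls them.
FOLDERS = {
    "ham": "INBOX",
    "social": "Social",
    "promotional": "Promotions",
    "spam-promotional": "Spam",
    "spam": "Spam",
}


def folder_for(raw_message: str) -> str:
    """Pick a destination folder from the X-Spam-Classification header."""
    msg = message_from_string(raw_message)
    value = (msg.get("X-Spam-Classification") or "").strip().lower()
    # Assumption: missing or unrecognized values go to the inbox,
    # so nothing is hidden by accident.
    return FOLDERS.get(value, "INBOX")
```

Falling back to the inbox keeps a misconfigured filter from silently hiding mail, at the cost of a little extra noise.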

Configuring Classification

Classification is enabled by default starting with Spam Killer v1.9.0 (v2.6.0 for the C version). The configuration block looks like this:

classification:
  enabled: true

  social:
    enabled: true
    require_spf: true
    require_dkim: true
    domains:
      - "facebookmail.com"
      - "linkedin.com"
      - "instagram.com"
      - "twitter.com"
      # ... and so on

  promotional:
    enabled: true
    require_spf: true
    require_dkim: true
    require_mx: true
    spam_tag: true
    spam_threshold: 0.6
    min_indicators: 1

You can disable either classifier independently if you prefer to receive social notifications or promotional mail untagged. You can also tighten the rules — for example, set min_indicators: 2 on promotional to require multiple marketing signals before tagging, which reduces false positives for transactional mail that happens to come from a marketing-style sender address.

Classification as a Phishing Defense

One unexpected benefit: tighter classification rules make certain phishing attacks much easier to spot. A phishing email impersonating LinkedIn cannot be classified as [SOCIAL] because it will not pass DKIM verification against linkedin.com's keys. A phishing email impersonating Mailchimp will not have a valid List-Unsubscribe header pointing back to a real Mailchimp campaign. The classification engine effectively raises the bar for impersonation by requiring multiple corroborating signals.

This is why we recommend keeping require_spf and require_dkim set to true for both social and promotional classifiers, even if it means some legitimate but poorly configured marketing mail loses its promotional tag and may end up tagged as spam. The asymmetry is the point: a correctly signed LinkedIn notification passes these checks; a forged one does not.