Introduction
============

Normally, Postfix receives mail, stores it in the mail queue and
then delivers it. With the external content filter described here,
mail is filtered AFTER it is queued. This gives you maximal control
over how many filtering processes you are willing to run in parallel.

This is not to be confused with the approach that is described in
the SMTPD_PROXY_README document, where SMTP mail is filtered BEFORE
it is queued.

An external content filter receives unfiltered mail from Postfix
(as described further below) and does one of the following:

1 - Re-inject the mail back into Postfix, perhaps after changing
    content.

2 - Reject the mail (by sending a suitable status code back to
    Postfix). Postfix will return the mail to the sender.

3 - Send the mail somewhere else.

This document describes two approaches to content filtering: simple
and advanced. Both approaches filter all the mail by default.

At the end are examples that show 1) how to filter only mail from
remote users, 2) how to use different filters for different
domains that you provide MX service for, and 3) how to set up
selective filtering on the basis of message envelope and/or
header/body patterns.

Simple content filtering example
================================

The first example is simple to set up.  A shell script receives
unfiltered mail from the Postfix pipe delivery agent, and feeds
filtered mail back into the Postfix sendmail command.

Because filtered mail re-enters Postfix through the Postfix sendmail
command, mail that is submitted via the Postfix sendmail command
cannot be content filtered with this method.

                  ..................................
                  :            Postfix             :
Unfiltered mail----->smtpd \                /local---->Filtered mail
                  :         -cleanup->queue-       :
               ---->pickup /                \smtp----->Filtered mail
               ^  :                        |       :
               |  :                         \pipe-----+
               |  ..................................  |
               |                                      |
               |                                      |
               +-Postfix sendmail<----filter script<--+

In this example, mail is filtered by a /some/where/filter program.
This can be a simple shell script like this:

    #!/bin/sh

    # Localize these.
    INSPECT_DIR=/var/spool/filter
    SENDMAIL="/usr/sbin/sendmail -i"

    # Exit codes from <sysexits.h>
    EX_TEMPFAIL=75
    EX_UNAVAILABLE=69

    # Clean up when done or when aborting.
    trap "rm -f in.$$" 0 1 2 3 15

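    # Start processing.
    cd "$INSPECT_DIR" || { echo "$INSPECT_DIR does not exist"; exit $EX_TEMPFAIL; }

    # Save the message from standard input to a temporary file.
    cat >in.$$ || { echo Cannot save mail to file; exit $EX_TEMPFAIL; }

    # Specify your content filter here.
    # filter <in.$$ || { echo Message content rejected; exit $EX_UNAVAILABLE; }

    # Re-inject the message; "$@" carries the -f sender and recipients.
    $SENDMAIL "$@" <in.$$

    exit $?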

The idea is to first capture the message to file and then run the
content through a third-party content filter program.

- If the mail cannot be captured to file, mail delivery is deferred
  by terminating with exit status 75 (EX_TEMPFAIL). Postfix places
  the message in the deferred mail queue and tries again later.

- If the content filter program finds a problem, the mail is bounced
  by terminating with exit status 69 (EX_UNAVAILABLE).  Postfix
  will return the message to the sender as undeliverable.

- If the content is OK, it is given as input to the Postfix sendmail
  command, and the exit status of the filter command is whatever
  exit status the Postfix sendmail command produces. Postfix will
  deliver the message as usual.
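
The outcomes above hinge on the exit status of the content check.
As a minimal sketch, the commented-out "filter" line in the script
could be replaced by something like the following. The check_content
and filter_message functions and the "X-Test-Reject" marker header
are hypothetical stand-ins for illustration only; they are not part
of Postfix or of any real content filter.

```shell
#!/bin/sh
# Exit code from <sysexits.h>, as in the script above.
EX_UNAVAILABLE=69

# check_content FILE: hypothetical stand-in for a real content filter.
# It fails when FILE has a header line starting with "X-Test-Reject:"
# (an illustrative marker, not a real convention).
check_content() {
    if grep -q '^X-Test-Reject:' "$1"; then
        return 1
    fi
    return 0
}

# filter_message FILE: return 0 when the content is acceptable,
# EX_UNAVAILABLE (69) when it should be bounced.
filter_message() {
    check_content "$1" || { echo Message content rejected; return $EX_UNAVAILABLE; }
    return 0
}
```

In the script itself, a non-zero result from such a check would be
turned into "exit $EX_UNAVAILABLE" before the message reaches the
Postfix sendmail command.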

I suggest that you run this script by hand until you are satisfied
with the results. Run it with a real message (headers+body) as
input:

    % /some/where/filter -f sender recipient... <message-file
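
For a hand-run test, a small message file with headers, an empty
line, and a body is enough. The following sketch writes one to a
scratch location; the addresses, the subject, and the file name are
placeholders chosen for this illustration.

```shell
#!/bin/sh
# Write a small test message: headers, an empty line, then the body.
# The file location and all addresses are placeholders.
MESSAGE_FILE="${TMPDIR:-/tmp}/filter-test-message.$$"

cat >"$MESSAGE_FILE" <<'EOF'
From: sender@example.com
To: recipient@example.com
Subject: content filter test

This is the message body.
EOF

# The file can then be fed to the script as shown above, for example:
#   /some/where/filter -f sender@example.com recipient@example.com <"$MESSAGE_FILE"
```

Keep in mind that the script as written hands accepted mail to the
Postfix sendmail command, so a hand-run test really delivers the
message to the recipients given on the command line.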

Once you're satisfied with the content filtering script:

1 - Create a dedicated local user account called "filter".  This
    user handles all potentially dangerous mail content - that is
    why it should be a separate account. Do not use "nobody", and
    most certainly do not use "root" or "postfix".  The "filter"
    user will never log in, and can be given a "*" password and
    non-existent shell and home directory.

2 - Create a directory /var/spool/filter that is accessible only
    to the "filter" user. This is where the content filtering script
    is supposed to store its temporary files.

3 - Define the content filter in the Postfix master file:

    /etc/postfix/master.cf:
      # =============================================================
      # service type  private unpriv  chroot  wakeup  maxproc command
      #               (yes)   (yes)   (yes)   (never) (100)
      # =============================================================
      filter    unix  -       n       n       -       10      pipe
        flags=Rq user=filter argv=/some/where/filter -f ${sender} -- ${recipient}

Instead of a limit of 10 concurrent processes, use whatever process
limit is feasible for your machine.  Content inspection software
can gobble up a lot of system resources, so you don't want to have
too much of it running at the same time.
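
Step 2 above (the spool directory) can be sketched as follows. The
make_filter_spool helper is a hypothetical name used only for this
illustration. Creating the "filter" account itself is not shown,
because the command for that differs from system to system.

```shell
#!/bin/sh
# make_filter_spool DIR: create DIR and make it accessible to its
# owner only (mode 700). Hypothetical helper, not part of Postfix.
make_filter_spool() {
    mkdir -p "$1" && chmod 700 "$1"
}

# As root on the mail server, this might be used as follows, assuming
# the "filter" account from step 1 already exists:
#   make_filter_spool /var/spool/filter
#   chown filter /var/spool/filter
```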

To turn on content filtering for mail arriving via SMTP only, append
"-o content_filter=filter:dummy" to the master.cf entry that defines
the Postfix SMTP server:

    /etc/postfix/master.cf:
      # =============================================================
      # service type  private unpriv  chroot  wakeup  maxproc command
      #               (yes)   (yes)   (yes)   (never) (100)
      # =============================================================
      smtp      inet  ...other stuff here, do not change...   smtpd
            -o content_filter=filter:dummy

The content_filter configuration parameter accepts the same syntax
as the right-hand side in a Postfix transport table.  Execute
"postfix reload" to complete the change.

To turn off content filtering, edit the master.cf file, remove the
"-o content_filter=filter:dummy" text from the entry that defines
the Postfix SMTP server, and execute another "postfix reload".

With the shell script as shown above you will lose a factor of four
in Postfix performance for transit mail that arrives and leaves
via SMTP. You will lose another factor in transit performance for
each additional temporary file that is created and deleted in the
process of content filtering.  The performance impact is less for
mail that is submitted or delivered locally, because such deliveries
are already slower than SMTP transit mail.

Simple content filter limitations
=================================

The problem with content filters like the one above is that they
are not very robust. The reason is that the software does not talk
a well-defined protocol with Postfix. If the filter shell script
aborts because the shell runs into some memory allocation problem,
the script will not produce a nice exit status as defined in the
file /usr/include/sysexits.h.  Instead of going to the deferred
queue, mail will bounce.  The same lack of robustness can happen
when the content filtering software itself runs into a resource
problem.
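
One partial mitigation is to normalize the exit status of the content
filtering software, so that anything unexpected defers the mail
instead of bouncing it. The sketch below assumes a hypothetical
run_checked wrapper; it is not something Postfix provides. It cannot
help when the shell that runs the wrapper itself fails, which is the
case described above.

```shell
#!/bin/sh
# Exit codes from <sysexits.h>.
EX_TEMPFAIL=75
EX_UNAVAILABLE=69

# run_checked COMMAND [ARG...]: run the command and normalize its status.
#   0  -> 0   (content accepted)
#   69 -> 69  (content rejected, mail is bounced)
#   anything else, including death by signal -> 75 (mail is deferred)
run_checked() {
    "$@"
    status=$?
    case "$status" in
        0|"$EX_UNAVAILABLE") return "$status" ;;
        *) return "$EX_TEMPFAIL" ;;
    esac
}
```

In the simple filter script, the content check could then be invoked
as "run_checked filter <in.$$", with the script exiting with whatever
non-zero status the wrapper returns.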

The simple content filter method is not suitable for content filter
actions that are invoked via header_checks or body_checks patterns.
These patterns will be applied again after mail is re-injected with
the Postfix sendmail command, resulting in a mail filtering loop.
Instead, use the advanced content filtering method (see below) and
turn off header_checks or body_checks patterns for filtered mail.

Advanced content filtering example
==================================

The second example is more complex, but can give better performance,
and is less likely to bounce mail when the machine runs into some
resource problem.  This approach requires content filtering software
that can receive and deliver mail via SMTP.

For content filtering software that is not SMTP capable, Bennett
Todd's SMTP proxy implements a nice Perl/SMTP content filtering
framework. See: http://bent.latency.net/smtpprox/

The example given here filters all mail, including mail that arrives
via SMTP and mail that is locally submitted via the Postfix sendmail
command. See examples near the end of this document for how to
exclude local users from filtering.

You can expect to lose about a factor of two in Postfix performance
for mail that arrives and leaves via SMTP, provided that the content
filter creates no temporary files. Each temporary file created by
the content filter adds another factor to the performance loss.

In the text that follows we will set up a content filtering program
that receives SMTP mail via localhost port 10025, and that submits
SMTP mail back into Postfix via localhost port 10026.

      ..................................
      :            Postfix             :
   ----->smtpd \                /local---->
      :         -cleanup->queue-       :
   ---->pickup /    ^       |   \smtp----->
      :             |       v          :
      :           smtpd    smtp        :
      :           10026     |          :
      ......................|...........
                    ^       |
                    |       v
                ....|............
                :   |     10025 :
                :     filter    :
                :               :
                .................

To enable content filtering in this manner, specify in main.cf:

/etc/postfix/main.cf:
    content_filter = scan:localhost:10025
    receive_override_options = no_address_mappings

- The "content_filter" line causes Postfix to add one content
  filtering record to each incoming mail message, with content
  scan:localhost:10025.  The content filtering records are added
  by the smtpd, pickup and qmqpd servers.

- The "receive_override_options" line disables address manipulation
  before the content filter, so that the content filter sees the
  original mail addresses instead of the result of virtual alias
  expansion, canonical mapping, automatic bcc, address masquerading,
  etc.

To turn off content filtering, delete or comment out the two main.cf
lines above. All the changes made in the text below have no effect
when content filtering is turned off.

Content filter information is stored in queue files; this is how
Postfix keeps track of what mail needs filtering.  When a queue
file contains content filter information, the queue manager will
deliver the mail to the specified content filter regardless of its
final destination.

In this example, "scan" is an instance of the Postfix SMTP client
with slightly different configuration parameters. This is how
one would set up the service in the Postfix master.cf file:

/etc/postfix/master.cf:
    # =============================================================
    # service type  private unpriv  chroot  wakeup  maxproc command
    #               (yes)   (yes)   (yes)   (never) (100)
    # =============================================================
    scan      unix  -       -       n       -       10      smtp
        -o smtp_send_xforward_command=yes

- Instead of a limit of 10 concurrent processes, use whatever
  process limit is feasible for your machine.  Content inspection
  software can gobble up a lot of system resources, so you don't
  want to have too much of it running at the same time.

- With "-o smtp_send_xforward_command=yes", the scan transport will
  try to forward the original client name and IP address to the
  after-filter smtpd process, so that filtered mail is logged with
  the real client name and IP address.  See sample-smtp.cf and smtp(8).

The content filter can be set up with the Postfix spawn service,
which is the Postfix equivalent of inetd. For example, to instantiate
up to 10 content filtering processes on demand:

    /etc/postfix/master.cf:
	# ===================================================================
	# service       type  private unpriv  chroot  wakeup  maxproc command
	#                     (yes)   (yes)   (yes)   (never) (100)
	# ===================================================================
	localhost:10025 inet  n       n       n       -       10      spawn
	    user=filter argv=/some/where/filter localhost 10026

- "filter" is a dedicated local user account.  The user will never
  log in, and can be given a "*" password and non-existent shell
  and home directory.  This user handles all potentially dangerous
  mail content - that is why it should be a separate account.

- In the above example, Postfix listens on port localhost:10025.
  If you want to have your filter listening on port localhost:10025
  instead of Postfix, then you must run your filter as a stand-alone
  program, and not use the Postfix spawn service.

The simplest content filter just copies SMTP commands and data
between its inputs and outputs. If it has a problem, all it has to
do is to reply to an input of `.' with `550 content rejected', and
to disconnect without sending `.' on the connection that injects
mail back into Postfix.

The job of the content filter is to either bounce mail with a
suitable diagnostic, or to feed the mail back into Postfix through
a dedicated listener on localhost port 10026:

/etc/postfix/master.cf:
    # ===================================================================
    # service       type  private unpriv  chroot  wakeup  maxproc command
    #                     (yes)   (yes)   (yes)   (never) (100)
    # ===================================================================
    localhost:10026 inet  n       -       n       -       10      smtpd
        -o content_filter=
        -o receive_override_options=no_unknown_recipient_checks,no_header_body_checks
        -o smtpd_helo_restrictions=
        -o smtpd_client_restrictions=
        -o smtpd_sender_restrictions=
        -o smtpd_recipient_restrictions=permit_mynetworks,reject
        -o mynetworks=127.0.0.0/8
        -o smtpd_authorized_xforward_hosts=127.0.0.0/8

- Note: do not use spaces around the "=" or "," characters.

- Note: this after-filter SMTP server must not have a smaller process
  limit than the "scan" master.cf entry that feeds the content filter.

- The "-o content_filter=" overrides main.cf and requests no content
  filtering for incoming mail. This is required or else mail will
  stay in the content filtering loop.

- The "-o receive_override_options" overrides main.cf. It is
  complementary to the options that are specified in main.cf:

    - Disable attempts to find out if a recipient is unknown, and
      disable header/body checks. This work was already done before
      the content filter and repeating it would be wasteful.

    - Enable virtual alias expansion, canonical mappings, address
      masquerading, and other address mappings.

  These receive override options are either implemented by the SMTP
  server itself, or they are passed on to the cleanup server.

- The "-o smtpd_xxx_restrictions" and "-o mynetworks=127.0.0.0/8"
  settings override main.cf. They turn off junk mail controls that
  would only waste time here.

- With "-o smtpd_authorized_xforward_hosts=127.0.0.0/8", the
  after-filter smtpd process accepts the original client name and
  IP address forwarded by the scan transport, so that filtered mail
  is logged with the real client name and IP address.  See
  sample-smtpd.cf and smtpd(8).

Filtering mail from outside users only
======================================

The easiest approach is to configure ONE Postfix instance with
multiple SMTP server IP addresses in master.cf:

- Two SMTP server IP addresses for inside users only that never
  invoke content filtering.

- One SMTP server address for outside users that always invokes
  content filtering.

/etc/postfix/master.cf:
    # SMTP service for internal users only, no content filtering.
    1.2.3.4:smtp        inet  n       -       n       -       -       smtpd
        -o smtpd_client_restrictions=permit_mynetworks,reject
    127.0.0.1:smtp      inet  n       -       n       -       -       smtpd
        -o smtpd_client_restrictions=permit_mynetworks,reject

    # SMTP service for external users, with content filtering.
    1.2.3.5:smtp        inet  n       -       n       -       -       smtpd
        -o content_filter=foo:bar
        -o receive_override_options=no_address_mappings

After this, you can follow the same procedure as outlined in the
"advanced" or "simple" content filtering examples above, except
that you do not need to specify "content_filter" settings in the
main.cf file.

Getting really nasty
====================

The above filtering configurations are static. Mail that follows
a given path is either always filtered or it is never filtered. As
of Postfix 2.0 you can also turn on content filtering on the fly
with the FILTER action:

    FILTER foo:bar

You can do this in smtpd access maps as well as the cleanup server's
header/body_checks.  This feature must be used with great care:
you must disable all the UCE features in the after-filter smtpd
and cleanup daemons or else you will have a content filtering loop.

Limitations:

- FILTER actions from smtpd access maps and header/body_checks take
  precedence over filters specified with the main.cf content_filter
  parameter.

- If a message triggers more than one filter action, only the last
  one takes effect.

- The same content filter is applied to all the recipients of a
  given message.
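
As an illustration of the FILTER action described above, the sketch
below writes a sample regexp-style header_checks file to a scratch
location. The "inspect-me" pattern, the file names, and the "foo:bar"
filter destination are placeholders; substitute the transport and
destination of your own content filter.

```shell
#!/bin/sh
# Write a sample regexp-style header_checks file to a scratch location.
# The pattern and the "foo:bar" filter destination are placeholders.
CHECKS_FILE="${TMPDIR:-/tmp}/header_checks.sample.$$"

cat >"$CHECKS_FILE" <<'EOF'
# Send mail whose Subject contains "inspect-me" to the foo:bar filter.
/^Subject:.*inspect-me/    FILTER foo:bar
EOF

# To use such a file, main.cf would need a setting along the lines of:
#   header_checks = regexp:/etc/postfix/header_checks
```

As noted above, header/body checks must be disabled for mail that
comes back from the content filter, or the FILTER action will be
triggered again and the mail will loop.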
