More On Migrating From Drupal To Blogofile

I wrote an earlier post on converting a Drupal site to Blogofile. I couldn’t be happier with how that turned out as it allowed me to immediately start using Blogofile while still keeping my old Drupal content online. Sweet! But who wants to continue running the two systems in parallel forever? I certainly don’t, and after a code sprint I came up with a way to completely transition off Drupal.

First, I manually converted the most popular nodes on my Drupal site to Blogofile. I moved all of my Project content to GitHub, and wrote a Blogofile photo album that replaced 99% of the Drupal photo album functionality that I actually used. At the end of this process, there were only Blog, Story, and Page nodes left in Drupal.

Next, I changed drupalmigrate.py to generate a bunch of Blogofile post files for each of those remaining Drupal nodes. I added these settings to my _config.py:

controllers.drupalmigrate.makeposts = True
controllers.drupalmigrate.mainusername = 'kirk'
controllers.drupalmigrate.startpostnum = 1000

then ran “../bin/blogofile build -v” to create about 70 new post files. When that finished successfully, I removed those settings.

Finally, I added new code to drupalmigrate.py to generate a different set of RewriteRules that redirect the old Drupal node permalinks to their new Blogofile locations. It makes rules like:

RewriteRule ^/yet-another-python-map(/|$) /2008/04/16/yet-another-python-map/ [R=301,L]

where /yet-another-python-map is the node’s old location in Drupal and /2008/04/16/yet-another-python-map/ is the new Blogofile URL. To activate that, I added this to my _config.py:

controllers.drupalmigrate.makepermalinkredirs = True
controllers.drupalmigrate.redirectrulefile = '_generatedfiles/redirectrewriterules.txt'

With this in place, there was nothing left in Drupal so I decommissioned it and changed my Apache configuration to something like:

    <VirtualHost *:80>
        ServerName honeypot.net
        ServerAlias www.honeypot.net
        CustomLog /var/log/httpd/honeypot.net-access.log combined
        DirectoryIndex index.html
        DocumentRoot /usr/local/www/honeypot.net/honeypot/_site
        <Directory /usr/local/www/honeypot.net/honeypot/_site>
            Order Deny,Allow
            Allow from All
        </Directory>

        RewriteEngine On
        Include /usr/local/www/honeypot.net/honeypot/_generatedfiles/redirectrewriterules.txt
    </VirtualHost>

I’ll leave the Drupal node redirects in there permanently so that all the links to my site will continue to work.
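
For illustration, here is a minimal Python sketch of how redirect rules in the format shown above could be generated. It is not the actual drupalmigrate.py code: the function names and the dictionary of old-to-new paths are hypothetical, and it assumes the old Drupal path and the new Blogofile permalink are already known for each node.

```python
def make_redirect_rule(old_path, new_url):
    """Build one Apache RewriteRule that 301-redirects an old Drupal path.

    Hypothetical helper for illustration; not taken from drupalmigrate.py.
    """
    # Normalize so the pattern starts with exactly one slash and the
    # target ends with one, matching the rule format shown above.
    old = "/" + old_path.strip("/")
    new = "/" + new_url.strip("/") + "/"
    return "RewriteRule ^%s(/|$) %s [R=301,L]" % (old, new)


def make_redirect_rules(mapping):
    """Return sorted rules for a dict of {old Drupal path: new Blogofile URL}."""
    return [make_redirect_rule(old, new) for old, new in sorted(mapping.items())]


rules = make_redirect_rules({
    "/yet-another-python-map": "/2008/04/16/yet-another-python-map/",
})
print("\n".join(rules))
```

Writing the resulting lines to a file gives something Apache can pull in with an Include directive, as in the configuration above.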

Migrating Drupal To Blogofile

I have a Drupal site with nearly a thousand nodes, several having over 100,000 hits. I wanted to migrate to Blogofile but absolutely did not want to start over or make this a major hassle. Instead, I used some Apache RewriteRules to gradually and seamlessly switch from Drupal to Blogofile one post at a time. Here’s how I did it, using my site’s real name of http://honeypot.net/ to give concrete examples:

  1. Migrated my comments to Disqus. Drupal’s own disqus module has built-in export functionality to make this a snap.
  2. Set up internal DNS to add a new A record, drupal.honeypot.net, for my Drupal site. Since this is a site that no one ever needs to visit directly, an entry in /etc/hosts should work just as well.
  3. Renamed my Apache’s honeypot.net.conf to drupal.honeypot.net.conf and changed the ServerName to “drupal.honeypot.net”.
  4. Created a new honeypot.net.conf to serve only static Blogofile files while passing all other requests through to drupal.honeypot.net. Here’s the shortened version of it:

    <VirtualHost *:80>
        ServerName honeypot.net
        ServerAlias www.honeypot.net
        CustomLog /var/log/httpd/honeypot.net-access.log combined
        DirectoryIndex index.html
        DocumentRoot /usr/local/www/honeypot.net/honeypot/_site
        <Directory /usr/local/www/honeypot.net/honeypot/_site>
            Order Deny,Allow
            Allow from All
        </Directory>
    
    
    
        # Serve some stuff locally but pass everything else through to Drupal
        RewriteEngine On
        Include /usr/local/www/honeypot.net/honeypot/_generatedfiles/rewriterules.txt
        RewriteRule ^/?(.*)$ http://drupal.honeypot.net/$1 [P,L]
    

    </VirtualHost>

The last lines are where the magic happens. I created a new Blogofile controller: drupalmigrate.py. It creates the file mentioned above (rewriterules.txt) that tells Apache not to tamper with any file generated by Blogofile. That file has entries like:

RewriteRule ^/$ - [L]
RewriteRule ^/2011(/|$) - [L]
RewriteRule ^/archive(/|$) - [L]
RewriteRule ^/category(/|$) - [L]
RewriteRule ^/favicon.ico$ - [L]
RewriteRule ^/feed(/|$) - [L]
RewriteRule ^/filtering-spam-postfix(/|$) - [L]
RewriteRule ^/index.html$ - [L]
RewriteRule ^/my-ecco-shoes-are-junk(/|$) - [L]
RewriteRule ^/page(/|$) - [L]
RewriteRule ^/robots.txt$ - [L]
RewriteRule ^/scam-calls-card-services(/|$) - [L]
RewriteRule ^/theme(/|$) - [L]

for every file in Blogofile’s _site directory. When an incoming web request matches one of those patterns, Apache stops processing any further RewriteRules and serves the file directly from the DocumentRoot directory. If none of those patterns match, the final RewriteRule in my honeypot.net.conf file makes a proxy request for the same path from http://drupal.honeypot.net/ .

That is, if a visitor requests http://honeypot.net/robots.txt, the RewriteRule in rewriterules.txt will cause Apache to serve the file from /usr/local/www/honeypot.net/honeypot/_site/robots.txt. If a visitor requests http://honeypot.net/something-else, Apache will request the file from http://drupal.honeypot.net/something-else and return the results. None of this is visible to the user. They don’t receive any redirects, links to drupal.honeypot.net, or any other indication that they’re seeing content from two different systems.
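
To make the idea concrete, here is a rough Python sketch of how a file like rewriterules.txt could be built from the top-level entries of a site directory. This is an assumption about the approach, not the real drupalmigrate.py: the function name is hypothetical, and names are written into the patterns unescaped, as in the sample entries above.

```python
import os
import tempfile


def make_passthrough_rules(site_dir):
    """Return RewriteRules telling Apache to serve top-level entries itself.

    Hypothetical sketch. Names are used unescaped, mirroring the sample
    rules above; a stricter version would escape regex metacharacters.
    """
    rules = ["RewriteRule ^/$ - [L]"]  # the site root itself
    for name in sorted(os.listdir(site_dir)):
        if os.path.isdir(os.path.join(site_dir, name)):
            # A directory: match it and everything underneath it.
            rules.append("RewriteRule ^/%s(/|$) - [L]" % name)
        else:
            # A plain file: match that exact path only.
            rules.append("RewriteRule ^/%s$ - [L]" % name)
    return rules


# Demonstrate with a throwaway directory standing in for _site.
with tempfile.TemporaryDirectory() as site:
    os.mkdir(os.path.join(site, "archive"))
    open(os.path.join(site, "robots.txt"), "w").close()
    demo_rules = make_passthrough_rules(site)
    print("\n".join(demo_rules))
```

Only the top level of the directory needs to be listed, because a rule such as ^/archive(/|$) already covers everything beneath that directory.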

The controller also creates a list of every node in my Drupal database so I can <%include /> it into my site’s index.html.mako file like this:

<hr />
<p>These posts haven't been converted to the new system but are still available:</p>
<%include file="_templates/olddrupalindex.mako" />

When drupalmigrate.py is generating the site index, it skips any Drupal nodes that have the same permalink as a Blogofile post. As I take my time converting my Drupal content, more and more will be served by Blogofile until eventually none is left.
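
The skipping step amounts to a set difference. A minimal Python sketch, assuming the Drupal nodes are available as (permalink, title) pairs and the Blogofile permalinks as a list of strings (both shapes are my assumption, and “/some-old-story” is a made-up example):

```python
def unconverted_nodes(drupal_nodes, blogofile_permalinks):
    """Return the Drupal nodes that do not yet have a Blogofile post.

    drupal_nodes: list of (permalink, title) tuples (assumed shape).
    blogofile_permalinks: iterable of permalinks already served by Blogofile.
    """
    # Compare with surrounding slashes stripped so "/foo" and "/foo/" match.
    converted = {p.strip("/") for p in blogofile_permalinks}
    return [(link, title) for link, title in drupal_nodes
            if link.strip("/") not in converted]


nodes = [
    ("/filtering-spam-postfix", "Filtering Spam With Postfix"),
    ("/some-old-story", "Some Old Story"),  # hypothetical node
]
remaining = unconverted_nodes(nodes, ["/filtering-spam-postfix/"])
print(remaining)
```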

By starting with the most popular content and working my way down, I can make sure that all my heavy-traffic pages are served as lightning-fast static pages. This could even work as a form of website accelerator where popular pages are “compiled” by Blogofile for fast access. By fast, I mean that my untuned home server can sustain about 9,100 hits and 240MB of traffic per second. Until Google decides to use me for their home page, I think that’ll be sufficient for my little site.

Making DOS USB Images On A Mac

I needed to run a BIOS flash utility that was only available for DOS. To complicate matters, the server I needed to run it on doesn’t have a floppy or CD-ROM drive. I figured I’d hop on the Internet and download a bootable USB flash drive image. Right? Wrong.

I found a lot of instructions for how to make such an image if you already have a running Windows or Linux desktop, but they weren’t very helpful for me and my Mac. After some trial and error, I managed to create my own homemade bootable USB flash drive image. It’s available at http://www.mediafire.com/?aoa8u1k1fedf4yq if you just want a premade ready-to-download file.

If you want a custom version, or you don’t trust the one I’ve made – and who’d blame you? I’m some random stranger on the Internet! – here’s how you can make your own bootable image under OS X:

Relax!

  1. There are a lot of steps, but they’re easy! I wanted to err on the side of being more detailed than necessary, rather than skipping “obvious” steps that might not be quite so easy for people who haven’t done this before.

Download VirtualBox and install it

  1. Download VirtualBox. I used version 4.1.4. The version available to you today might look different but should work mostly the same way.
  2. Open the “VirtualBox-[some-long-number]-OSX.dmg” disk image.
  3. Double-click the “VirtualBox.mpkg” icon to run the installer.
  4. Click “Continue”.
  5. Click “Continue”.
  6. Click “Install”.
  7. Enter your password and click “Install Software”.
  8. When it’s finished copying files, etc., click “Close”.

Download FreeDOS and create a virtual machine for it

  1. Download the FreeDOS “Base CD” called “fdbasecd.iso”. Note: the first mirror I tried to download from didn’t work. If that happens, look around on the other mirrors until you find one that does.
  2. Open your “Applications” folder and run the “VirtualBox” program.
  3. Click the “New” button to create a new virtual machine. This launches the “New Virtual Machine Wizard”. Click “Continue” to get past the introduction.
  4. Name your new VM something reasonable. I used “FreeDOS”, and whatever name you enter here will appear throughout all the following steps so you probably should, too.
  5. Set your “Operating System” to “Other”, and “Version” to “DOS”. (If you typed “FreeDOS” in the last step, this will already be done for you.) Continue.
  6. Leave the “Base Memory Size” slider at 32MB and continue.
  7. Make sure “Start-up Disk” is selected, choose “Create new hard disk”, and continue.
  8. Select “File type” of “VDI (VirtualBox Disk Image)” and continue.
  9. Select “Dynamically allocated” and continue.
  10. Keep the default “Location” of “FreeDOS”.
  11. Decision time: how big do you want to make your image? The full install of FreeDOS will take about 7MB, and you’ll want to leave a little room for your own files. On the other hand, the larger you make this image, the longer it’ll take to copy onto your USB flash drive. You certainly don’t want to make it so large that it won’t actually fit on your USB flash drive. An 8GB nearly-entirely-empty image will be worthless if you only have a 2GB drive. I splurged a little and made my image 32MB (by clicking in the “Size” textbox and typing “32MB”. I hate size sliders.). Click “Continue”.
  12. Click “Create”.
  13. Make sure your new “FreeDOS” virtual machine is highlighted on the left side of the VirtualBox window.
  14. On the right-hand side, look for the section labeled “Storage” and click on the word “Storage” in that title bar.
  15. Click the word “Empty” next to the CD-ROM icon.
  16. Under “Attributes”, click the CD-ROM icon to open a file chooser, select “Choose a virtual CD/DVD disk file…”, and select the FreeDOS Base CD image you downloaded at the beginning. It’ll probably be in your “Downloads” folder. When you’ve selected it, click “Open”.
  17. Back on the “FreeDOS – Storage” window, click “OK”.

Install FreeDOS

  1. Back on the main VirtualBox window, near the top, click “Start” to launch the virtual machine you just made.
  2. A note about VirtualBox: when you click the VM window or start typing, VirtualBox will “capture” your mouse cursor and keyboard so that all key presses will go straight to the VM and not your OS X desktop. To get them back, press the left [command] key on your keyboard.
  3. At the FreeDOS boot screen, press “1” and [return] to boot from the CD-ROM image.
  4. Hit [return] to “Install to harddisk”.
  5. Hit [return] to select English, or the up and down keyboard arrow keys to choose another language and then [return].
  6. Hit [return] to “Prepare the harddisk”.
  7. Hit [return] in the “XFDisk Options” window.
  8. Hit [return] to open the “Options” menu. “New Partition” will be selected. Hit [return] again. “Primary Partition” will be selected. Again, [return]. The maximum drive size should appear in the “Partition Size” box. If not, change that value to the largest number it will allow. Hit [return].
  9. Do you want to initialize the Partition Area? Yes. Hit [return].
  10. Do you want to initialize the whole Partition Area? Oh, sure. Press the left arrow key to select “YES”, then hit [return].
  11. Hit [return] to open the “Options” menu again. Use the arrow keys to scroll down to “Install Bootmanager” and hit [return].
  12. Press [F3] to leave XFDisk.
  13. Do you want to write the Partition Table? Yep. Press the left arrow to select “YES” and hit [return]. A “Writing Changes” window will open and a progress bar will scroll across to 100%.
  14. Hit [return] to reboot the virtual machine.
  15. This doesn’t actually seem to reboot the virtual machine. That’s OK. Press the left [command] key to give the mouse and keyboard back to OS X, then click the red “close window” button on the “FreeDOS [running]” window to shut it down. Choose “Power off the machine” and click “OK”.
  16. Back at the main VirtualBox window, click “Start” to re-launch the VM.
  17. Press “1” and [return] to “Continue to boot FreeDOS from CD-ROM”, just like you did before.
  18. Press [return] to select “Install to harddisk” again. This will take you to a different part of the installation process this time.
  19. Select your language and hit [return].
  20. Make sure “Yes” is selected, and hit [return] to let FreeDOS format your virtual disk image.
  21. Proceed with format? Type “YES” and hit [return]. The format process will probably finish too quickly for you to actually watch it.
  22. Now you should be at the “FreeDOS 1.0 Final Distribution” screen with “Continue with FreeDOS installation” already selected. Hit [return] to start the installer.
  23. Make sure “1) Start installation of FreeDOS 1.0 Final” is selected and hit [return].
  24. You’ll see the GNU General Public License, version 2 text. Follow that link and read it sometime; it’s pretty brilliant. Hit [return] to accept it.
  25. Ready to install the FreeDOS software? You bet. Hit [return].
  26. Hit [return] to accept the default installation location.
  27. “YES”, the above directories are correct. Hit [return].
  28. Hit [return] again to accept the selection of programs to install.
  29. Proceed with installation? Yes. Hit [return].
  30. Watch in amazement at how quickly the OS is copied over to your virtual disk image. Hit [return] to continue when it’s done.
  31. The VM will reboot. At the boot screen, press “h” and [return] to boot your new disk image. In a few seconds, you’ll see an old familiar “C:\” prompt.
  32. Press the left [command] key to release your keyboard and mouse again, then click the red “close window” icon to shut down the VM. Make sure “Power off the machine” is selected and click “OK”.

Convert the VirtualBox disk image into a “raw” image

  1. Open a Terminal.app window by clicking the Finder icon in your dock, then “Applications”, then opening the “Utilities” folder, then double-clicking “Terminal”.
  2. Copy this command, paste it into the terminal window, then hit [return]:

    /Applications/VirtualBox.app/Contents/Resources/VirtualBoxVM.app/Contents/MacOS/VBoxManage internalcommands converttoraw ~/"VirtualBox VMs/FreeDOS/FreeDOS.vdi" ~/Desktop/freedos.img
    

This will turn your VirtualBox disk image file into a “raw” image file on your desktop named “freedos.img”. It won’t alter your original disk image in any way, so if you accidentally delete or badly damage your “raw” image, you can re-run this command to get a fresh, new one.

Prepare your USB flash drive

  1. Plug your USB flash drive into your Mac.
  2. If your Mac can’t read the drive, a new dialog window will open saying “The disk you inserted was not readable by this computer.” Follow these instructions:

    1. Click “Ignore”.
    2. Go back into your terminal window and run this command:

      diskutil list
      
    3. You’ll see a list of disk devices (like “/dev/disk2”), their contents, and their sizes. Look for the one you think is your USB flash drive. Run this command to make sure, after replacing “/dev/disk2” with the actual name of the device you picked in the last step.

      diskutil info /dev/disk2
      

    Make sure the “Device / Media Name:” and “Total Size:” fields look right. If not, look at the output of diskutil list again to pick another likely candidate and repeat the step until you’re sure you’ve picked the correct drive to completely eradicate, erase, destroy, and otherwise render 100% unrecoverable. OS X will attempt to prevent you from overwriting the contents of drives that are currently in use – like, say, your main system disk – but don’t chance it. Remember the name of this drive!

  3. If your Mac did read the drive, it will have automatically mounted it and you’ll see its desktop icon. Follow these instructions:

    1. Go back into your terminal window and run this command:

      diskutil list
      
    2. Look for the drive name in the output of that command. It will have the same name as the desktop icon.

    3. Look for the name of the disk device (like “/dev/disk2”) for that drive and remember it (with the same warnings as in the section above that you got to skip).
    4. Unmount the drive by running this command:

      diskutil unmount "/Volumes/[whatever the desktop icon is called]"
      
    5. This is not the same as dragging the drive into the trash, so don’t attempt to eject it that way.

Copy your drive image onto the USB flash drive

  1. Go back to your terminal window.
  2. Run these commands, but replace “/dev/fakediskname” with the device name you discovered in the previous section:

    cd ~/Desktop
    sudo dd if=freedos.img of=/dev/fakediskname bs=1m
    
  3. After the last command finishes, OS X will automatically mount your USB flash drive and you’ll see a new “FREEDOS” drive icon on your desktop.

Add your own apps to the image

  1. Drag your BIOS flasher utility, game, or other program onto the “FREEDOS” icon to copy it onto the USB flash drive.
  2. When finished, drag the “FREEDOS” drive icon onto the trashcan to unmount it.

Done.

  1. You’re finished. Use your USB flash drive to update your computer’s BIOS, play old DOS games, or do whatever else you had in mind.
  2. Keep the “freedos.img” file around. If you ever need it again, start over from the “Prepare your USB flash drive” section which is entirely self-contained. That is, it doesn’t require any software that doesn’t come pre-installed on a Mac, so even if you’ve uninstalled VirtualBox you can still re-use your handy drive image.

How To Make A Survival Kit

On my birthday in 2005, I read a Slashdot article discussing what things you might want to take with you if you had to evacuate your home. This was only a few months after Hurricane Katrina leveled southern Louisiana and Mississippi, so quite a few people had given this a lot of recent thought.

The article started off talking about which personal documents you should take copies of (driver licenses, marriage certificates, passports, etc.) – in other words, an electronic survival kit. However, the topic soon veered off into the kinds of things you need to physically stay alive. That made me realize that I’ve never made any such preparations, short of putting some bottled water in our tornado shelter. Below is a summary of the recommendations I came across.

Note: This isn’t meant as a list of things you’ll need to form your own private society out in the desert. I have absolutely zero interest in “survivalism”; I just want to have the stuff needed to keep me and my family alive until the National Guard arrives.

Second note: I primarily wrote this for me and my family. It’s biased towards scenarios that I might have to cope with, but completely ignores things that I could never hope to deal with anyway (such as being lost at sea).

References

How To Carry It

There are two schools of thought here:

  1. Pack everything inside a small metal pan that you can use for cooking, carrying water, etc.
    • Pros: no wasted space or weight
    • Cons: small metal pans can get crushed or soaked
  2. Put everything “fragile” inside a hardened case, like an OtterBox
    • Pros: your gear stays dry and intact
    • Cons: the box itself probably isn’t very useful

Your application affects your choice very heavily. If you plan to carry mainly camping gear that’s pretty durable, the first option is probably your best choice. If you expect to carry many fragile items, such as an electronic survival kit or other small electronics, then the second is likely better. I personally use an OtterBox.

The List

Note, some of this is blatantly, word-for-word plagiarized from the above sources. My goal is to condense their ideas into one handy list, and there are only so many ways to say “strike anywhere matches”.

  • Instructions
  • Tools
    • Good, metal knife
    • Small multi-tool (for the scissors, screwdrivers, etc.)
    • Compass
    • Thermometer
    • Magnifying glass – possibly a Fresnel lens
    • Flashlight with batteries, preferably with a blinker
  • Metal dining utensils (that can be sanitized before and after use)
  • Fire starters – at least one of:
    • Strike anywhere matches in a waterproof safe
    • Firestarting piston
    • Disposable lighter
    • Magnesium/flintbar
  • Water
    • Personal water filter
    • Water purifying straw
    • Water purification tablets
  • Several sheets of paper and a pencil
  • A bottle of alcohol. Distilled, drinkable grain alcohol is best.
  • Medicine / Health
    • Anti-diarrheals
    • Aspirin
    • Antihistamines – to counter allergic reactions
    • Any other drugs you personally need to stay alive
    • Scalpel blades
    • Sunscreen
    • Suture kit
  • Homemade soda can stove
  • 5 pounds of gorp (“good old raisins and peanuts”)
  • Emergency blanket
  • Ziploc Baggies
  • CamelBak water reservoir recently filled with known good water
  • 100 feet of parachute cord
  • Wool cloth. Two shirtweight pieces 45″ x 72″. One heavier weight 60″ x 108″. These are your clothes, your hammock, your chair, your carryall, etc. Do not substitute cotton!
  • Three yards of 36″ wide cotton could come in handy as well. This is your hat, your belt, your shoulder bag, your sling, etc.
  • Clothing
    • Two pair of wool socks
    • Waterproof, windproof shell or parka. Yes, even if you’re in a tropical zone.
    • Work gloves for digging through post-disaster rubble
    • A warm hat
  • A pennywhistle or any other tiny musical instrument. If you can turn a disaster into a party, your odds of survival will go up.
  • Signalling
    • Referee’s whistle
    • Mirror
    • Mini LED flashlight
  • Money – your eventual goal is to get back to civilization
  • Repairs
    • Mini roll of duct tape
    • Sewing needle and thread
    • Safety pins
  • 9’x7′ painting tarp (to make a tent) or a few trashbags
  • Slingshot kit – can be used to kill small game or fish

Filtering Spam With Postfix

If you are responsible for maintaining an internet-connected mail-server, then you have, no doubt, come to hate spam and the waste of resources which comes with it. When I first decided to lock down my own mail-server, I found many resources that helped in dealing with these unwanted messages. Each of them contained a trick or two, but very few of them were presented in the context of running a real server, and none of them demonstrated an entire filtering framework. In the end I created my own set of rules from the bits and pieces I found, and months of experimentation and fine-tuning resulted in the system detailed in this article.

This article will show you how to configure a Postfix mail-server in order to reject the vast majority of unwanted incoming “junk email”, whether they contain unsolicited commercial email (UCE), viruses, or worms. Although my examples are specific to Postfix, the concepts are generic and can be applied to any system that can be configured at this level of detail.

For a real world example, I’ll use my server’s configuration details:

My server’s configuration details:

  • Hostname: Kanga.honeypot.net
  • Public address: 208.162.254.122
  • Postfix configuration path: /usr/local/etc/postfix

Goals

This configuration was written with two primary rules in mind:

  1. Safety is important above all else. That is, I would much rather incorrectly allow an unwanted message through my system than reject a legitimate message.
  2. The system had to scale well. A perfect system that uses all of my server’s processing power at moderate workloads is not useful.

To these ends, several checks – that may have reduced the number of “false negatives” (messages that should have been rejected but were not) at the cost of increasing the number of “false positives” (legitimate messages that were incorrectly rejected) – were avoided. The more resource-intensive tests were moved toward the end of the configuration. This way, the quick checks at the beginning eliminated the majority of UCE so that the longer tests had a relatively small amount of traffic to examine.

HELO restrictions

When a client system connects to a mail-server, it’s required to identify itself using the SMTP HELO command. Many viruses and spammers skip this step altogether, send impossibly invalid information, or lie and say that they are one of your own trusted systems – in the hopes that they will be granted undeserved access. The first step in the filtering pipeline, then, is to reject connections from clients that are severely mis-configured or that are deliberately attempting to circumvent your security. Given their simplicity, these tests are far more effective than might be imagined, and they were implemented in my main.cf file with this settings block:

1  smtpd_delay_reject = yes
2  smtpd_helo_required = yes
3  smtpd_helo_restrictions =
4     permit_mynetworks,
5     check_helo_access hash:/usr/local/etc/postfix/helo_access,
6     reject_non_fqdn_hostname,
7     reject_invalid_hostname,
8     permit

Line 1 is a fix for certain broken (but popular) clients, and is required in order to use HELO filtering at all. The second line rejects mail from any system that fails to identify itself. Line 4 tells Postfix to accept connections from any machines on my local network. The next line references an external hash table that contains a set of black- and whitelisted entries; mine looks like this:

woozle.honeypot.net     OK
honeypot.net            REJECT You are not me. Shoo!
208.162.254.122         REJECT You are not me. Shoo!

The first line in the table explicitly allows connections from my laptop so that I can send mail when connected through an external network. At this point Postfix has already been told to accept connections from all of my local machines and my short list of remote machines, so any other computer in the world that identifies itself as one of my systems is lying. Since I’m not particularly interested in mail from deceptive systems, those connections were flatly rejected.

Lines 6 through 8 complete this stage by rejecting connections from hosts that identify themselves in blatantly incorrect ways, such as “MAILSERVER” and “HOST@192.168!aol.com”.

Some administrators also use the reject_unknown_hostname option to ignore servers whose hostnames can’t be resolved, but in my experience this causes too many false positives from legitimate systems with transient DNS problems or other harmless issues.

You can test the effect of these rules, without activating them on a live system, by using the warn_if_reject option to cause Postfix to log what it would have rejected to your maillog without actually rejecting anything. For example, line 6 could be replaced with:

   warn_if_reject,
   reject_non_fqdn_hostname,

This way the results can be observed without the risk of inadvertently getting false positives.

Sender restrictions

The next step is to reject invalid senders with these options:

9  smtpd_sender_restrictions =
10    permit_sasl_authenticated,
11    permit_mynetworks,
12    reject_non_fqdn_sender,
13    reject_unknown_sender_domain,
14    permit

Lines 10 and 11 allow clients that have authenticated with a username and password or Kerberos ticket, or who are hosts on my local network, to continue onward. Lines 12 and 13 work similarly to lines 6 and 7 in the “HELO restrictions” section; if the sender’s email address is malformed or provably nonexistent, then there’s no reason to accept mail from them. The next line allows every other message to move on to the next phase of filtering.

Recipient restrictions and expensive tests

By this time, it’s clear that the client machine isn’t completely mis-configured and that the sender stands a reasonable chance of being legitimate. The final step is to see that the client has permission to send to the given recipient and to apply the remaining “expensive” tests to the small number of messages that have made it this far. Here’s how I do it:

15 smtpd_recipient_restrictions =
16   reject_unauth_pipelining,
17   reject_non_fqdn_recipient,
18   reject_unknown_recipient_domain,
19   permit_mynetworks,
20   permit_sasl_authenticated,
21   reject_unauth_destination,
22   check_sender_access hash:/usr/local/etc/postfix/sender_access,
23   check_recipient_access hash:/usr/local/etc/postfix/recipient_access,
24   check_helo_access hash:/usr/local/etc/postfix/secondary_mx_access,
25   reject_rbl_client relays.ordb.org,
26   reject_rbl_client list.dsbl.org,
27   reject_rbl_client sbl-xbl.spamhaus.org,
28   check_policy_service unix:private/spfpolicy,
29   check_policy_service inet:127.0.0.1:10023,
30   permit

Many spammers send a series of commands without waiting for authorization, in order to deliver their messages as quickly as possible. Line 16 rejects messages from those attempting to do this.

Options like lines 17 and 18 are probably becoming familiar now, and they work in this case by rejecting mail targeted at domains that don’t exist (or can’t exist). Just as in the “Sender restrictions” section, lines 19 and 20 allow local or authenticated users to proceed – which here means that their messages will not go through any more checks. Line 21 is critically important because it tells Postfix to reject messages unless the recipient’s domain is hosted locally or is one that we serve as a backup mail server for; without this line, the server would be an open relay!

The next line defines an access file named sender_access that can be used as a black- or whitelist. I use this to list my consulting clients’ mail-servers so that none of the remaining tests can inadvertently drop important requests from them. I added line 23, which creates a similar blacklist called recipient_access for recipient addresses, as an emergency response to a “joe job”. Once in 2003, a spammer forged an email address from my domain onto several million UCE messages and I was getting deluged with bounce messages to “michelle@honeypot.net”. I was able to reject these by adding an entry like:

michelle@honeypot.net REJECT This is a forged account.

Although the event was annoying, this allowed my server to continue normal operation until the storm had passed.

Line 24 is the last of the “inexpensive” checks. It compares the name that the remote system sent earlier via the HELO command to the list of my secondary mail servers and permits mail filtered through those systems to be delivered without further testing. This is the weak link in my filtering system, because if a spammer were clever enough to claim that they were one of my backup servers then my mail server would cheerfully deliver any message sent to it. In practice, though, I’ve never seen a spammer that crafty and this line could be removed without side effects should the need arise.

Lines 25 through 27 are somewhat more controversial than most of my techniques, in that they consult external DNS blackhole lists in order to selectively reject messages based on the IP address of the sender. Each of these lists has been chosen because of its good reputation and very conservative policies toward adding new entries, but you should evaluate each list for yourself before using their databases to drop messages.

SPF and greylisting

Lines 28 and 29 add Sender Policy Framework (SPF) and greylisting filtering, respectively. SPF works by looking up a DNS record that a domain can publish to list the addresses allowed to send email on its behalf. For example, webtv.net’s SPF record is currently “v=spf1 ip4:209.240.192.0/19 -all”. The address 64.4.32.7 falls outside that range, so a message claiming to be from joeuser@webtv.net but sent from that IP address is forged and can be safely rejected.
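The webtv.net example can be checked by hand with a few lines of Python. This is a deliberately tiny sketch that understands only the ip4 and all mechanisms; a real SPF implementation also handles a, mx, include, redirect and others, all of which need DNS lookups. The function name check_spf_ip4 is invented for this example.

```python
import ipaddress

# SPF qualifier characters and the results they map to.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf_ip4(record, client_ip):
    """Evaluate only the ip4 and all mechanisms of an SPF record string."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    address = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        mechanism = term.lower()
        if mechanism.startswith("ip4:"):
            network = ipaddress.ip_network(mechanism[4:], strict=False)
            if address in network:
                return QUALIFIERS[qualifier]
        elif mechanism == "all":
            return QUALIFIERS[qualifier]
        # Every other mechanism is skipped in this sketch.
    return "neutral"  # nothing matched


if __name__ == "__main__":
    record = "v=spf1 ip4:209.240.192.0/19 -all"
    print(check_spf_ip4(record, "64.4.32.7"))
    print(check_spf_ip4(record, "209.240.200.1"))
```

The first call falls through to “-all” and fails, while the second address sits inside the published /19 range and passes.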

Greylists are pure gold when it comes to rejecting junk email. Whenever a client attempts to send mail to a particular recipient, the greylist server will attempt to find that client’s address and the recipient’s address in its database. If there is no such entry then one will be created and stamped with the current time, and Postfix will use a standard SMTP error message to tell the client that the recipient’s mailbox is temporarily unavailable and to try again later. It will then continue to reject similar attempts until that timestamp reaches a certain age (mine is set to five minutes). The theory behind this is that almost no special-purpose spam sending software will actually attempt to re-send the message, but almost every legitimate mail server in existence will gladly comply and send the queued message a short time later. This simple addition cut my incoming junk email load by over 99% at the small cost of having to wait an extra five minutes to receive email for the first time from a new contact. It has worked flawlessly with the many mailing lists that my clients and I subscribe to and has not caused any collateral problems that I am aware of. If you take nothing else from this article, let it be that greylisting is a Good Thing and your customers will love you for using it.
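The greylisting logic described above is small enough to sketch in Python. This is an in-memory illustration only, not how any particular greylisting server is written: it keys on the client address and recipient as described here (many real implementations also include the sender address), it never expires old entries, and the class name Greylist and its parameters are made up for the example.

```python
import time


class Greylist:
    """Defer mail from unseen (client, recipient) pairs for a fixed delay."""

    def __init__(self, delay_seconds=300, clock=time.time):
        self.delay_seconds = delay_seconds  # five minutes by default
        self.clock = clock                  # injectable so it can be tested
        self.first_seen = {}                # (client, recipient) -> timestamp

    def check(self, client_address, recipient):
        """Return 'defer' until the pair's entry is old enough, then 'accept'."""
        key = (client_address, recipient.lower())
        now = self.clock()
        # Create the entry on first contact; otherwise keep the original time.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay_seconds:
            return "accept"
        return "defer"  # the mail server would answer with a temporary error


if __name__ == "__main__":
    greylist = Greylist(delay_seconds=300)
    # A first attempt from a new client is always deferred.
    print(greylist.check("192.0.2.1", "kirk@honeypot.net"))
```

A legitimate mail server that retries after the delay gets “accept”; software that never retries simply never gets through.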

I use the smtpd-policy.pl script that ships with Postfix to handle SPF, and Postgrey as an add-on greylisting policy server. They’re defined in my master.cf file as:

spfpolicy  unix  -       n       n       -       -       spawn
  user=nobody argv=/usr/bin/perl /usr/local/libexec/postfix/smtpd-policy.pl
greypolicy  unix  -       n       n       -       -       spawn
  user=nobody argv=/usr/bin/perl /usr/local/libexec/postfix/greylist.pl

Content filtering

The messages remaining at this point are very likely to be legitimate, although all that Postfix has actually done so far is enforce SMTP rules and reject known spammers. Their final hurdle on the way from their senders to my users’ mailboxes is to pass through a spam filter and an antivirus program. The easiest way to do this with Postfix is to install AMaViS, SpamAssassin, and ClamAV and then configure Postfix to send messages to AMaViS (which acts as a wrapper around the other two) before delivering them, and line 31 does exactly that:

31 content_filter = smtp-amavis:127.0.0.1:10024

SpamAssassin is fantastic at identifying spam correctly. Its “hit rate” while used in this system will be much lower than if it were receiving an unfiltered stream, as most of the easy targets have already been removed. I recommend setting AMaViS to reject only the messages with the highest spam scores; I arbitrarily picked a score of 10.0 on my systems. Then, tag the rest with headers that your users can use to sort mail into junk folders.
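The reject-or-tag policy boils down to a threshold comparison, sketched below in Python. This is only an illustration of the decision, not AMaViS configuration: the 10.0 reject level is the one I mentioned, while the 5.0 flag level, the header names, and the function name filter_decision are assumptions chosen for the example.

```python
def filter_decision(score, reject_at=10.0, flag_at=5.0):
    """Reject very high scores; otherwise deliver with spam headers attached."""
    if score >= reject_at:
        return "reject", {}
    headers = {
        "X-Spam-Score": f"{score:.1f}",
        # Users can sort on this flag to move likely spam into a junk folder.
        "X-Spam-Flag": "YES" if score >= flag_at else "NO",
    }
    return "deliver", headers


if __name__ == "__main__":
    for score in (12.5, 6.0, 1.0):
        print(score, filter_decision(score))
```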

ClamAV has proven itself to be an effective and trustworthy antivirus filter, and I now discard any message that it identifies as having a viral payload.

Unfortunately, the configuration of these systems is more complicated than I could hope to cover in this article, as it depends heavily on the setup of the rest of your email system. The good news is that literally less than 1% of junk email makes it this far into my system, and I’ve actually gone for several days without realizing that SpamAssassin or ClamAV hadn’t been restarted after an upgrade. Still, these extra filters are very good at catching those last few messages that make it through the cracks.

Other possibilities

If you want more aggressive filtering and can accept the increased risk of false positives, consider some of the other less-conservative blackhole lists such as the ones run by SPEWS or the various lists of blocks of dynamic IP addresses. You may also consider using the reject_unknown_hostname option mentioned in the “HELO restrictions” section, but you can expect a small, measurable increase in false positives.

The ruleset described above should be sufficient on its own to eliminate the vast majority of junk email, so your time would probably be better spent implementing and adjusting it before testing other measures.

Conclusion

None of the techniques I use are particularly difficult to implement, but I faced quite a challenge in assembling them into a coherent system from the scraps I found lying about in various web pages and mailing lists.

The most important thing I learned from the process was that it’s easy to experiment with Postfix, and it can be customized to your level of comfort. When used in my configuration, the most effective filters are:

  • Greylisting
  • DNS blackhole lists
  • HELO enforcement

Greylisting has proven to be an excellent filter and I’ve deployed it successfully on several networks with nothing but positive feedback from their owners. Even the basic HELO filtering, though, can visibly decrease spam loads and should be completely safe. It can be difficult to find a good compromise between safety and effectiveness, but I believe I’ve found a solid combination that should work in almost any situation. Don’t be afraid to test these ideas on your own and make them a part of your own anti-spam system!

Notes and resources

Postfix Configuration – UCE Controls
SPF: Sender Policy Framework
Greylisting.org – a great weapon against spammers
Postgrey – Postfix Greylisting Policy Server
AMaViS – A Mail Virus Scanner
The Apache SpamAssassin Project
Clam AntiVirus

Copyright information

This article is made available under the “Attribution-Sharealike” Creative Commons License 2.5 available from http://creativecommons.org/licenses/by-sa/2.5/.

Reprint information

This article was originally published in Free Software Magazine. I’m reposting it here for backup purposes.