Emacs, part 2 - mail.

Why email in Emacs?

The obvious answer: for the text editing facilities that Emacs provides (evil-mode, etc.). Another answer: the potential for extensibility. As an example, when in message-mode, it's possible to use orgstruct-mode to utilize Org's structure editing capabilities. On top of that, you can use org-mime's org-mime-htmlize to export org structure to HTML, creating nicely formatted messages to impress your friends.

Choices to consider

When choosing an email client in Emacs, there are potentially several variables to be considered. If using one client, it may be necessary to install something that handles IMAP synchronization. When using another (Gnus), it may be possible to configure everything you need inside Emacs.

Mail synchronization

Based on conjecture and anecdote, I've found the following to be true:

  • OfflineIMAP and mbsync are the only choices (well, at least the best choices by a good margin)
  • mbsync is faster
  • mbsync is less buggy
  • The documentation for both is not particularly good

Trusting the Internet, I went with mbsync.

Mail indexer/searcher/interface

The most frequently recommended Emacs clients tended to be:

  • mu4e
  • notmuch
  • Gnus
  • Wanderlust

Making a choice here came down to my personal preference for Gmail-like mail consumption. I found this Reddit comment very helpful, and it ultimately convinced me that notmuch was the right way to go (for me, at least). If you're interested in the others, this Stack Exchange post contains a lot of good, detailed information.

Mail sender

As with OfflineIMAP vs isync, msmtp seemed to be the de facto standard for sending emails with Emacs, so that's what I picked. It's possible to communicate with an SMTP server directly via smtpmail.el, but supporting multiple accounts can be a little unwieldy (from what I understand).

Setup

Given that I currently use macOS, these instructions only cover setup and configuration on macOS.

Install necessary software

Assumptions: Emacs and homebrew are already installed and configured.

Here's what we'll be installing via homebrew:

  • GnuPG for password management
  • isync for IMAP synchronization
  • notmuch for a mail indexer/Emacs mail client
  • msmtp for sending mail
$ brew install gpg gpg-agent pinentry-mac isync msmtp

Because we want some additional features with notmuch, we'll run a separate command with some flags:

$ brew install notmuch --with-emacs --with-python3

In addition to the above, we want to preemptively install Berkeley DB for use with Bogofilter:

$ brew install berkeley-db@4

Bogofilter is available via homebrew, but I couldn't get it to work. I compiled it from source instead. The following commands should do the trick:

$ wget https://downloads.sourceforge.net/project/bogofilter/bogofilter-1.2.4/bogofilter-1.2.4.tar.bz2
$ tar -vxjf bogofilter-1.2.4.tar.bz2
$ cd bogofilter-1.2.4

Once cd'd to bogofilter-1.2.4, follow the INSTALL instructions. I didn't have to make any changes to get it to work.

Configuration

  • GnuPG

    If you don't already have a PGP key, create it:

    $ gpg --gen-key
    

    Follow the prompts to complete the process. Next, create a new directory to store encrypted password files:

    $ mkdir ~/.passwd
    

    Create a file that contains your password in plain text, then run the following command:

    $ gpg --output <name>.gpg --encrypt --recipient <you>@example.com <source-file> && rm <source-file>
    

    <source-file> should be the file that contains your password in plain text.

    By default, gpg-agent uses a curses-based pinentry, and I couldn't get it to stop prompting me for my password. To avoid re-entering your password, we need to tell gpg-agent that we'd like to use pinentry-mac, which allows us to use macOS's Keychain feature to store our password. First, let's create a config file for gpg-agent:

    $ touch ~/.gnupg/gpg-agent.conf
    

    Open the newly created file and paste the following:

    # Connects gpg-agent to the OSX keychain via the brew-installed
    # pinentry program from GPGtools. This is the OSX 'magic sauce',
    # allowing the gpg key's passphrase to be stored in the login
    # keychain, enabling automatic key signing.
    
    pinentry-program /usr/local/bin/pinentry-mac
    
  • mbsync

    We don't want to synchronize our mail using an unencrypted connection. Thankfully, mbsync lets us use TLS. For this, we need a CA cert bundle. According to this post, macOS doesn't have one (and if it does, it's probably in the wrong format). There are probably other ways to get one, but I used the following method:

    $ wget https://curl.haxx.se/download/curl-7.57.0.tar.gz
    $ tar xvf curl-7.57.0.tar.gz
    $ cd curl-7.57.0/lib
    $ ./mk-ca-bundle.pl
    $ mv ca-bundle.crt /path/to/ca-bundle.crt
    

    Move the file wherever you like and make note of it; we'll tell mbsync where to look for it in its configuration file. mbsync looks for a config file at ~/.mbsyncrc. Our ~/.mbsyncrc file should look something like this:

    IMAPAccount <name>
    # Address to connect to
    Host <IMAP host>
    User <name>@example.com
    AuthMechs LOGIN
    PassCmd "gpg -q --for-your-eyes-only --no-tty -d ~/.passwd/<name>.gpg"
    SSLType IMAPS
    CertificateFile ~/ca-bundle.crt
    
    IMAPStore <name>-remote
    Account <name>
    
    MaildirStore <name>-local
    # The trailing "/" is important
    Path ~/.mail/<name>/
    Inbox ~/.mail/<name>/Inbox
    
    Channel <name>
    Master :<name>-remote:
    Slave :<name>-local:
    # Exclude everything under the internal [Gmail] folder, except the interesting folders
    Patterns * ![Gmail]* "[Gmail]/Sent Mail" "[Gmail]/Starred" "[Gmail]/All Mail"
    # Or include everything
    #Patterns *
    # Automatically create missing mailboxes, both locally and on the server
    Create Both
    # Save the synchronization state files in the relevant directory
    SyncState *
    

    Replace values as needed. Also, the settings under Channel were stolen from elsewhere and worked for me. If you need additional assistance, see this link. For whatever reason, I had to reboot to get pinentry-mac to behave correctly, so you may have to do the same. You should now run mbsync <name> and watch as mbsync synchronizes mail. When prompted for a password, don't forget to check "Save to Keychain".

  • notmuch

    notmuch doesn't take much (heh) work to get off the ground. Just run notmuch setup and follow the prompts to get everything configured. notmuch will create a config file found at ~/.notmuch-config. Open it and change the [new] settings to reflect the following:

    [new]
    tags=new
    

    We'll use a filtering process that uses the new tag in its search query to find mail that needs to be processed and retagged.

  • msmtp

    Create msmtp's config file at ~/.msmtprc and add the following:

    # Set default values for all following accounts.
    defaults
    auth           on
    tls            on
    tls_trust_file /Users/<name>/ca-bundle.crt
    logfile        ~/.msmtp.log
    
    # Account
    account        <name>
    host           <smtp_host>
    port           587
    from           <name>@example.com
    user           <name>
    passwordeval   echo $(gpg --quiet --for-your-eyes-only --no-tty --decrypt ~/.passwd/password.gpg)
    
    # Set a default account
    account default : <name>
    

    Test your configuration by running the following command:

    $ msmtp --account=<name> -Sd
    
  • Emacs stuff

    Add the following to your Emacs init file:

    ;; Mail stuff
    ;; tell emacs about the path to notmuch
    (setenv "PATH" (concat (getenv "PATH") ":/usr/local/bin"))
    (setq exec-path (append exec-path '("/usr/local/bin")))
    ;; tell emacs about notmuch
    (autoload 'notmuch "notmuch" "notmuch mail" t)
    ;; use msmtp
    (setq message-send-mail-function 'message-send-mail-with-sendmail)
    (setq sendmail-program "/usr/local/bin/msmtp")
    ;; tell msmtp to choose the SMTP server according to the from field in
    ;; the outgoing email
    (setq message-sendmail-extra-arguments '("--read-envelope-from"))
    (setq message-sendmail-f-is-evil 't)
    ;; end mail stuff
    

    This should be just enough to get things running. You can find more neat settings here.

  • Mail filtering

    As the name implies, notmuch doesn't do any more than its intended functions: indexing and tagging. Thankfully, there's an aptly named Python script for mail tagging/filtering called afew. Install it:

    $ python -m venv --system-site-packages .venv
    $ source .venv/bin/activate
    $ pip install afew
    

    The documentation recommends creating a symlink to .venv/bin/afew somewhere in $PATH:

    $ ln -snr .venv/bin/afew ~/.bin/afew
    

    For additional configuration of afew, it's a good idea to familiarize yourself with its documentation. It's pretty simple and provides the ability to match email headers based on regular expressions and all sorts of fun stuff.

  • Spam filtering

    This is pretty Gmail specific; if you can figure out how to get a collection of spam email, do that and then use the instructions below to train Bogofilter.

    To collect spam samples from Gmail, I used Google Takeout to export anything that has been automatically or manually marked as spam. Takeout provides an .mbox file for download. Use the following command to train Bogofilter using spam samples:

    $ bogofilter -s < spam.mbox
    

    And the following to train Bogofilter using ham samples:

    $ bogofilter -n < ham.mbox
    

    This should create a word list found at ~/.bogofilter/wordlist.db that Bogofilter will use for classifying email.

    Now on to the fun part!

Automation

As it stands, we can only make all of this work by running commands manually. To remedy that, we'll write some scripts and use macOS's launchd to automatically run them.

Spam filtering

We'll start with a spam filtering script. I found inspiration in the form of notspam, which I tried but couldn't get to work. The neat thing about it is that it allows the user to pick their spam classifying backend of choice. Considering that spam filtering plays a minor role in my filtering process (most spam is filtered by Gmail), I cut my losses with notspam and wrote my own script.

The script uses four modules; we'll use the import statement to use them:

import subprocess
import notmuch
import sys
import os

All but the notmuch module should be included with a standard Python installation. The notmuch module should have been installed earlier with homebrew (using the --with-python3 flag).

The notmuch module provides a number of facilities for interacting with notmuch. You can find its documentation here. We'll begin by initializing a notmuch database object called db and set the mode parameter to 1, indicating that the database should be opened in READ_WRITE mode:

db = notmuch.Database(mode=1)

Next, we create a variable for storing a query that we'll use to find new mail in the notmuch database. Database() contains a method called create_query() that we'll use for this purpose; create_query() takes an argument that looks like a normal notmuch query:

query = db.create_query('tag:new')

The next variable we'll create contains the path to our Bogofilter binary:

bogofilter = '/usr/local/bin/bogofilter'

Now that we've defined a few basic variables, we can start to write a function that uses Bogofilter to classify our mail. We'll call it isSpam(). First, we need to decide what isSpam() should do. When given a mail message as its input, we want isSpam() to return True if it's spam or False if it's not. We know that Bogofilter will be doing the classifying, so we need a way to run it with Python. The subprocess module gives us the ability to do just that.

The subprocess module's recommended way of invoking subprocesses is run(). run() takes several potential arguments, but we'll specify only two. The first is a list representing the command we might run on the command line:

[bogofilter, "-BT", path]

By reading Bogofilter's manual we can figure out that we'll probably need the two flags seen above:

The -B object … (bulk mode) option tells bogofilter to classify multiple objects named on the command line.

The -T provides an invariant terse mode for scripts to use. bogofilter will print an abbreviated spamicity message containing 1 letter and the score. Spam is indicated with "S", ham by "H", and unsure by "U".

The second argument to pass to run() deals with output redirection. As stated by the documentation:

[run()] does not capture stdout or stderr by default. To do so, pass PIPE for the stdout and/or stderr arguments.

With our two arguments accounted for, we call run() and store its return value in a variable called p:

p = subprocess.run([bogofilter, "-BT", path], stdout=subprocess.PIPE)

The documentation tells us that run() returns a CompletedProcess instance with a stdout attribute that holds the captured output. An important caveat to note is that the output returned is a byte sequence. This won't work for isSpam(), so we need to decode it:

output = p.stdout.decode('ascii')

This gives us the output we'd expect to see when running Bogofilter from the command line:

.mail/gmail/Inbox/new/1519157417.41288_17540.dhcp-82-214,U=28921:2, S 0.999457
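The run() and decode() steps above can be sketched as one small helper. The name capture_stdout is hypothetical and not part of the original script, and decoding as UTF-8 with errors="replace" is my assumption (the script decodes as ASCII); the idea is only that an unusual byte in a file path would not raise an exception. The demo uses the Python interpreter as a stand-in command so the sketch runs without Bogofilter installed.

```python
import subprocess
import sys


def capture_stdout(cmd):
    """Run cmd (a list of strings) and return its stdout as text.

    Hypothetical helper, not part of the original script.
    """
    # stdout=subprocess.PIPE asks run() to capture the output instead of
    # letting it go straight to the terminal.
    p = subprocess.run(cmd, stdout=subprocess.PIPE)
    # p.stdout is a bytes object; decode it before treating it as a string.
    # Assumption: UTF-8 with replacement, so odd bytes never raise.
    return p.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Stand-in command that prints a line shaped like Bogofilter's output.
    print(capture_stdout([sys.executable, "-c", "print('message S 0.99')"]))
```

In the real script the command list would be [bogofilter, "-BT", path], as shown earlier.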

We only care about one piece of the above output (for now): the S, H, or U near the end. If it's an S, isSpam() should return True; if it's an H or U, it should return False. Using an object we'll discuss later, we can use the following if statements to accomplish the desired goal:

if processed.mailType == 'U' or processed.mailType == 'H':
    return False
if processed.mailType == 'S':
    return True

Before we can do this, however, we need to create an object that we'll define later. We'll call it processed; processed will have two attributes: path, representing the file path for the mail it's classifying, and mailType, which stores the S, U, or H that Bogofilter returns. Knowing this, we can see that we need to break the output from Bogofilter into consumable pieces. We'll use split() to turn the output into a list, using single spaces as the delimiter:

output = output.split(" ")
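One thing to watch for: splitting on every space works for the sample path above, but a Maildir folder such as "[Gmail]/Sent Mail" has a space in its name, which would push the classification letter out of position 1. A minimal sketch of a more forgiving parser follows. The function name parse_terse_line is hypothetical, and it assumes each line has the "path letter score" layout shown in the sample output.

```python
def parse_terse_line(line):
    """Split one line of `bogofilter -BT` output into (path, mail_type, score).

    Hypothetical helper; assumes the 'path letter score' layout shown above.
    """
    # Split from the right, at most twice, so spaces inside the path survive.
    path, mail_type, score = line.rstrip("\n").rsplit(" ", 2)
    return path, mail_type, float(score)


if __name__ == "__main__":
    sample = (
        ".mail/gmail/Inbox/new/"
        "1519157417.41288_17540.dhcp-82-214,U=28921:2, S 0.999457\n"
    )
    print(parse_terse_line(sample))
```

Because the split starts from the right, the letter and the score are always the last two fields, whatever the path looks like.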

Next, we'll use a function (to be defined later) to initialize an instance of our processed object. The function takes two arguments: the file path as the first argument and the classification as its second:

processed = processOutput(output[0], output[1])

We'll now define the object used to hold the two attributes mentioned previously:

class pOutput(object):
    path = ""
    mailType = ""

Next, we need to define a function that initializes our object instance:

def processOutput(path, mailType):
    processed = pOutput()
    processed.path = path
    processed.mailType = mailType
    return processed

As an aside, I'm not sure that this is the idiomatically correct way of initializing an object, but this is how it exists on my computer and in my git repo. If we wanted to do this the right way, it would look like this:

class pOutput:
    def __init__(self, path, mailType):
        self.path = path
        self.mailType = mailType

Then, to initialize it, we'd use this instead:

processed = pOutput(output[0], output[1])

I found an excellent explanation for class initialization here.
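For a class whose only job is to hold two values, a third option is typing.NamedTuple from the standard library, which generates the initializer for you. This is a sketch of an alternative, not what the script uses; the names ProcessedOutput and mail_type are my own.

```python
from typing import NamedTuple


class ProcessedOutput(NamedTuple):
    """Holds the file path and the S/H/U letter for one classified message."""

    path: str
    mail_type: str


if __name__ == "__main__":
    # Fields are set positionally or by keyword; no __init__ is needed.
    processed = ProcessedOutput("Inbox/new/12345", "S")
    print(processed.path, processed.mail_type)
```

Instances are immutable, which is fine here because the values never change after Bogofilter has produced them.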

Now that we have isSpam() and processOutput() defined, we can write our main function for sifting through and tagging mail.

We know that we have a variable called query and that it contains a Query() object. Knowing this, we want to iterate over each message that the query returns and feed it to isSpam() for classification. The documentation for notmuch says that Query() objects contain a method called search_messages() that returns a Messages() object; the documentation says this:

This object provides an iterator over a list of notmuch messages

Using this, we can create a for loop that iterates over our Messages() object:

for msg in query.search_messages():
    for filepath in msg.get_filenames():

Now we can use our isSpam() function to do the tagging. If the message is spam, we want to remove the new tag and add spam; else, we want to remove new and add inbox. The Message() class contains two methods for tagging: add_tag() and remove_tag(). Each takes a tag as an argument.

# isSpam() runs Bogofilter, so call it once and branch on the result
if isSpam(filepath):
    msg.remove_tag('new')
    msg.add_tag('spam')
else:
    msg.remove_tag('new')
    msg.add_tag('inbox')
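Putting the loop and the tagging together, the whole step can be sketched as one function. This is an illustration under a few assumptions rather than the script itself: tag_new_mail and FakeMessage are hypothetical names, the classifier is passed in as an argument so the sketch runs without Bogofilter or a notmuch database, and a message is treated as spam if any of its files is classified as spam.

```python
def tag_new_mail(messages, is_spam):
    """Retag every message: 'new' is replaced by 'spam' or 'inbox'.

    messages: iterable of objects offering get_filenames(), add_tag(),
              and remove_tag(), as notmuch Message objects do.
    is_spam:  callable taking a file path and returning True or False.
    """
    for msg in messages:
        # Assumption: a message counts as spam if any of its files does.
        spam = any(is_spam(path) for path in msg.get_filenames())
        msg.remove_tag('new')
        msg.add_tag('spam' if spam else 'inbox')


class FakeMessage:
    """Stand-in for a notmuch message so the sketch runs without a database."""

    def __init__(self, filenames):
        self.filenames = filenames
        self.tags = {'new'}

    def get_filenames(self):
        return self.filenames

    def add_tag(self, tag):
        self.tags.add(tag)

    def remove_tag(self, tag):
        self.tags.discard(tag)


if __name__ == "__main__":
    mail = [FakeMessage(['buy-now.eml']), FakeMessage(['hello.eml'])]
    # Toy classifier for the demo; the real one would call Bogofilter.
    tag_new_mail(mail, is_spam=lambda path: path.startswith('buy'))
    print([sorted(m.tags) for m in mail])
```

With the real bindings, the first argument would be query.search_messages() and the second would be isSpam.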

See the full script here.

Notifications

It's nice to have notifications for new mail. Our next bash script should notify us if we've received new messages within the last x seconds. First, we need to figure out what facilities macOS makes available for generating notifications from the command line. Ideally, we'd like this to be native.

This Stack Exchange post provides an answer. We can test it to make sure it works. Run the following command to see it in action:

$ osascript -e 'display notification "Lorem ipsum dolor sit amet" with title "Title"'

That should do nicely for notification purposes. Now we need to get all of the information that would be helpful in a notification and figure out how to notify for new messages only.

Let's make a list of all of the stuff we might want to see in a notification:

  • Who it's from
  • The subject line
  • Unread message count

Using notmuch to query mail messages, we can extract all of this information with a general pattern:

$ notmuch search --format=text --output=files --limit=1 --sort=newest-first "tag:unread" | xargs <command>

This yields the newest unread message and pipes it to xargs to be used as the input to whatever command you need to use to process that input.

Let's start with From:. By feeding the message as input to cat, we can manipulate the text with grep and sed to produce only the From: address as output. Each message file contains mail headers, so a simple regular expression should work to find the From: header: ^From:. Such a command might produce output that looks like this:

From: alice@example.com

We can then pipe this to sed to exclude From: from the output:

$ notmuch search --format=text --output=files --limit=1 --sort=newest-first tag:unread | xargs cat | grep "^From:" | sed 's/From: //'

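We'll store the result in a variable. The version in my script uses slightly different sed expressions: the first deletes everything up to and including the opening angle bracket, and the second deletes the final character (the closing angle bracket), leaving only the address when the header has the form Name followed by the address in angle brackets: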

FROM=$(/usr/local/bin/notmuch search \
                              --format=text \
                              --output=files \
                              --limit="$LIMIT" \
                              --sort="$SORT" "$SEARCH" \
           | xargs cat | grep "^From:" | sed 's/.*<//' | sed s'/.$//')

You'll notice that we've used variables for certain values that notmuch takes as arguments. You can define them like so:

SEARCH="tag:unread"
LIMIT=1
SORT="newest-first"
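The grep and sed pipeline above matches any line that starts with From:, including lines in the message body, and it depends on how the header happens to be written. As a hedged alternative, here is a sketch that uses Python's standard email package to read the header properly. The function name from_address is hypothetical, and it assumes you pass in the raw bytes of one message file.

```python
from email import message_from_bytes
from email.utils import parseaddr


def from_address(raw_message):
    """Return the bare address from the From: header of a raw message (bytes)."""
    msg = message_from_bytes(raw_message)
    # parseaddr() splits 'Alice <alice@example.com>' into (name, address).
    _name, address = parseaddr(msg.get("From", ""))
    return address


if __name__ == "__main__":
    sample = (
        b"From: Alice Example <alice@example.com>\r\n"
        b"Subject: Hello\r\n"
        b"\r\n"
        b"From: this body line starts with From: but is not a header\r\n"
    )
    print(from_address(sample))
```

The same function handles both a bare address and the "Name" plus angle-bracket form, so no special-casing is needed.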

Next, the number of messages that match our search term. notmuch includes a command just for this called count:

UNREAD_COUNT=$(/usr/local/bin/notmuch count --output=messages "$SEARCH")

For the subject line, we'll use summary as the value for the --output flag. This produces one line that contains the subject. We'll pipe the output from notmuch to sed, which strips everything up to and including the first ';' and the space after it, then pipe that result to sed again to append a newline:

TXT_SUBS=$(/usr/local/bin/notmuch search \
                                  --format=text \
                                  --output=summary \
                                  --limit="$LIMIT" \
                                  --sort="$SORT" "$SEARCH" \
               | sed 's/^[^;]*; //' | sed 's/$/\n/')

With this info, we can now notify using osascript like so:

osascript -e 'display notification "'"$TXT_SUBS"'" with title "New mail from '"$FROM"'!" subtitle "You have '"$UNREAD_COUNT"' new messages." sound name "Frog"'

If you want to change the sound made when a notification occurs, you can find possible values in /System/Library/Sounds.
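One caveat with splicing shell variables into the AppleScript text: a subject line that contains a double quote would end the string early and break the command. A minimal sketch of building the same script with escaping is below. Both function names are hypothetical, and the sketch only builds the string; it does not call osascript.

```python
def applescript_string(text):
    """Quote text as an AppleScript string literal."""
    # Escape backslashes first, then double quotes.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def notification_script(subject, sender, unread_count):
    """Build the AppleScript passed to `osascript -e` (hypothetical helper)."""
    title = "New mail from " + sender + "!"
    subtitle = "You have %d new messages." % unread_count
    return (
        "display notification " + applescript_string(subject)
        + " with title " + applescript_string(title)
        + " subtitle " + applescript_string(subtitle)
        + ' sound name "Frog"'
    )


if __name__ == "__main__":
    print(notification_script('Re: "lunch" on Friday?', "alice@example.com", 3))
```

The resulting string could then be handed to osascript as a single argument, for example with subprocess.run(["osascript", "-e", script]), which also avoids a second round of shell quoting.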

With everything in place, we want to avoid notifying if the newest message has been seen before. I send notifications every 30 seconds, so I don't want to see one if it's notifying me about a file that existed before 30 seconds ago. We can do that by comparing the time 30 seconds ago to the birth time of the newest file in our search.

My understanding is that information for a file's birth time may not be available depending on the filesystem on which it exists. Thankfully, macOS has a command called GetFileInfo that returns the birth date of a file when using the -d flag. We'll store it in a variable:

LATEST=$(/usr/local/bin/notmuch search \
                                --format=text \
                                --output=files \
                                --limit="$LIMIT" \
                                --sort="$SORT" "$SEARCH" \
             | xargs GetFileInfo -d)

We'll need to compare this against the time 30 seconds ago, which we can do with date. Thankfully, the version of date included with macOS includes a flag (-v) for adjusting the date given an argument of some amount of time, plus or minus. We also want to format the output to resemble the output produced by GetFileInfo, so it ends up looking like this:

CURRENT_TIME=$(date -v -30S +%m/%d/%Y\ %H:%M:%S)

With this, we can create an if statement that only produces a notification if the file was created within the last 30 seconds:

if [[ $CURRENT_TIME < $LATEST ]]; then
  # Check /System/Library/Sounds for available sound name values
  osascript -e 'display notification "'"$TXT_SUBS"'" with title "New mail from '"$FROM"'!" subtitle "You have '"$UNREAD_COUNT"' new messages." sound name "Frog"'
fi
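Note that the [[ ... < ... ]] test compares the two timestamps as strings, character by character. With the month first, that can give the wrong answer around a year boundary: "12/31/2018 23:59:40" sorts after "01/01/2019 00:00:05" even though it is earlier. As a sketch of a comparison that is not affected, the snippet below parses the timestamp before comparing. The name is_recent is hypothetical, and it assumes the timestamp uses the same format as the date command above.

```python
from datetime import datetime, timedelta

# Assumption: timestamps use the format from the date command above.
STAMP_FORMAT = "%m/%d/%Y %H:%M:%S"


def is_recent(birth_stamp, now, window_seconds=30):
    """Return True if birth_stamp is less than window_seconds older than now."""
    birth = datetime.strptime(birth_stamp, STAMP_FORMAT)
    return now - birth < timedelta(seconds=window_seconds)


if __name__ == "__main__":
    now = datetime(2019, 1, 1, 0, 0, 10)
    # A file created 5 seconds ago, just after midnight on New Year's Day.
    print(is_recent("01/01/2019 00:00:05", now))
    # The equivalent string comparison gets the same case wrong.
    print("12/31/2018 23:59:40" < "01/01/2019 00:00:05")
```

Outside of that edge case the string comparison and the parsed comparison agree, so this matters mostly if you want the check to be correct all year round.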

You can see the whole script here.

Mail sync script

This script isn't really worth going over. It essentially just runs mbsync, notmuch, afew, and the above scripts, with or without logging output (the logging output is horribly ugly and needs to be better). You can see it here.

Using launchd to run mail-sync.sh

Now we can automate mail synchronization and tagging! We'll use launchd to daemonize our script. Thorough documentation for launchd can be found here, so I'll keep it as straightforward as possible here.

We need to create a file in /Library/LaunchAgents and load it with launchctl. We'll call it com.example.mail.plist. It should contain the following:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>UserName</key>
    <string>user</string>
    <key>EnvironmentVariables</key>
    <dict>
        <key>HOME</key>
        <string>/Users/user/</string>
    </dict>
    <key>Label</key>
    <string>com.example.mail</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/user/.scripts/mail-sync.sh</string>
    </array>
    <key>StartInterval</key>
    <integer>30</integer>
</dict>
</plist>

Change values as necessary. Take note that <key>ProgramArguments</key> points to the location of mail-sync.sh. This is important.
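If you would rather not edit the XML by hand, Python's standard plistlib module can generate an equivalent file from a dictionary. This is only a sketch: the function name launch_agent_plist is hypothetical, and the values in the demo are the same placeholders used in the plist above.

```python
import plistlib


def launch_agent_plist(label, script_path, user, home, interval=30):
    """Return the launchd job definition as XML plist bytes."""
    job = {
        "Label": label,
        "UserName": user,
        "EnvironmentVariables": {"HOME": home},
        # launchd runs the first item as the program; further items are its arguments.
        "ProgramArguments": [script_path],
        # Run the job every `interval` seconds.
        "StartInterval": interval,
    }
    # plistlib.dumps() writes the XML format by default.
    return plistlib.dumps(job)


if __name__ == "__main__":
    xml = launch_agent_plist(
        label="com.example.mail",
        script_path="/Users/user/.scripts/mail-sync.sh",
        user="user",
        home="/Users/user/",
    )
    print(xml.decode("utf-8"))
```

The printed output can be saved as com.example.mail.plist. plistlib sorts the keys alphabetically by default, which launchd does not care about.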

With the launchd agent file created, load it with launchctl:

$ launchctl load -w /Library/LaunchAgents/com.example.mail.plist

That's it!

You can check the status of the process with the following command:

$ launchctl list | grep com.example.mail

If it's running successfully, you should see 0 in the middle column.