extract attachments from multiple mails

Jameson Graef Rollins jrollins at finestructure.net
Mon Jun 25 10:14:53 PDT 2012


On Mon, Jun 25 2012, David Belohrad <david at belohrad.ch> wrote:
> Can someone give me some advice? I have many emails containing
> attachments. These are typically the output of a copy machine, which
> fragments a scan into multiple attachments.
>
> I'd like to extract those attached files in one batch into a specific
> directory. Is there any way to fetch those files programmatically?

notmuch show has a --part option for outputting a single part from a
MIME message.  Unfortunately there's currently no clean way to determine
the number of parts in a message.  But sort of hackily, you could do
something like:

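for id in $(notmuch search --output=messages tag:files-to-extract); do
    for part in $(seq 1 10); do
        notmuch show --part="$part" --format=raw "$id" > "$id.$part"
        # part numbers that don't exist leave an empty file behind; remove it
        [ -s "$id.$part" ] || rm -f "$id.$part"
    done
done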

That will also save any multipart parts, which aren't really that
useful, so you'll have to sort through them.
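
To get a feel for how many parts a message has, and which of them are
just multipart containers, Python's standard email module can walk a
message and list what it finds. The following is only a sketch:
list_parts and build_sample are made-up names, the sample message is
built in memory rather than read from a mail store, and the index is
simply the position in email's depth-first walk. Whether that index
lines up with the numbers notmuch show --part expects is an assumption
I have not verified.

```python
from email.message import EmailMessage


def list_parts(msg):
    """Return (index, content_type, filename) for every part in walk order.

    The index is the position in email's depth-first walk; it is an
    assumption that this matches notmuch's --part numbering.
    """
    parts = []
    for index, part in enumerate(msg.walk()):
        parts.append((index, part.get_content_type(), part.get_filename()))
    return parts


def build_sample():
    """Build a small message with one text body and one attachment."""
    msg = EmailMessage()
    msg['Subject'] = 'scan'
    msg.set_content('see attached')
    msg.add_attachment(b'%PDF-fake', maintype='application',
                       subtype='pdf', filename='scan-1.pdf')
    return msg


if __name__ == '__main__':
    for index, ctype, filename in list_parts(build_sample()):
        # multipart/* parts only hold other parts and carry no file data
        kind = 'container' if ctype.startswith('multipart/') else 'leaf'
        print(index, ctype, filename, kind)
```

For a real message you would parse the file on disk (for example with
email.message_from_binary_file) and pass the result to list_parts
instead of the in-memory sample.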

You can make something much cleaner with python, using the notmuch
python bindings and the email module from the standard library:

http://packages.python.org/notmuch/
http://docs.python.org/library/email-examples.html

I hacked up something simple below that will extract parts from messages
matching a search term into the current directory (tested).

hth.

jamie.


#!/usr/bin/env python

import subprocess
import sys
import os
import notmuch
import email
import errno
import mimetypes

dbpath = subprocess.check_output(['notmuch', 'config', 'get', 'database.path']).strip()
db = notmuch.Database(dbpath)
query = notmuch.Query(db, sys.argv[1])
for msg in query.search_messages():
    with open(msg.get_filename(), 'r') as f:
        msg = email.message_from_file(f)
    counter = 1
    for part in msg.walk():
        # multipart/* parts are just containers
        if part.get_content_maintype() == 'multipart':
            continue
        filename = part.get_filename()
        if not filename:
            # no name given: make one up from the MIME type
            ext = mimetypes.guess_extension(part.get_content_type())
            if not ext:
                ext = '.bin'
            filename = 'part-%03d%s' % (counter, ext)
        counter += 1
        print(filename)
        with open(filename, 'wb') as f:
            f.write(part.get_payload(decode=True))
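
One thing to watch when running this over many messages: generated
names like part-001.pdf repeat from one message to the next, and an
attachment's own file name is used as-is. Below is a hedged sketch of
one way to handle both, assuming Python 3 and using only the standard
library. The helper names part_filename and extract_parts are
hypothetical, and the demo message is built in memory instead of coming
from a notmuch query.

```python
import mimetypes
import os
import tempfile
from email.message import EmailMessage


def part_filename(part, counter):
    """Pick a safe file name for a part, falling back to part-NNN.ext."""
    filename = part.get_filename()
    if filename:
        # Keep only the last path component so a name such as
        # "../../etc/passwd" cannot escape the output directory.
        filename = os.path.basename(filename.replace('\\', '/'))
    if not filename or filename in ('.', '..'):
        ext = mimetypes.guess_extension(part.get_content_type()) or '.bin'
        filename = 'part-%03d%s' % (counter, ext)
    return filename


def extract_parts(msg, outdir):
    """Write every leaf part of msg into outdir and return the paths."""
    os.makedirs(outdir, exist_ok=True)
    written = []
    # Containers (multipart/*, message/rfc822) hold no file data themselves.
    leaves = [p for p in msg.walk() if not p.is_multipart()]
    for counter, part in enumerate(leaves, start=1):
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        path = os.path.join(outdir, part_filename(part, counter))
        with open(path, 'wb') as f:
            f.write(payload)
        written.append(path)
    return written


if __name__ == '__main__':
    demo = EmailMessage()
    demo.set_content('see attached')
    demo.add_attachment(b'%PDF-fake', maintype='application',
                        subtype='pdf', filename='scan-1.pdf')
    for path in extract_parts(demo, tempfile.mkdtemp()):
        print(os.path.basename(path))
```

Calling extract_parts with a different directory for each message (a
running number would do) keeps the generated names from overwriting
each other. Two attachments with the same name inside one message
would still collide.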