converting attachments to text

Brian Sniffen bts at evenmere.org
Tue Jan 3 09:23:34 PST 2017


Sure!  Here's what I use for docx, and I think it could be adapted to
pdf with pdftotext or whatever you're already using there.  You need a
small shell script that reads from STDIN, writes to a file, and calls
pandoc or pdftotext or whatever, like ~/bin/antiwordx:

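    #!/bin/sh
    # Read a .docx from stdin, spool it to a temp file, print Markdown.

    tmpfile=$(mktemp /tmp/antiwordx.XXXXXX.docx) || exit 1
    # Remove the temp file on exit or on a signal.
    trap 'rm -f -- "$tmpfile"' INT TERM HUP EXIT
    cat > "$tmpfile"
    # --normalize is a pandoc 1.x option; pandoc 2.0 and later dropped it.
    pandoc --normalize -r docx -w markdown "$tmpfile"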
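
For the pdf case, the same spool-then-convert pattern might look like
the sketch below.  The names `spool_and_run` and `pdf_to_text` are
made up for illustration, and it assumes Poppler's `pdftotext`, where
`-` as the output name means standard output.  It cleans up explicitly
rather than with a trap, so an interrupted run can leave the temp file
behind.

```shell
#!/bin/sh
# Hypothetical helper: save stdin to a temp file, run the given command
# with that file as its last argument, then delete the file.
spool_and_run() {
    spool=$(mktemp "${TMPDIR:-/tmp}/spool.XXXXXX") || return 1
    cat > "$spool"
    "$@" "$spool"
    rc=$?
    rm -f -- "$spool"
    return "$rc"
}

# Assumed invocation: pdftotext takes the input file, then the output
# name; "-" sends the text to stdout.  -layout keeps the page layout.
pdf_to_text() {
    pdftotext -layout "$1" -
}

# Usage, e.g. as the body of a ~/bin/pdftotextx script:
#   spool_and_run pdf_to_text
```

Any converter that takes a filename can be passed the same way, for
example `spool_and_run antiword` for old-style .doc files.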

You also need a small handler function to call it from Elisp---see the
attached file `inline-docx.el`, which assumes you have both `antiword`
for old-style `.doc` files and pandoc for new-style `.docx` files.
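
The attachment is not reproduced in this archive, so the following is
not that Elisp code.  It is only a shell sketch of the same dispatch
idea: pick antiword for .doc and pandoc for .docx.  The function names
and the choice to dispatch on the filename extension are assumptions
made for illustration.

```shell
#!/bin/sh
# Hypothetical dispatcher: print the converter to use for a filename,
# or return non-zero if the extension is not recognized.
converter_for() {
    case "$1" in
        *.[Dd][Oo][Cc][Xx]) echo pandoc ;;
        *.[Dd][Oo][Cc])     echo antiword ;;
        *.[Pp][Dd][Ff])     echo pdftotext ;;
        *)                  return 1 ;;
    esac
}

# Convert the file at $1 to text on stdout.  $2 is the attachment's
# original name, used only to pick the converter (defaults to $1).
convert_attachment() {
    file=$1
    name=${2:-$1}
    case "$(converter_for "$name")" in
        pandoc)    pandoc -r docx -w markdown "$file" ;;
        antiword)  antiword "$file" ;;
        pdftotext) pdftotext "$file" - ;;
        *)         echo "no converter for $name" >&2; return 1 ;;
    esac
}
```

Passing the original name separately matters when the file on disk is
a temp file whose name no longer carries the attachment's extension.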

I apologize for the roughness of the code; it should probably use
customizable paths for pandoc and such.
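
On the shell side, one way to make the paths customizable is to read
them from environment variables with defaults.  This is a sketch; the
variable names `PANDOC` and `ANTIWORD` and the function name are
assumptions, not something the attached code defines.

```shell
#!/bin/sh
# Use the caller's PANDOC / ANTIWORD if set, else fall back to $PATH.
: "${PANDOC:=pandoc}"
: "${ANTIWORD:=antiword}"

# Hypothetical wrapper: convert the .docx file at $1 to Markdown.
docx_to_markdown() {
    "$PANDOC" -r docx -w markdown "$1"
}
```

A user with pandoc in an unusual place could then run something like
`PANDOC=/opt/pandoc/bin/pandoc antiwordx < file.docx`.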

-Brian

-------------- next part --------------
A non-text attachment was scrubbed...
Name: inline-docx.el
Type: application/emacs-lisp
Size: 3362 bytes
Desc: not available
URL: <http://notmuchmail.org/pipermail/notmuch/attachments/20170103/33adefe9/attachment-0001.bin>
-------------- next part --------------


Bart Bunting <bart.bunting at ursys.com.au> writes:

> Hi,
>
> Just looking for some pointers.
>
> I have to deal with quite a few emails with attachments in either pdf or
> word format.
>
> I'm on a Mac, so I can use AppleScript, pdftotext, or something similar
> to convert them to text.
>
> I'm blind so use emacspeak as my primary interface.  Having an easy way
> to convert the notmuch attachments to text other than saving to a file
> and processing them would greatly speed up my workflow.
>
> Is there something in existence already to do this sort of thing?
>
> I have a little rudimentary lisp skill so can hack something up if
> someone can give me some pointers on a direction to head in.
>
> Any advice appreciated.
>
> Kind regards
>
> Bart
>
> -- 
>
> Bart Bunting - URSYS
> PH: 02 87452811
> Mbl: 0409560005

