Automatic suppression of non-duplicate messages

Jani Nikula jani at nikula.org
Sun Nov 4 14:34:40 PST 2012


On Sat, 03 Nov 2012, David Bremner <david at tethera.net> wrote:
> Eirik Byrkjeflot Anonsen <eirik at eirikba.org> writes:
>
>> That's not what I see.  If I search for a term that only appears in
>> one of the "copies", none of the copies are included in the search
>> result.
>
> The offending code is at line 1813 of lib/database.cc; the message is
> only indexed if the message-id is new.
>
> It might be sensible to move _notmuch_message_index_file into the other
> branch of the if, but even if that works fine, something more
> sophisticated is needed for the call to
> __notmuch_message_set_header_values; the invariant that each message has
> a single subject seems reasonable.
>
> Offhand I'm not sure of a good method of automatically deciding what is
> the same message (with e.g. headers and footer text added by a mailing
> list).

Assuming there was a good method, what would you do with two different
messages that have the same message id? That is the unique id we use to
identify messages (which should be fine per RFC 5322 and its
predecessors; we're talking about messages from broken systems here).

It might be helpful to have a configuration option similar to new.tags
that would define the tags to be assigned to messages with duplicate
message ids. (This could be done in the
NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID case near line 516 of
notmuch-new.c.) This could be used to assign a "dupe" tag, for example,
so the user could do whatever they want in the post-new hook or the user
interface. A sufficiently clever post-new hook could compare the files
of a message, and drop the tag or add another, as the case may
be. Surely not a perfect solution, but it keeps the implementation simple.
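
To make the post-new hook idea a bit more concrete, here is a rough sketch
of the comparison step only. It is an assumption-laden illustration, not
anything that exists in notmuch: the function names body_digest and
same_content are made up, and the "dupe" tag itself is still hypothetical.
The sketch ignores all headers and compares a hash of the
whitespace-normalised text parts, so copies that differ only in Received
or Subject headers count as the same message, while a copy with a mailing
list footer appended would still count as different.

```python
import email
import hashlib
from email import policy


def body_digest(raw_bytes):
    """Hash the text parts of one message file, ignoring every header."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    digest = hashlib.sha256()
    for part in msg.walk():
        # Skip multipart containers and non-text attachments.
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        # Collapse runs of whitespace so re-wrapping or CRLF vs LF
        # differences do not make two copies look different.
        digest.update(b" ".join(payload.split()))
    return digest.hexdigest()


def same_content(raw_copies):
    """True if all copies (an iterable of bytes) have the same text body."""
    return len({body_digest(raw) for raw in raw_copies}) <= 1
```

A hook could presumably feed this the files listed by
"notmuch search --output=files" for each message carrying the tag, and
then drop the tag when same_content() returns True, or add another tag
when it does not.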


BR,
Jani.

