thread merge/split proposal

Daniel Kahn Gillmor dkg at fifthhorseman.net
Mon Apr 4 11:23:43 PDT 2016


On Mon 2016-04-04 14:14:27 -0300, Daniel Kahn Gillmor wrote:
>   b) when an unjoin is requested, do a graph analysis of every message in
>      the thread's In-Reply-To and References headers, and recreate
>      distinct threads from the connected components.
 [...]
>  From the CLI, it would look something like:
>
>    notmuch join-threads THREAD_A THREAD_B [ THREAD_C ... ]
>    notmuch split-thread THREAD_X

On IRC, bremner pointed out two specific improvements to this proposal:

 0) the inverse operation of "join" proposed above is distinct from the
    ongoing discussion about splitting threads in arbitrary places.  I
    don't want to conflate these issues, so my proposed
    connected-component-analysis operation should be "notmuch
    unjoin-thread", and not "notmuch split-thread"

 1) a "join" operation probably has to be stored explicitly in the
    database, so that the threads will be re-joined across a
    dump/restore operation.

I'm happy with both of these improvements.
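
To make (0) concrete, here is a minimal sketch of the connected-component
analysis, assuming the messages of the thread have already been numbered
0..n-1 and each In-Reply-To or References header has been resolved to a
pair of those numbers. It uses a plain union-find. The names (components_t
and friends) are hypothetical, and nothing here is libnotmuch API. Links
that exist only because of an explicit join would simply not be fed in,
so the previously joined threads come apart again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not part of libnotmuch. */
#define MAX_MESSAGES 64

/* parent[i] is the union-find parent of message i. */
typedef struct {
    size_t parent[MAX_MESSAGES];
    size_t count;
} components_t;

/* Start with every message in its own component. */
static void
components_init (components_t *c, size_t count)
{
    assert (count <= MAX_MESSAGES);
    c->count = count;
    for (size_t i = 0; i < count; i++)
        c->parent[i] = i;
}

/* Find the representative of i's component, with path halving. */
static size_t
components_find (components_t *c, size_t i)
{
    assert (i < c->count);
    while (c->parent[i] != i) {
        c->parent[i] = c->parent[c->parent[i]];
        i = c->parent[i];
    }
    return i;
}

/* Record one In-Reply-To or References link between messages a and b. */
static void
components_link (components_t *c, size_t a, size_t b)
{
    size_t ra = components_find (c, a);
    size_t rb = components_find (c, b);

    if (ra != rb)
        c->parent[rb] = ra;
}

/* Count the components; each one would become its own thread. */
static size_t
components_count (components_t *c)
{
    size_t n = 0;

    for (size_t i = 0; i < c->count; i++)
        if (components_find (c, i) == i)
            n++;
    return n;
}
```

With A=0, B=1, C=2, D=3 and only the header links B->A and D->C fed in,
components_count() reports two components, i.e. two threads.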

For (1), I'd propose that the join operation be implemented by adding a
new term type "join", which can be applied to any document. Its value
is the message-id of the message that this one *should* have named in
its In-Reply-To header but didn't.

So for example: messages A and B are in one thread; messages C and D
arrive in a separate thread that should have been joined to the prior
thread but was not.

I propose implementing this as something like:

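    /* record the missing reference on C, so the join survives dump/restore */
    notmuch_message_add_term(message_c, "join", get_message_id(message_a));

    /* move every message in C's old thread into A's thread */
    notmuch_message_set_thread_id(message_c, get_thread_id(message_a));
    notmuch_message_set_thread_id(message_d, get_thread_id(message_a));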
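
The same join, generalized to "every message in C's thread", might look
like the following toy model. It is only a sketch: toy_message_t and
toy_join are hypothetical stand-ins for database documents and are not
libnotmuch API, and thread ids are plain strings here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy in-memory stand-in for a message document (hypothetical). */
typedef struct {
    const char *message_id;
    const char *thread_id;
    const char *join;   /* message-id this message should have replied to, or NULL */
} toy_message_t;

/* Join the thread containing `child` onto the thread containing `parent`:
 * record the missing reference on `child` only, then move every message
 * that shared child's old thread into parent's thread. */
static void
toy_join (toy_message_t *msgs, size_t n,
          toy_message_t *child, const toy_message_t *parent)
{
    /* Remember the old thread id before the loop overwrites it. */
    const char *old_thread = child->thread_id;

    child->join = parent->message_id;
    for (size_t i = 0; i < n; i++)
        if (strcmp (msgs[i].thread_id, old_thread) == 0)
            msgs[i].thread_id = parent->thread_id;
}
```

Only C carries the "join" term; D follows along because it shared C's
old thread id.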

I'd also add all the "join" terms to "notmuch dump", though I'm not sure
exactly how to extend the "notmuch dump" format.

Feedback welcome,

        --dkg

