thread merge/split proposal

Daniel Kahn Gillmor dkg at fifthhorseman.net
Mon Apr 11 15:41:21 PDT 2016


On Sun 2016-04-10 09:16:40 -0400, David Bremner wrote:
> Daniel Kahn Gillmor <dkg at fifthhorseman.net> writes:
>
>> for (1) i'd propose that the join operation would be implemented by
>> adding a new term type "join", which can be applied to any document.
>> Its value is the message-id of a message that *should* be "in-reply-to"
>> but wasn't.
>
> Having "split" terms or equivalently "signed" +-reference terms would
> allow more general thread splitting, effectively updating (via a little
> journal of additions and deletions) the references data stored in mail
> file.

I'm not sure what you mean by "signed" here (cryptographically signed?
a term named "signed"?  the idea that the term could be either positive
or negative?), but i think your proposal is that we could have a
"reference" term with a value of "+foo@example.com" or
"-foo@example.com", instead of having a "join" term with value
"foo@example.com" and a "split" term with value "foo@example.com".

I'm not sure i see much of a difference between

 a) introduce two new term types, "join" and "split", with unsigned
    values

and

 b) introduce one new term type, "reference" with signed values

> The implementation cost could not be that much higher than only
> join/unjoin; a bit more work managing the terms attached to a document
> to avoid contradictions.

right -- and we'd need an understanding of the order in which these
terms are applied if multiple possibly-conflicting terms are present.
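
As an illustration of one possible ordering rule (an assumption, not something agreed in this thread), the sketch below treats the terms as a journal and lets the last entry for each message-id win. The function name resolve_refs and the "+id" / "-id" line format are hypothetical.

```shell
#!/bin/bash
# resolve_refs: read signed entries ("+ID" or "-ID"), one per line, in
# journal order from stdin.  Assumes "last entry wins" for each ID and
# prints, sorted, the IDs whose final entry is a "+".
resolve_refs() {
    awk 'length($0) > 1 { state[substr($0, 2)] = substr($0, 1, 1) }
         END { for (id in state) if (state[id] == "+") print id }' | sort
}
```

For example, feeding it "+a@example.com" followed by "-a@example.com" leaves nothing joined, while the reverse order leaves a@example.com joined. Other rules (say, "split always beats join") are equally plausible; the point is only that some rule has to be written down.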

> Both versions probably complicate some peoples syncing solutions.

both (a) and (b) complicate syncing solutions, but my original proposal
of:

 c) just introduce a new term type "join" with unsigned value

is easy to sync, i think; i was going for the low-hanging fruit, and
trying to not let it get caught up on the more-fully-featured
arbitrary-split use case, though i understand the appeal of the generic
approach.

fwiw, i can do a really nasty workaround today to implement "join"
between two messages:

#### notmuch-join:
--------------
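#!/bin/bash
# usage: notmuch-join MESSAGE-ID-1 MESSAGE-ID-2  (ids without angle brackets)

# bail out unless the given message-id is already known to notmuch
verify_exists() {
    if ! notmuch search --output=files id:"$1" | grep -q . ; then
        printf "message-id %s is not in your messages\n" "$1" >&2
        exit 1
    fi
}

verify_exists "$1"
verify_exists "$2"

# put a throwaway message inside the mail store so "notmuch new" indexes it
jdir=$(notmuch config get database.path)/join
mkdir -p "$jdir"
z=$(mktemp "$jdir/join.XXXXXX") || exit 1

# the fake message references both ids, which ties their threads together
cat >"$z" <<EOF
From: test@example.org
Date: $(date -R)
Message-Id: <$(uuidgen)@join.example.org>
References: <$1> <$2>
Subject: join

test
EOF
notmuch new   # index the fake message
rm "$z"
notmuch new   # drop the fake message from the index again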
--------------
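
One way to check whether the workaround took effect is to compare the thread ids that notmuch reports for the two messages. This is a minimal sketch; the helper name same_thread is made up here, and it assumes message-ids are passed without angle brackets, as in the script above.

```shell
#!/bin/bash
# same_thread ID1 ID2: succeed if notmuch reports the same (non-empty)
# thread id for both message-ids.
same_thread() {
    local t1 t2
    t1=$(notmuch search --output=threads id:"$1")
    t2=$(notmuch search --output=threads id:"$2")
    [ -n "$t1" ] && [ "$t1" = "$t2" ]
}
```

After running notmuch-join on two message-ids, calling same_thread with the same two ids should succeed if the threads were merged.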

And i note that this change is also not synced across dump/restore.

So adding an explicit "join" document term (and figuring out how to
represent it in "notmuch dump" and "notmuch restore") would be a strict
improvement over the current situation, right?

        --dkg
