[RFC2 Patch 5/5] lib: iterator API for message properties

Daniel Kahn Gillmor dkg at fifthhorseman.net
Sat Jun 4 09:23:23 PDT 2016


On Fri 2016-06-03 19:12:52 -0400, David Bremner wrote:
>>> [ dkg wrote: ]
>>>>  * for messages which have multiple files, which file is actually indexed
>>>
>>> yes. Although rather than storing that, I think the right answer is more
>>> like "all of them".
>>
>> I don't think we do this, do we?  Is this a bug?  is it tracked somewhere?
>
> IMHO it is a bug. It's implicit in
>
>    id:87k42vrqve.fsf@pip.fifthhorseman.net
>
> and the various requests for List-Id indexing, but it's probably worth
> starting a separate thread to track it.  Especially since there are some
> unresolved design issues. Like what to return for searches.

I'm happy to use that original thread from 2012 as the tracking thread
if you think that's a reasonable starting point.  Peter Wang's "Malice
has nothing on incompetence" message in that thread is a good reminder
about other issues there.

>> the thread-id is *not* reproducible from the
>> message sets.  This is not only because of the ghost message leakage bug
>> documented in T590-thread-breakage.sh, but also because threads can be
>> joined by a message that is later removed (e.g., the "notmuch-join"
>> script in id:87egabu5ta.fsf@alice.fifthhorseman.net ).
>
> I see, I guess that's the intended behaviour given 604d1e0977c.
>
> I haven't thought about the pros and cons of dumping/restoring
> thread-ids. At least my database has about half as many threads as
> messages, so it's a bit of data, but perhaps that's not a big problem.
> It's somewhat orthogonal to this series since those terms are already
> attached to messages.

I agree that thread restoration is orthogonal to the per-message
properties.  I should also be clear that I don't mean to suggest that we
should dump the literal thread-id.  That'd be terrible, because you
wouldn't be able to merge two databases.
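
To make the reproducibility point concrete, here is a toy Python model of
threading.  It is only a sketch and not notmuch's actual code: ToyIndex,
its methods, and the message ids "a", "b" and "join" are all made up for
illustration.  It assumes that threads are merged when a newly indexed
message references messages in several threads, and that removing a
message never splits a thread again.

```python
# Toy model of thread grouping (hypothetical; NOT notmuch's implementation).
# Assumption: indexing a message that references several existing threads
# merges them, and removing a message leaves the merged thread intact.


class ToyIndex:
    def __init__(self):
        self.thread_of = {}    # message-id -> thread label
        self.next_thread = 0   # counter used to allocate new thread labels

    def add(self, msg_id, references=()):
        # Threads of already-indexed messages that this message refers to.
        linked = {self.thread_of[r] for r in references if r in self.thread_of}
        if linked:
            target = min(linked)
            # Merge every linked thread into the target thread.
            for other, thread in list(self.thread_of.items()):
                if thread in linked:
                    self.thread_of[other] = target
        else:
            target = self.next_thread
            self.next_thread += 1
        self.thread_of[msg_id] = target

    def remove(self, msg_id):
        # Dropping a message does not undo any merge it caused.
        del self.thread_of[msg_id]

    def threads(self):
        groups = {}
        for msg_id, thread in self.thread_of.items():
            groups.setdefault(thread, set()).add(msg_id)
        return sorted(sorted(group) for group in groups.values())


def history_with_join():
    # Index two unrelated messages, join them, then delete the joiner.
    idx = ToyIndex()
    idx.add("a")
    idx.add("b")
    idx.add("join", references=("a", "b"))
    idx.remove("join")
    return idx.threads()


def fresh_reindex():
    # Rebuild from the same surviving messages, without the joiner.
    idx = ToyIndex()
    idx.add("a")
    idx.add("b")
    return idx.threads()
```

Both functions end with the same two messages, but in this model
history_with_join() leaves them in one thread while fresh_reindex()
leaves them in two.  The grouping depends on history, not only on the
messages currently present.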

I'm happy to move the thread-id discussion to a separate topic; I just
want to make sure that people are aware that our current documentation
for notmuch-dump (below) overstates the case:

------
       These tags are the only data in the  notmuch  database  that  can't  be
       recreated  from  the messages themselves. The output of notmuch dump is
       therefore the only critical thing to backup (and much more friendly  to
       incremental backup than the native database files.)
------

OK, back to the discussion of per-message properties: I think we should
go ahead with this work, now that I understand the scoping/framing of it
better!

      --dkg