query on a subset of messages ?

Sebastien Binet binet at cern.ch
Mon Jul 9 10:06:37 PDT 2012


Austin Clements <amdragon at MIT.EDU> writes:

> Quoth Sebastien Binet on Jul 09 at 10:25 am:
>> 
>> hi there,
>> 
>> I was trying to reduce the I/O stress during my usual email
>> fetching+tagging by writing a little program using the go bindings to
>> notmuch.
>> 
>> ie:
>> db, status := notmuch.OpenDatabase(db_path,
>>     notmuch.DATABASE_MODE_READ_WRITE)
>> query := db.CreateQuery("(tag:new AND tag:inbox)")
>> msgs := query.SearchMessages()
>> for _,msg := range msgs {
>>   tag_msg(msg, tagqueries)
>> }
>> 
>> 
>> where tagqueries is a subquery of the form:
>> [
>>     {
>>         "Cmd": "+to-me",
>>         "Query": "(to:sebastien.binet at cern.ch and not tag:to-me)"
>>     },
>>     {
>>         "Cmd": "+sci-notmuch",
>>         "Query": "from:notmuch at notmuchmail.org or to:notmuch at notmuchmail.org or subject:notmuch"
>>     }
>> ]
>> 
>> 
>> the idea being that I need to crawl through the db only once and
>> then iteratively apply tags on those messages (instead of repeatedly
>> running "notmuch tag ..." for each and every one of those many
>> 'tag-queries')
>> 
>> I couldn't find any C-API to do such a thing using the notmuch library.
>> did I overlook something ?
>> 
>> Is it something useful to add ?
>> 
>> -s
>
> Have you tried a more direct translation of the multiple notmuch tag
> commands into Go, where you don't worry about subsetting the queries?
> Unless you're tagging a huge number of messages, the cost of notmuch
> tag is almost certainly the fsync that it does when it closes the
> database (which every call to notmuch tag must do).  However, in Go,
> you can keep the database open across all of the tagging operations
> and then close and fsync it just once.

nope, I haven't tried that, but will do.
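
to make sure I understand the suggestion, here is a rough sketch of the
"open once, tag everything, close once" structure. The `store` interface and
the in-memory `memStore` are hypothetical stand-ins that I made up for
illustration; they are not the API of the Go bindings. Only the shape of
`applyAll` matters here.

```go
package main

import "fmt"

// tagQuery mirrors one entry of the JSON list from my first mail.
type tagQuery struct {
	Cmd   string // e.g. "+to-me" or "-inbox"
	Query string
}

// store is a hypothetical stand-in for an open database handle.
// It is NOT the interface exposed by the notmuch Go bindings.
type store interface {
	Search(query string) []string // ids of the messages matching query
	AddTag(id, tag string)
	RemoveTag(id, tag string)
	Close() // assumed to be where the expensive flush happens
}

// applyAll runs every tag query against one open store and closes it
// exactly once, after all tagging operations are done.
func applyAll(db store, tqs []tagQuery) {
	defer db.Close()
	for _, tq := range tqs {
		if len(tq.Cmd) < 2 {
			continue // nothing usable in this command
		}
		for _, id := range db.Search(tq.Query) {
			switch tq.Cmd[0] {
			case '+':
				db.AddTag(id, tq.Cmd[1:])
			case '-':
				db.RemoveTag(id, tq.Cmd[1:])
			}
		}
	}
}

// memStore is a toy in-memory store, used only to exercise applyAll.
type memStore struct {
	matches map[string][]string        // query -> message ids
	tags    map[string]map[string]bool // message id -> set of tags
	closes  int                        // number of Close calls
}

func (m *memStore) Search(q string) []string { return m.matches[q] }

func (m *memStore) AddTag(id, tag string) {
	if m.tags[id] == nil {
		m.tags[id] = map[string]bool{}
	}
	m.tags[id][tag] = true
}

func (m *memStore) RemoveTag(id, tag string) { delete(m.tags[id], tag) }

func (m *memStore) Close() { m.closes++ }

func main() {
	db := &memStore{
		matches: map[string][]string{"subject:notmuch": {"id1", "id2"}},
		tags:    map[string]map[string]bool{},
	}
	applyAll(db, []tagQuery{{"+sci-notmuch", "subject:notmuch"}})
	fmt.Println("tagged messages:", len(db.tags), "close calls:", db.closes)
}
```

with the real bindings, the loop body would go through the query and message
objects instead, but the single close at the end is the point.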

>
> Note that there is an important optimization in notmuch tag that you
> might have to replicate.  It manipulates the original query to exclude
> messages that already have the desired tags, so that they get skipped
> very efficiently at the earliest stage possible.

I already have this in my original shell script.
(it wouldn't be too hard to do automatically, though.)
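
something along these lines could do the rewrite automatically. It is only a
sketch: `narrowQuery` is a name I made up, it assumes each "Cmd" is a
space-separated list of "+tag" / "-tag" operations, and it does not quote
tags that contain spaces or other special characters.

```go
package main

import (
	"fmt"
	"strings"
)

// narrowQuery restricts query to the messages that the tag command cmd
// would actually change: messages lacking a tag that is to be added, or
// carrying a tag that is to be removed.
// Assumption: tags need no quoting in the query language.
func narrowQuery(cmd, query string) string {
	var conds []string
	for _, op := range strings.Fields(cmd) {
		if len(op) < 2 {
			continue // a bare "+" or "-" carries no tag name
		}
		switch op[0] {
		case '+':
			conds = append(conds, "not tag:"+op[1:])
		case '-':
			conds = append(conds, "tag:"+op[1:])
		}
	}
	if len(conds) == 0 {
		return query // nothing to narrow on
	}
	return "(" + query + ") and (" + strings.Join(conds, " or ") + ")"
}

func main() {
	fmt.Println(narrowQuery("+sci-notmuch", "subject:notmuch"))
}
```

the narrowed string can then be handed to the query in place of the original
one, so already-tagged messages are never visited.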

-s