Memory management practices

Ben Gamari bgamari.foss at gmail.com
Mon Aug 29 13:30:57 PDT 2011


Hey all,

Over the last few weeks I've been trying to fix some brokenness in the
notmuch-haskell bindings with respect to memory management.

In the discussion included below, I describe the issue as I approached it.
In short, it appears that GHC's garbage collector is quite liberal with
the order in which it frees resources (which is apparently permitted by
Haskell's FFI specification), allowing, for instance, a Query object to be
freed before a Messages object derived from it, despite my attempts to hold the
proper references in the Haskell wrapper objects to keep the Query reachable.

In general, it seems to me that memory management in notmuch bindings is
a little bit harder than it needs to be due to the decision not to
talloc_ref parent objects when a new child object is created. This means
that a bindings author needs to recreate the ownership tree in their
binding, a task which is fairly easily done (except in the case of
Haskell due to the weak GC finalization guarantees) but seems quite
unnecessary. Is there a reason this decision was made? Would a patch be
accepted adding talloc_ref'ing parents in those functions creating
children and talloc_frees in *_destroys?
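
To make the question concrete, here is a rough sketch of the semantics I
have in mind. It is only an illustration: it uses plain malloc/free and a
hand-rolled reference count instead of talloc, and every name in it
(query_t, messages_t, query_search_messages, live_queries, and so on) is
made up for the example and is not the libnotmuch API. The point is just
that once a child holds a counted reference to its parent, the two
destroy calls can arrive in either order.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parent object such as a query. */
typedef struct query {
    int refs;              /* one for the creator, plus one per live child */
} query_t;

/* Hypothetical stand-in for a child object such as a messages list. */
typedef struct messages {
    query_t *parent;       /* counted reference that keeps the parent alive */
} messages_t;

int live_queries = 0;      /* instrumentation for this example only */

query_t *query_create(void)
{
    query_t *q = malloc(sizeof *q);
    if (q == NULL)
        return NULL;
    q->refs = 1;           /* the creator's reference */
    live_queries++;
    return q;
}

/* Drop one reference; the memory is released when the last one goes. */
void query_unref(query_t *q)
{
    if (--q->refs == 0) {
        free(q);
        live_queries--;
    }
}

/* Under the proposal, destroy only gives up the caller's reference. */
void query_destroy(query_t *q)
{
    query_unref(q);
}

/* Creating a child takes a reference on the parent. */
messages_t *query_search_messages(query_t *q)
{
    messages_t *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->parent = q;
    q->refs++;
    return m;
}

/* Destroying a child releases the child, then its hold on the parent. */
void messages_destroy(messages_t *m)
{
    query_t *q = m->parent;
    free(m);
    query_unref(q);
}

/* Destroy the parent first, as a GC finalizer might; returns the number
 * of query objects still allocated afterwards. */
int demo_parent_destroyed_first(void)
{
    query_t *q = query_create();
    messages_t *m = query_search_messages(q);
    query_destroy(q);      /* parent is still alive: the child holds it */
    assert(live_queries == 1);
    messages_destroy(m);   /* last reference dropped, parent freed here */
    return live_queries;
}
```

With this scheme a binding could attach query_destroy and messages_destroy
as finalizers without caring which one the runtime decides to run first.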

Cheers,

- Ben



On Mon, 29 Aug 2011 20:30:10 +0200, Bertram Felgenhauer <bertram.felgenhauer at googlemail.com> wrote:
> Dear Ben,
> 
> Ben Gamari wrote:
> > After looking into this issue in a bit more depth, I'm even more
> > confused. In fact, I would not be surprised if I have stumbled into a
> > bug in the GC.
> [...]
> >         MessagesMessage
> >               |   
> >               |  msmpp
> >               \/
> >         QueryMessages
> >               |
> >               |  qmpp
> >               \/
> >             Query
> > 
> > As we can see, each MessagesMessage object in the Messages list
> > resulting from queryMessages holds a reference to the Query object from
> > which it originated. For this reason, I fail to see how it is possible
> > that the RTS would attempt to free the Query before freeing the
> > MessagesPtr.
> 
> When a garbage collection is performed, the RTS determines which heap
> objects are still reachable. The rest is then freed _simultaneously_,
> and the corresponding finalizers are run in some random order.
> 
> So assuming the application holds a reference to the MessagesMessage
> object for a while and then drops it, the GC will detect unreachability
> of all the three objects at the same time and in the end, the finalizer
> for MessagesMessage may be run before that of Query.
> 
> So I think this is not a bug.
> 
> To solve this problem properly, libnotmuch should stop imposing order
> constraints on when objects are freed - this would mean tracking
> references using talloc_ref and talloc_unlink instead of
> talloc_free inside the library.
> 
> For a bindings author who does not want to touch the library, the best
> idea I have is to add a layer with the sole purpose of tracking those
> implicit references.
> 
> Best regards,
> 
> Bertram
> 
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

