Update on python-cffi bindings

Floris Bruynooghe flub at devork.be
Thu Dec 21 03:30:39 PST 2017


Daniel Kahn Gillmor <dkg at fifthhorseman.net> writes:

> Hi Floris--
>
> On Sun 2017-12-17 19:08:18 +0100, Floris Bruynooghe wrote:
>
> i've heard reported, and i also appreciate your attention to performance
> concerns on different python platforms (e.g. making sure things are
> performant on both CPython and PyPy).

Oh btw, I can happily report this has paid off.  These bindings perform
much better on PyPy while performing slightly worse on CPython.  I
haven't proven this, but my guess is that most of the performance loss
on CPython is due to the memory management, which unfortunately involves
a lot of calls on each object destruction.  Each object also needs to be
destroyed individually, so when destroying a parent all children get
destroyed first, which is a waste from libnotmuch's point of view.  This
can be improved by using an approach modeled on weakref.finalize, but
that is some more work.
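
As a rough illustration (not taken from the patch series), a
weakref.finalize-based scheme could look like the sketch below.  It uses
pure-Python stand-ins instead of libnotmuch: fake_destroy, Handle and
the destroyed list are hypothetical names invented for this example.
The point is only that each object registers a single finalizer that
runs at most once, and that a child holds a reference to its parent so
the parent cannot be freed first.

```python
import weakref

destroyed = []  # records the order in which the fake destructor runs


def fake_destroy(name):
    """Hypothetical stand-in for a C destructor such as notmuch_message_destroy()."""
    destroyed.append(name)


class Handle:
    """Owns one pretend C object and frees it at most once."""

    def __init__(self, name, parent=None):
        self.name = name
        # Holding the parent keeps it alive for as long as the child exists.
        self._parent = parent
        # The callback must not reference self, otherwise the object
        # would never become unreachable.
        self._finalizer = weakref.finalize(self, fake_destroy, name)

    @property
    def alive(self):
        return self._finalizer.alive

    def close(self):
        """Free the object early; repeated calls are no-ops."""
        self._finalizer()


db = Handle("db")
msg = Handle("msg", parent=db)
msg.close()
msg.close()  # second call does nothing
db.close()
print(destroyed)  # ['msg', 'db']
```

If close() is never called, the same finalizer runs when the object is
garbage collected or at interpreter exit, so no explicit destruction
order has to be encoded in __del__ methods.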

> I confess that i haven't read the series in full, but i have two main
> concerns that i'd generally use to evaluate proposals like this:
>
>  0) how much does the API change?  that is, if we're expecting existing
>     users of the notmuch python bindings to adopt this new approach, how
>     much pain are we putting them through?  How much of an effort has
>     been made to reduce that pain, and do we have a clear and
>     comprehensive porting guide?

The API changes a lot and there is no easy migration.  And history has
shown that's a terrible way to get something new adopted.  Last time I
suggested a possible multi-tiered approach (maybe not as explicitly):

1 I think it's possible to move the memory management technique to the
  existing API without API breakage.  It would still allow users to call
  functions in the wrong order etc., but that's not a regression.

2 It's probably possible to either switch the existing API to use CFFI
  or create a drop-in replacement for it based on CFFI.  The benefit
  here is two-fold: users get better PyPy performance and I believe it
  becomes easier to maintain the bindings, e.g. all you need to do to
  call notmuch_database_get_path is
  capi.ffi.string(capi.lib.notmuch_database_get_path(dbptr)) (see
  https://github.com/flub/notmuch/blob/cffi/bindings/python-cffi/notdb/_database.py#L263
  for an actual example).  But this really depends on what everyone else
  here who maintains the Python bindings thinks.  I'd encourage you to
  have a look at the CFFI implementation to get an idea of this.

3 As a last step I still think providing the more idiomatic bindings is
  useful, especially for new users.  It takes more of the burden of
  correctly calling the C functions off the user.  This could then
  either treat notmuch_cffi as a lower-level API which maps the C API
  more directly, or it could call C directly as it does now.  I'm not
  currently sure which is more feasible or better here.

  An additional thing that could be done here to ease migration is to
  allow creating the new objects from the old ones allowing a codebase
  to gradually adopt the newer API where appropriate.  E.g.:

     db = notmuch.Database(...)
     msg = db.find_message(...)
     new_db = notdb.Database.from_notmuch(db)
     new_msg = notdb.Message.from_notmuch(msg)
     print('Tags not on msg: {}'.format(new_db.tags - new_msg.tags))

  This would rely on the existing API migrating to CFFI; without that
  it could still be possible but would be very hairy.
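
To make the from_notmuch idea above a little more concrete, here is a
minimal sketch using stand-in classes.  OldMessage and NewMessage are
hypothetical and do not touch libnotmuch; only the from_notmuch name
comes from the example above.  The sketch assumes both layers can share
the same underlying pointer object, which is presumably why a common
CFFI base would make this much easier.

```python
class OldMessage:
    """Hypothetical stand-in for a message from the existing bindings."""

    def __init__(self, ptr, tags):
        self._ptr = ptr          # pretend C pointer
        self._tags = list(tags)

    def get_tags(self):
        return iter(self._tags)


class NewMessage:
    """Hypothetical stand-in for a message in the more idiomatic API."""

    def __init__(self, ptr, owner=None):
        self._ptr = ptr
        # Keep the old object alive: it still owns the pointer, so it
        # must not be destroyed while this wrapper is in use.
        self._owner = owner

    @classmethod
    def from_notmuch(cls, old):
        """Wrap an old-style object without copying the C object."""
        return cls(old._ptr, owner=old)

    @property
    def tags(self):
        # In this sketch the tags are simply read through the old object.
        return frozenset(self._owner.get_tags())


old_msg = OldMessage(ptr=object(), tags=["inbox", "unread"])
new_msg = NewMessage.from_notmuch(old_msg)
print(sorted(new_msg.tags))  # ['inbox', 'unread']
```

The design choice worth noting is ownership: the new wrapper borrows
the pointer and keeps a reference to the old object, so only one side is
ever responsible for freeing it.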

So do people reckon following this path would be more desirable and
worth the extra effort?  Would there be a willingness to change the
existing notmuch API to a CFFI implementation?  I didn't go down this
path yet, as last time there was no feedback on this suggestion while
there was some moderate curiosity about a more idiomatic API.
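
For anyone who has not used CFFI, the call pattern mentioned in item 2
(call a C function, then convert the returned char pointer with
ffi.string) can be tried without libnotmuch.  The sketch below assumes
the cffi package is installed and a POSIX system; it uses getenv from
the C library purely as a stand-in for a function returning a C string.
The names libc, c_getenv and NOTDB_DEMO_PATH are made up for this
example.

```python
import os

from cffi import FFI

ffi = FFI()
# Declare the C function we want to call (ABI mode, no compiler needed).
ffi.cdef("char *getenv(const char *name);")
# dlopen(None) exposes the standard C library on POSIX systems.
libc = ffi.dlopen(None)


def c_getenv(name):
    """Call getenv() in C and convert the result to a Python str."""
    ptr = libc.getenv(name.encode())
    if ptr == ffi.NULL:
        return None
    # ffi.string() copies the NUL-terminated C string into bytes.
    return ffi.string(ptr).decode()


os.environ["NOTDB_DEMO_PATH"] = "/tmp/mail"
print(c_getenv("NOTDB_DEMO_PATH"))  # /tmp/mail
```

The wrapper is a single expression plus a NULL check, which is the
maintenance argument being made for CFFI over ctypes.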

>  1) how accessible is the new implementation for contributions from
>     other developers?  For example, a transition to a highly idiomatic
>     style of python that no other developers would be able to improve
>     would put a large maintenance burden on you.

- For the CFFI part I believe this is easier than the existing ctypes,
  as I tried to say above.

- For exposing completely new APIs, sure, building the
  notdb.ImmutableTagSet and MutableTagSet was not straightforward,
  likewise for the PropertiesMap.  Many other things are much easier
  though.  One possibly nice way to alleviate this is the idea of the
  existing notmuch API being the lower-level layer, nearly mimicking the
  C API directly.  That way adding new APIs there is more or less
  straightforward and there is less time pressure to add them to the
  higher-level API, especially if mixing the APIs is supported.
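
A small sketch of that layering, with made-up names and data: a
low-level function that mirrors a C-style tag iterator, and a read-only
set built on top of it with collections.abc.Set.  This is not the
actual notdb.ImmutableTagSet; lowlevel_get_tags and _FAKE_STORE are
hypothetical stand-ins for the real C calls.

```python
from collections.abc import Set

# Lower layer: a thin function that mirrors a C-style API.
_FAKE_STORE = {
    "db": ["inbox", "unread", "spam"],
    "msg-1": ["inbox", "unread"],
}


def lowlevel_get_tags(handle):
    """Hypothetical stand-in for iterating tags through the C API."""
    return iter(_FAKE_STORE[handle])


# Higher layer: an idiomatic, read-only set view over the low-level call.
class ImmutableTagSet(Set):
    def __init__(self, handle):
        self._handle = handle

    def __contains__(self, tag):
        return any(t == tag for t in lowlevel_get_tags(self._handle))

    def __iter__(self):
        return lowlevel_get_tags(self._handle)

    def __len__(self):
        return sum(1 for _ in lowlevel_get_tags(self._handle))

    @classmethod
    def _from_iterable(cls, iterable):
        # Results of set operations are detached from any C object.
        return frozenset(iterable)


db_tags = ImmutableTagSet("db")
msg_tags = ImmutableTagSet("msg-1")
print(db_tags - msg_tags)  # frozenset({'spam'})
```

Only three methods need to talk to the lower layer; the set operators
come from the abstract base class.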


> Do you have any thoughts about these questions?
>
> For example, for point 0, have you tried to run alot or some other
> python-based notmuch MUA against this newer python binding?  what
> changes were necessary?
>
> For example, for point 1, can you show me how i would (as a fellow notmuch
> contributor) create a patch to add support for some of the recent (post
> 0.25.3) changes to notmuch in the python interface?

I think I've answered these somewhat already.  If you think it would be
useful to see a real example of one of the recent patches against what I
have now I could create this.  Just let me know.

> Also, the old python bindings are actually directly exercised by the
> notmuch test suite.  i see you've adopted the tox.ini convention to
> trigger a run of pytest, but that's not how the current test suite
> works.  Do you think notmuch needs to adopt tox in order to use this
> series, or do you think you could integrate the testing of this module
> into the currently-existing test suite?

I must admit I didn't look too much at the existing tests until just
now.  If I understand correctly the existing tests are in
T390-python.sh.  In that case I'd say the tests I added are a lot more
thorough.  The reason I added tox.ini is to easily test against multiple
python versions, i.e. CPython 3.5, 3.6 & PyPy 3.5.  If I had to guess at
the best way to integrate I'd say it's probably best to create a
TXXX-python-pytest-pyXY.sh for each supported python version and lose
tox.ini.  I'd suggest for those tests to be a simplified version of what
tox does: create a virtualenv and run pytest inside of it.
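
The "simplified tox" steps could be sketched as follows.  This is
written in Python for consistency with the other examples, although a
real test in the notmuch suite would be a shell script.  The function
names, the POSIX virtualenv layout and the use of pip to install pytest
and the bindings are all assumptions made for this example.

```python
import subprocess
import sys
from pathlib import Path


def build_steps(venv_dir, bindings_dir, python=sys.executable):
    """Return the commands for: create venv, install, run pytest."""
    venv_python = Path(venv_dir) / "bin" / "python"  # POSIX layout assumed
    return [
        # 1. Create an isolated environment with the chosen interpreter.
        [python, "-m", "venv", str(venv_dir)],
        # 2. Install pytest and the bindings into that environment.
        [str(venv_python), "-m", "pip", "install", "pytest", str(bindings_dir)],
        # 3. Run the test suite with the environment's interpreter.
        [str(venv_python), "-m", "pytest", str(bindings_dir)],
    ]


def run_steps(steps):
    """Run each command in order, stopping at the first failure."""
    for argv in steps:
        subprocess.check_call(argv)


steps = build_steps("/tmp/notdb-venv", "bindings/python-cffi", python="python3.6")
for argv in steps:
    print(" ".join(argv))
```

One such script per interpreter (with a different python argument)
would cover the same matrix that tox.ini describes.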

I'm open to other suggestions though, as I'm not very familiar with the
notmuch testing.


Let me know what you all think,
Floris

