[Xapian-discuss] Another PHP 5 wrapper...

Olly Betts olly at survex.com
Thu Apr 6 00:20:55 BST 2006


Dear me, you wait ages for an object wrapped version of the PHP bindings
and then two come along at once!

I've only had a brief look through Paul's wrapper, and I've not had a
chance to look at yours at all.  But here are my thoughts on the issues
brought up, and some related ones.

Firstly, I'm very pleased that people are working on this.  It's
annoyed me for some time that SWIG can only reliably wrap C++ classes
to a "flat" interface (it does support an object interface, but only
for PHP4, and the support was buggy when I last tried it, though that
was some time ago and significant work has been done on SWIG's PHP
support recently).

A very important issue is that it needs to be easy to update the
bindings.  This is perhaps SWIG's greatest strength - generally I just
need to cut-and-paste the prototype for a new method from the xapian
headers to xapian.i and rebuild and it's wrapped in all the languages
SWIG supports.  Compare that to what is required to add a method to the
hand-crafted Java JNI wrappers we currently have (it requires changing 4
files) and you can see why the SWIG bindings are completely up-to-date
whereas the Java bindings are lagging significantly.

Paul's __call trick is very handy for this - we just need to tell
his code the name of a new class and it pretty much works automatically.
A new/changed method in an existing class typically requires no changes
at all!
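
As a rough illustration of why this kind of dispatch needs so little
upkeep, here is a minimal sketch. It is a Python analogue, not Paul's
actual PHP code: Python's `__getattr__` stands in for PHP 5's `__call`,
and the flat functions and the `FLAT` table are made-up stand-ins for
what a SWIG-style flat interface might export.

```python
# Hypothetical stand-ins for a SWIG-style "flat" interface: one plain
# function per C++ method, taking the object handle as first argument.
def new_Database(path):
    return {"path": path, "docs": []}


def Database_add_document(handle, text):
    handle["docs"].append(text)
    return len(handle["docs"])  # made-up document id


def Database_get_doccount(handle):
    return len(handle["docs"])


# Assumed lookup table of everything the flat interface exports.
FLAT = {
    "new_Database": new_Database,
    "Database_add_document": Database_add_document,
    "Database_get_doccount": Database_get_doccount,
}


class Proxy:
    """Generic wrapper: forwards obj.method(...) to Class_method(handle, ...)."""

    def __init__(self, class_name, *args):
        self._class_name = class_name
        self._handle = FLAT["new_" + class_name](*args)

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. wrapped methods.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            func = FLAT[self._class_name + "_" + name]
        except KeyError:
            raise AttributeError(name) from None
        return lambda *args: func(self._handle, *args)


db = Proxy("Database", "/tmp/example")
db.add_document("hello")
db.add_document("world")
print(db.get_doccount())
```

In this sketch, adding a new flat function to the table makes it
callable as a method with no change to `Proxy`, which is the property
described above.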

If we're actually going to have PHP code to wrap each method (which
does have some advantages) I'd favour getting SWIG to generate the
wrapper code rather than trying to maintain it by hand.  Kevin Ruland
(the current SWIG PHP maintainer) is looking to move SWIG's PHP support
in this direction anyway:

http://article.gmane.org/gmane.comp.programming.swig.devel/15958

I was intending to mention Paul's approach to Kevin as an easy way
for SWIG to provide such wrappers (at least as a first implementation).

I think we should aim to provide as natural an API to the programmer
as we can, so for example providing iterators with the native semantics,
etc is important.  So is operator overloading in languages which support
it.  And so is following local naming conventions (it's not just the
Java wrappers which rename methods - so do the C# ones).

I must admit that camelCase as a PHP convention is news to me
- from what I've seen PHP's core seems to actively eschew naming
conventions.  But if PHP is moving towards a convention that would be a
step forward and it would be good to follow it.  Recent versions of
SWIG can actually do the renaming automatically, though we have to
specify a list of special cases for C# (because the C++ naming is a
little irregular - e.g. get_doccount -> GetDocCount).

It's true that such differences make it harder to use the C++-based
documentation, but it's not a big problem to mentally map between
getDocCount and get_doccount.  I wouldn't support arbitrary renaming
of methods.
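
The mechanical renaming with a special-case list might look something
like the sketch below. This is only an illustration in Python, not how
SWIG implements it; the function name `to_camel_case` and the
`SPECIAL_CASES` table are invented here, and the table holds just the
one irregular name mentioned above.

```python
# Made-up exception table: names whose word boundaries can't be
# inferred from the underscores alone.
SPECIAL_CASES = {
    "get_doccount": "getDocCount",
}


def to_camel_case(name, special_cases=SPECIAL_CASES):
    """Map a C++-style snake_case method name to camelCase."""
    if name in special_cases:
        return special_cases[name]
    first, *rest = name.split("_")
    return first + "".join(word.capitalize() for word in rest)


print(to_camel_case("add_posting"))   # addPosting
print(to_camel_case("get_doccount"))  # getDocCount
```

Because the mapping is a fixed rule plus a short exception list, the
renamed methods stay predictable from the C++ names.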

Anyway, on to a couple of specific points...

Daniel writes:
> - target : PHP >= 5.1

I don't really have any idea which PHP versions are prevalent.  I'm
guessing PHP4 is still fairly widely used and will continue to be for
a while to come, so it would be good to keep the existing bindings for
that (possibly with proxy classes reenabled to give an OO interface).

(Out of curiosity I've just tried removing -noproxy from the PHP
bindings and with PHP4 it passes an OO translation of smoketest.php
so it looks like Kevin's fixed the problems I encountered before).

But what's the issue with PHP 5.0 here?  I'm assuming there's something
useful which was added in 5.1, but 5.0 still seems to be common (it's
the version in the latest Ubuntu release for example) so it would be
good to support it with OO wrappers, albeit possibly with some nice
"PHP-ish" feature missing.

> Basically, I think that our two wrappers will have about the same 
> overhead, except if using "__call" is many times slower than using a 
> direct call. It is also probable that overhead of the wrapper will be 
> negligible on a real corpus containing many documents.

Generally Xapian methods either do a lot of work or won't be called
a lot.  The main exception to this is Document::add_posting when
indexing (or Document::add_term if you're indexing without positional
information).

If dispatching a method call from a language to C++ incurs a large
overhead it may be possible to address this by writing a "fat" wrapper
class which buffers postings and passes them all over in one go (which
would need some special code on the C++ side, but SWIG allows that).
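
A minimal sketch of the buffering idea, in Python for illustration
only.  The class name `BufferedDocument`, the `bulk_add` callback and
the batch size are all assumptions; `bulk_add` stands in for whatever
special bulk entry point would have to be written on the C++ side, and
no such function exists in Xapian as far as this sketch is concerned.

```python
class BufferedDocument:
    """Collects postings locally and hands them over in batches."""

    def __init__(self, bulk_add, batch_size=1000):
        self._bulk_add = bulk_add      # hypothetical bulk entry point
        self._batch_size = batch_size
        self._pending = []

    def add_posting(self, term, position):
        # Cheap: stays entirely on the scripting-language side.
        self._pending.append((term, position))
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        # One crossing into the wrapped library for the whole batch.
        if self._pending:
            self._bulk_add(self._pending)
            self._pending = []


calls = []  # records each crossing into the (simulated) C++ side
doc = BufferedDocument(lambda batch: calls.append(list(batch)), batch_size=3)
for pos, term in enumerate("the quick brown fox jumps".split(), start=1):
    doc.add_posting(term, pos)
doc.flush()
print(len(calls))  # 2 crossings instead of 5
```

The caller has to remember the final `flush()`, which is the main cost
of this design compared with dispatching each posting directly.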

Cheers,
    Olly


