[Xapian-discuss] Future of Xapian (long)

Charlie Hull charlie@lemurconsulting.com
Fri, 18 Jun 2004 16:47:54 +0100


Hi all,

I'm emailing on behalf of myself, Tom Mortimer and Richard Boulton at Lemur
Consulting.

As some of you will know, we're one of two companies offering support and
customisation for Xapian (Olly Betts' company Oligarchy being the other
one). We've been having some thoughts about where Xapian might go in the
future and would like to expand this discussion to those on this list.

Obviously we have a commercial interest here - we'd love to get paid for
hacking Xapian - but we also have a more general interest in seeing Xapian
go places, as we've been involved with it since the OpenMuscat days.

Any feedback is very welcome, either on-list or privately.

1. We've already made some contributions to Xapian over the years and have
some other potential things in the pipeline; some of these depend on
customers paying us to do them, and some we do in our spare moments. We've
been compiling a 'wishlist' of Xapian improvements and features, and this is
what we have so far, in no particular order. We'd like some idea of what
people regard as a priority.

a. Testing and fixing to make Xapian work in a multithreaded environment.
Richard is currently helping to track down some threading problems, but this
work could be expanded into a rigorous set of tests to check Xapian's
thread safety.
b. A web server for Xapian.
c. A summarizer/highlighter component; we've noticed that TheyWorkForYou.org
have this already but we also have some code to do this.
d. A spellchecker (like Google's 'did you mean xxx') using edit distance
calculation.
e. A web spider.
f. An easier way of plugging in the various open source file format
converters, for indexing MS Office and other formats, with a list of which
ones actually work!
g. More example programs, setup HOWTOs etc. to make the initial learning
curve a bit less steep.
h. A connector to ASP; some way of easily integrating Xapian results into
ASP pages. We've done something similar in the past for another search
engine.
i. Native compilation under Windows.
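
To make item (d) a little more concrete, here is a minimal sketch of the
edit distance idea in plain Python. It is only an illustration: the function
names and the word list are hypothetical, nothing here is part of Xapian's
API, and a real spellchecker would also want to weight candidates by term
frequency rather than scan the whole vocabulary.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def did_you_mean(word, vocabulary, max_distance=2):
    """Return the closest vocabulary term within max_distance, or None.

    vocabulary is assumed to be any iterable of known index terms.
    """
    best, best_distance = None, max_distance + 1
    for candidate in vocabulary:
        distance = edit_distance(word, candidate)
        if distance < best_distance:
            best, best_distance = candidate, distance
    return best
```

For example, did_you_mean("serach", ["search", "index", "query"]) would
suggest "search", while a word with no close match yields None.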

2. Which projects or places would be good targets for getting Xapian adopted
as a search engine? Obviously the more people using Xapian the better, as it
drives improvements, finds bugs etc.

a. Content management systems (CMS), e.g. Zope (has anyone tried this?) or
APLAWS (a Red Hat-based local authority CMS).
b. Linux distributions.
c. Academic institutions, many of which can't afford commercial engines and
usually end up using Google site search or ht://Dig.
d. Web developers and other organisations that regularly use open source
software but may not know about Xapian.

3. We're also thinking of offering various levels of commercial support for
Xapian, from the 'pay a small flat fee and we'll get it up and running' to
full 24-hour support. Does anyone have any comments about this? It might
help to get Xapian accepted in commercial organisations that need some kind
of 'formal' backup.


Thanks in advance!


 Charlie Hull
Managing Director, Lemur Consulting Ltd.
- the information experts -
web    : www.lemurconsulting.com
tel/fax: +44 (0) 870 0118334
mobile : +44 (0) 776 7825828

