[Xapian-discuss] Evaluating Xapian

Markus Peter warp at spin.de
Mon Jan 24 13:04:09 GMT 2005


We're currently evaluating replacing SWISH-E with Xapian for a couple of
search installations.

The problem is that our searches are not "pure" fulltext searches; they
also need some features that are more commonly found in an RDBMS. For
performance reasons we unfortunately cannot combine the RDBMS results
with the results from the fulltext search engine; instead, everything
needs to come from the fulltext search.

SWISH-E so far does what we need in terms of search features, but its
ranking mechanism is not very good, and live indexing of documents is
only an afterthought there. That is why we are currently evaluating Xapian.

The most complex search application which we want to replace is a kind
of "person" search. If a search engine can fulfill all the requirements
of that application, it will also work for all our other needs.

A person is defined by age/birthdate, gender, contact information like
state, country, city in different fields, as well as freetext documents
attached to it. It must be possible to freely combine searches for the
different fields. That means that I can for example search for all male
persons between 20 and 30 from the USA where the word "foo" occurs in
the attached freetext documents.
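
To make the requirement concrete, here is a minimal sketch in plain Python of what such a combined query has to decide for each person. It is not Xapian or SWISH-E code; the `Person` class, its field names, and the `matches` helper are made up purely for illustration, and the freetext match is a naive whole-word test without any ranking.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    # Hypothetical field names, chosen only for this illustration.
    age: int
    gender: str
    country: str
    state: str = ""
    city: str = ""
    documents: List[str] = field(default_factory=list)


def matches(person, gender=None, min_age=None, max_age=None,
            country=None, word=None):
    """Return True if the person satisfies every criterion that is given."""
    if gender is not None and person.gender != gender:
        return False
    if min_age is not None and person.age < min_age:
        return False
    if max_age is not None and person.age > max_age:
        return False
    if country is not None and person.country != country:
        return False
    if word is not None:
        # Naive whitespace tokenisation; a real engine would stem and rank.
        if not any(word.lower() in doc.lower().split()
                   for doc in person.documents):
            return False
    return True


people = [
    Person(25, "male", "USA", documents=["some foo text"]),
    Person(35, "male", "USA", documents=["foo again"]),
    Person(22, "female", "USA", documents=["foo"]),
    Person(28, "male", "Germany", documents=["foo"]),
    Person(29, "male", "USA", documents=["nothing relevant"]),
]

# All male persons between 20 and 30 from the USA whose documents contain "foo".
hits = [p for p in people
        if matches(p, gender="male", min_age=20, max_age=30,
                   country="USA", word="foo")]
```

Every criterion is optional, which is the "freely combine" part; the search engine would have to do the same thing against its index rather than by scanning every record.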

The questions now are:
- Can I restrict searches to a specific age group, and if so, how?
SWISH-E offers the "-L" option for such range searches, but I have not
yet found anything similar in the Xapian documentation. Range searches
are also useful for things like "give me all documents matching
'foo bar' which have been modified in the last 30 days". I really want
to avoid adding a separate filtering step afterwards for things like
that. Omega seems to implement a feature like that, but how?
- Does anyone have good Perl-based examples for the indexer and the
searcher as a starting point?
- The documentation I have read so far is not very explicit on searching
different fields. As I currently understand it, I simply make the
name of the fields I want to support part of the terms I add to the

Markus Peter - SPiN AG                                          warp at spin.de
