[Xapian-discuss] Design-question/problem

James Aylett james-xapian at tartarus.org
Fri Aug 28 13:43:00 BST 2009


On Fri, Aug 28, 2009 at 01:59:21PM +0200, Carsten Reimer wrote:

> we are using the Xapian-python bindings to build some fulltext search 
> engine for some 400+ books each about 300+ pages.
> We have the need to be able to limit the search on one book or on 
> several selected books as well as to be able to search all of them.
> 
> To be able to do so we decided to create one Xapian-database for each 
> book and build the databases we need to search for the different use 
> cases described above dynamically.

Hi, Carsten. As John points out, another way to approach this is to
use a single database, and to add a single term to each document,
identifying the book it came from. John mentions prefixes, but I
thought I'd provide some sample (if untested) code to try to explain
them a little.
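
On the indexing side, a minimal (and equally untested) sketch of adding
that identifying term might look like the following. The helper names
book_term and make_page_document are made up for illustration, one
document per page and English stemming are assumptions, and the book ids
are assumed to be plain numbers.

```python
def book_term(book_id, prefix="XB"):
    """Return the boolean term marking a document as part of a book."""
    return "%s%s" % (prefix, book_id)


def make_page_document(text, book_id):
    """Build one Xapian document for a page of a book (hypothetical helper)."""
    import xapian  # imported here so book_term() is usable without the bindings

    doc = xapian.Document()
    doc.set_data(text)

    # Index the page text as ordinary free-text terms.
    indexer = xapian.TermGenerator()
    indexer.set_stemmer(xapian.Stem("english"))  # assumption: English text
    indexer.set_document(doc)
    indexer.index_text(text)

    # Add the single identifying term; wdf 0 because it is only a filter.
    doc.add_term(book_term(book_id), 0)
    return doc
```

Each document built this way would then be passed to add_document() on a
xapian.WritableDatabase, so that all books end up in the one database.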

----------------------------------------------------------------------
import xapian

# One boolean term per book the search should be limited to.
boolean_terms = [ 'XB1', 'XB2' ]
qp = xapian.QueryParser()
# ... configuration of qp (eg: stemming, prefixes)
p_query = qp.parse_query(request.GET.get('q', ''))
# Each document carries only one book term, so combine them with OP_OR.
b_query = xapian.Query(xapian.Query.OP_OR, boolean_terms)
query = xapian.Query(xapian.Query.OP_FILTER, p_query, b_query)
----------------------------------------------------------------------

Here XB1 and XB2 identify books 1 and 2 respectively. (You can
choose your own prefix; see
<http://xapian.org/docs/omega/termprefixes.html>.)

OP_FILTER only uses the left query (the p_query, which your users
typed in) for weights, but otherwise behaves like OP_AND, so a
document is only returned if it also matches b_query, ie: if it
comes from one of the selected books.
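
To cover all three of your use cases (one book, several books, all
books) with the same code, the filter could be wrapped up roughly as
below. This is only a sketch: book_terms and restrict_to_books are
hypothetical names, and it assumes each document carries exactly one
book term, which is why the book terms are combined with OP_OR.

```python
def book_terms(book_ids, prefix="XB"):
    """Turn book ids into the boolean terms added at index time."""
    return ["%s%s" % (prefix, book_id) for book_id in book_ids]


def restrict_to_books(p_query, book_ids):
    """Limit p_query to the given books; an empty list means all books."""
    terms = book_terms(book_ids)
    if not terms:
        # No selection: search every book, so leave the query untouched.
        return p_query

    import xapian  # imported lazily; only needed when a filter is built

    # Each document carries one book term, so match any selected book.
    b_query = xapian.Query(xapian.Query.OP_OR, terms)
    # OP_FILTER keeps the weights from p_query and uses b_query only to
    # decide which documents are allowed through.
    return xapian.Query(xapian.Query.OP_FILTER, p_query, b_query)
```

Calling restrict_to_books(p_query, [1]) would search book 1 only,
[1, 2] would search books 1 and 2, and [] would search everything.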

J

-- 
  James Aylett

  talktorex.co.uk - xapian.org - uncertaintydivision.org


