[Xapian-discuss] Help with weights

Olly Betts olly at survex.com
Wed Jul 2 22:30:13 BST 2008


On Wed, Jul 02, 2008 at 02:00:02PM -0700, Robert Kaye wrote:
> 
> On Jul 1, 2008, at 6:14 PM, Olly Betts wrote:
> > Are you adding this type term to queries?  If not, the effect of
> > indexing the type term with those termcounts will be to increase the
> > document length of albums.  That will tend to decrease the importance
> > of each occurrence of "love" in the album title, so albums will
> > indeed tend to rank lower.
> 
> Ah ha -- that explains it -- thanks.

Incidentally, if you want to see why the weighting schemes work like
this, consider a database with two documents, the second of which
contains all the text of the first, repeated twice.  You probably want
these to get similar weights - certainly the doubled document shouldn't
get twice the weight for most applications.

For BM25 you can adjust a parameter (b, the fourth argument to the
Xapian::BM25Weight constructor) to tune how much influence the
document length has.
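
To make the doubled-document case concrete, here is a minimal sketch of
the textbook BM25 term-frequency component.  It is not Xapian's exact
implementation: the IDF and query-side factors are left out, and the
function names, document lengths and term counts are made up for
illustration.  The parameter b (between 0 and 1) controls how strongly
document length is normalised.

```cpp
#include <cassert>

// Textbook BM25 term-frequency component (IDF and query-side factors
// omitted).  This is an illustrative sketch, not Xapian's own code.
//   tf       occurrences of the term in the document
//   doc_len  length of the document
//   avg_len  average document length in the collection
//   k1       term-frequency saturation
//   b        strength of length normalisation (0 = none, 1 = full)
double bm25_tf_component(double tf, double doc_len, double avg_len,
                         double k1, double b) {
    assert(avg_len > 0.0);
    double norm_len = doc_len / avg_len;
    return (k1 + 1.0) * tf / (k1 * ((1.0 - b) + b * norm_len) + tf);
}

// Hypothetical two-document database: the second document is the first
// one repeated twice, so both its length and its term count double.
// Returns (weight of doubled document) / (weight of single document).
double doubled_to_single_ratio(double b) {
    const double k1 = 1.0;
    const double single_len = 100.0, single_tf = 2.0;
    const double doubled_len = 200.0, doubled_tf = 4.0;
    const double avg_len = (single_len + doubled_len) / 2.0;
    double single = bm25_tf_component(single_tf, single_len, avg_len, k1, b);
    double doubled = bm25_tf_component(doubled_tf, doubled_len, avg_len, k1, b);
    return doubled / single;
}
```

With b = 1 the ratio is 1, so the doubled document gets the same weight
as the original.  With b = 0 there is no length normalisation and the
doubled document scores higher, though still well short of twice as
much, because the term-frequency component saturates.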

> > Xapian::Query album_boost("XTYPEalbum");
> > album_boost = Xapian::Query(Xapian::Query::OP_SCALE_WEIGHT,
> >                             album_boost, 4.2);
> > query = Xapian::Query(Xapian::Query::OP_AND_MAYBE, query, album_boost);
> 
> OK, I see how that can be really useful. Since I am providing an end  
> user search service, should I write my own parser and generate my own  
> queries or should I post-process the results from QueryParser to tack  
> on the fields that would give the user better search results?

If you're happy using QueryParser, just apply the above to the Query
object it produces (i.e. query in the above code snippet comes from
QueryParser).

Cheers,
    Olly


