[Xapian-discuss] Is there a secondary sorting possible for relevance-sorts?

Olly Betts olly at survex.com
Fri Nov 17 00:04:20 GMT 2006


On Thu, Nov 16, 2006 at 11:05:14AM +0100, Arjen van der Meijden wrote:
> On 16-11-2006 3:06 Olly Betts wrote:
> >OK, here's a patch (also checked in to SVN).  I've done a quick set of
> >tests to make sure all the changes work as intended, but let me know if
> >anything doesn't seem right:
> 
> It appears as if the SORT=x SORTAFTER=1 combination uses relevance-bands 
> or something like that?

I'm assuming you mean x is some number, not a literal 'x'...

The way this should work is that the primary sort key is the document
weight, and the secondary sort key is value number x (and the tertiary
sort key is the document id itself).

But note that relevance ordering is on the raw floating point weight value,
not the rounded percentage.
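
To make that ordering concrete, here is a minimal standalone C++ sketch. It is not Xapian's actual implementation: the `Hit` struct and the `better_hit` and `ranked_docids` functions are hypothetical names for illustration, and ascending docid order for the final tie-break is an assumption. Note how a tiny difference in raw weight decides the order before the value is consulted at all.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for one search hit (not a Xapian type).
struct Hit {
    double weight;      // raw floating point document weight
    std::string value;  // contents of the value slot used as secondary key
    unsigned docid;     // document id, used as the final tie-break
};

// Returns true if a should rank before b.
bool better_hit(const Hit& a, const Hit& b) {
    // Primary key: raw weight, higher first (no rounding to a percentage).
    if (a.weight != b.weight) return a.weight > b.weight;
    // Secondary key: string compare on the value, higher is better.
    if (a.value != b.value) return a.value > b.value;
    // Tertiary key: docid (ascending order assumed here).
    return a.docid < b.docid;
}

// Sorts a copy of the hits and returns the docids in ranked order.
std::vector<unsigned> ranked_docids(std::vector<Hit> hits) {
    std::sort(hits.begin(), hits.end(), better_hit);
    std::vector<unsigned> ids;
    for (const Hit& h : hits) ids.push_back(h.docid);
    return ids;
}
```

With this comparator, two hits that would display the same rounded percentage can still be ordered purely by weight, so the value only matters when the raw weights are exactly equal.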

I can't really tell from your description if this is working or not.
Here's a patch to add a new $weight command to OmegaScript to report the
raw document weight:

http://www.oligarchy.co.uk/xapian/patches/omegascript-weight.patch

You can then add something like this to your query template and it
should be clear if the sort order is correct:

$value{1} $weight $id

If you were hoping for more reordering of hits with apparently identical
weights, the documentation comment for set_sort_by_relevance_then_value()
might be helpful (this is what SORTAFTER=1 uses):

        /** Set the sorting to be by relevance then value.
         *
         *  Note that with the default BM25 weighting scheme parameters,
         *  non-identical documents will rarely have the same weight, so
         *  this setting will give very similar results to
         *  set_sort_by_relevance().  It becomes more useful with particular
         *  BM25 parameter settings (e.g. BM25Weight(1,0,1,0,0)) or custom
         *  weighting schemes.
         *
         * @param sort_key value number to reorder on.  Sorting is with a
         *      string compare.  If ascending is true (the default) higher
         *      is better; if ascending is false, lower is better.
         *
         * @param ascending  If true, documents values which sort higher by
         *               string compare are better.  If false, the sort order
         *               is reversed.  (default true)
         */
        void set_sort_by_relevance_then_value(Xapian::valueno sort_key,
                                              bool ascending = true);
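
As a rough illustration of why particular BM25 parameters produce more tied weights, the sketch below uses the textbook BM25 term-frequency component, which is not necessarily the exact formula Xapian uses. The function name `bm25_tf` and its parameter names are assumptions, and the idf factor is omitted. With b = 0 the document length drops out, so documents with the same within-document frequency get identical weights regardless of length.

```cpp
// Textbook BM25 term-frequency component (idf factor omitted).
// wdf:     within-document frequency of the term
// normlen: document length divided by the average document length
// k1, b:   the usual BM25 tuning parameters
double bm25_tf(double wdf, double normlen, double k1, double b) {
    return wdf * (k1 + 1.0) / (wdf + k1 * ((1.0 - b) + b * normlen));
}
```

For example, `bm25_tf(2, 0.5, 1, 0)` and `bm25_tf(2, 3.0, 1, 0)` are equal, whereas with b = 0.5 the two documents get different weights and the secondary sort key would never be consulted.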

> But I think both will be worth a shot in our environment, just use 
> descending docid-order by default. The differences in search time seem 
> to be negligible, compared to the ascending or "don't care".

Changing the docid order will generally make little or no difference to
query times, except for some pure boolean queries.

> As an option, we can let the user choose a rougher "order 
> mostly by relevance, but display newer results first, even if their 
> relevance is a bit lower".

That isn't what "SORTAFTER" should be doing though...

Cheers,
    Olly


