[Xapian-discuss] Is there a secondary sorting possible for
relevance-sorts?
Olly Betts
olly at survex.com
Fri Nov 17 00:04:20 GMT 2006
On Thu, Nov 16, 2006 at 11:05:14AM +0100, Arjen van der Meijden wrote:
> On 16-11-2006 3:06 Olly Betts wrote:
> >OK, here's a patch (also checked in to SVN). I've done a quick set of
> >tests to make sure all the changes work as intended, but let me know if
> >anything doesn't seem right:
>
> It appears as if the SORT=x SORTAFTER=1 combination uses relevance-bands
> or something like that?
I'm assuming you mean x is some number, not a literal 'x'...
The way this should work is that the primary sort key is the document
weight, and the secondary sort key is value number x (and the tertiary
sort key is the document id itself).
But note that relevance ordering is on the raw floating point weight value,
not the rounded percentage.
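
The three-level ordering described above can be sketched as a comparator. This is only an illustration, not Xapian's internal code: the `Hit` struct and the `better` and `ranked_docids` functions are hypothetical names, and it assumes the ascending = true case (a higher value string ranks first) and that a lower docid wins the final tie-break.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a search hit; not a Xapian type.
struct Hit {
    double weight;      // raw floating point document weight
    std::string value;  // contents of the value slot used as the secondary key
    unsigned docid;     // document id, used as the final tie-break
};

// Returns true if a should rank before b.
bool better(const Hit& a, const Hit& b) {
    // Primary key: raw weight, compared exactly (not the rounded percentage).
    if (a.weight != b.weight) return a.weight > b.weight;
    // Secondary key: string compare on the value, higher first (assumed).
    if (a.value != b.value) return a.value > b.value;
    // Tertiary key: document id, lower first (assumed).
    return a.docid < b.docid;
}

// Sorts a copy of the hits and returns the docids in rank order.
std::vector<unsigned> ranked_docids(std::vector<Hit> hits) {
    std::sort(hits.begin(), hits.end(), better);
    std::vector<unsigned> ids;
    for (const Hit& h : hits) ids.push_back(h.docid);
    return ids;
}
```

Two hits with weights 1.50 and 1.51 could well display as the same percentage, yet here the value is never consulted for them because the raw weights differ.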
I can't really tell from your description if this is working or not.
Here's a patch to add a new $weight command to OmegaScript to report the
raw document weight:
http://www.oligarchy.co.uk/xapian/patches/omegascript-weight.patch
You can then add something like this to your query template and it
should be clear if the sort order is correct:
$value{1} $weight $id
If you were hoping for more reordering of hits with apparently identical
weights, the documentation comment for set_sort_by_relevance_then_value()
might be helpful (this is what SORTAFTER=1 uses):
/** Set the sorting to be by relevance then value.
 *
 *  Note that with the default BM25 weighting scheme parameters,
 *  non-identical documents will rarely have the same weight, so
 *  this setting will give very similar results to
 *  set_sort_by_relevance().  It becomes more useful with particular
 *  BM25 parameter settings (e.g. BM25Weight(1,0,1,0,0)) or custom
 *  weighting schemes.
 *
 * @param sort_key  value number to reorder on.  Sorting is with a
 *                  string compare.  If ascending is true (the default) higher
 *                  is better; if ascending is false, lower is better.
 *
 * @param ascending  If true, documents values which sort higher by
 *                   string compare are better.  If false, the sort order
 *                   is reversed.  (default true)
 */
void set_sort_by_relevance_then_value(Xapian::valueno sort_key,
                                      bool ascending = true);
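
Because the secondary key is compared as a string, numeric data stored in a value will not sort numerically unless it is encoded so that string order matches numeric order. A minimal sketch of the pitfall, assuming a byte-wise compare like `std::string`'s; `pad` and `sorts_higher` are hypothetical helpers, not part of Xapian:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: zero-pad a non-negative number to a fixed width so
// that string comparison agrees with numeric comparison.
std::string pad(unsigned n, int width = 10) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%0*u", width, n);
    return std::string(buf);
}

// True if value a sorts higher than value b under a plain string compare.
bool sorts_higher(const std::string& a, const std::string& b) {
    return a > b;
}
```

With this, `sorts_higher("9", "10")` is true even though 9 < 10, while the padded forms compare in numeric order. Fixed-width dates such as "20061117" already sort correctly as strings.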
> But I think both will be worth a shot in our environment, just use
> descending docid-order by default. The differences in search time seem
> to be negligible, compared to the ascending or "don't care".
Changing the docid order will generally make little or no difference to
query times, except for some pure boolean queries.
> As an option, we can offer the user the option to do a rougher "order
> mostly by relevance, but display newer results first, even if their
> relevance is a bit lower".
That isn't what "SORTAFTER" should be doing though...
Cheers,
Olly