[Xapian-discuss] Optimization and Load balancing with Xapian
Olly Betts
olly at survex.com
Thu Feb 16 15:10:54 GMT 2006
On Wed, Feb 15, 2006 at 01:03:09AM +0200, David Levy wrote:
> I am experiencing bad response times with Xapian/Omega in the last few days.
> My database has more than 700k records, using ~3 GB of disk space.
> Maybe my requests or my templates are not optimized, or maybe it's a
> hardware (disk speed) issue. The weird thing is that the search time
> reported in the response is often sub-second, while Omega actually takes
> over a second (sometimes several seconds) to deliver the response.
The time reported by "$time" includes the match, but because of how
Omega works it doesn't include the time to calculate top terms (if
you're using $topterms), and also doesn't include the time to display
the matches. If you're actually displaying a lot of matches, that
display time can be quite considerable.
So one thing to check is that $topterms isn't being used.
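
To see where the time actually goes, it can help to time the match phase
separately from the whole request. The following is only a rough sketch of
that idea, not Omega's code: run_match, compute_top_terms and render_hits
are hypothetical stand-ins for the three phases described above, and the
clock is injected so the demo is deterministic.

```python
import time


def timed_request(run_match, compute_top_terms, render_hits,
                  clock=time.perf_counter):
    """Return (match_time, total_time) for one simulated request.

    The three callables are hypothetical placeholders for the phases of a
    search request; they are not Omega or Xapian functions.
    """
    start = clock()
    matches = run_match()
    # This is the only part a "match time" figure would cover.
    match_time = clock() - start

    # Work done after the match is not part of match_time.
    compute_top_terms(matches)
    render_hits(matches)
    total_time = clock() - start
    return match_time, total_time


# Demo with a fake clock: the match ends at 0.2 s, the request at 1.5 s.
ticks = iter([0.0, 0.2, 1.5])
match_time, total_time = timed_request(
    run_match=lambda: ["doc1", "doc2"],
    compute_top_terms=lambda matches: None,
    render_hits=lambda matches: None,
    clock=lambda: next(ticks),
)
print(f"match: {match_time:.1f}s, whole request: {total_time:.1f}s")
```

A large gap between the two numbers points at top terms or hit display
rather than at the match itself.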
> To solve this issue, I have been thinking about load balancing Xapian. I
> could not find any information about that on the Internet. Has any of you
> done it yet? How?
I've not done it myself. The simple approach is just to put several
boxes in the DNS and they'll be used in a round-robin fashion.
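
As a rough illustration of what round-robin DNS gives you, here is a small
simulation. It is a toy model under stated assumptions: the DNS server
rotates its list of address records on every lookup and each client
connects to the first address it receives. The addresses are made up (taken
from the 192.0.2.0/24 documentation range).

```python
from collections import Counter, deque


def simulate_round_robin(addresses, lookups):
    """Count how many simulated clients land on each address.

    Assumes the server rotates its record list by one on every lookup and
    every client picks the first record in the reply.
    """
    records = deque(addresses)
    hits = Counter()
    for _ in range(lookups):
        hits[records[0]] += 1   # client uses the first address returned
        records.rotate(-1)      # server rotates the list for the next lookup
    return hits


# Hypothetical search boxes sharing one hostname.
BOXES = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
hits = simulate_round_robin(BOXES, 9)
for address in BOXES:
    print(address, hits[address])
```

In practice resolvers cache answers, so the real distribution is only
approximately even, and every box needs its own copy of the database.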
> I've done some tests this morning and it seems that some of this slowness
> is due to sorting.
>
> Indeed, Omega requests with and without sorting do not produce the same
> calculation time at all: < 1 s without sorting and sometimes > 30 s with
> sorting. These 30 seconds occur with results having 500+ matches.
> How is that possible? Sorting should not be so time-consuming, I
> guess.
It's not the actual sorting which takes the extra time - the issue is
that for a multi-term query, relevance ranking can terminate early in
many cases (often when we reach the end of the matches for any of the
terms). But if results are sorted on a value, we need to consider every
result which matches the query.
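
The difference can be sketched with a toy model. This is not Xapian's
matcher, just an illustration of the principle under simplified
assumptions: an OR query over per-term posting lists of (docid, weight)
pairs sorted by docid, where a document's score is the sum of its term
weights. All names and data below are made up.

```python
import heapq


def top_k_relevance(postings, k):
    """Top k by score, stopping once no unseen document can enter the top k."""
    max_w = {t: max(w for _, w in pl) for t, pl in postings.items()}
    pos = {t: 0 for t in postings}
    heap = []        # min-heap of (score, docid) holding the current top k
    examined = 0
    while True:
        live = [t for t in postings if pos[t] < len(postings[t])]
        if not live:
            break
        # Best score any not-yet-seen document could still reach.
        bound = sum(max_w[t] for t in live)
        if len(heap) == k and heap[0][0] >= bound:
            break    # early termination
        docid = min(postings[t][pos[t]][0] for t in live)
        score = 0.0
        for t in live:
            d, w = postings[t][pos[t]]
            if d == docid:
                score += w
                pos[t] += 1
        examined += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, docid))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, docid))
    return sorted(heap, reverse=True), examined


def top_k_by_value(postings, values, k):
    """Top k by a stored value: every matching document has to be looked at."""
    matched = set()
    for pl in postings.values():
        matched.update(d for d, _ in pl)
    return heapq.nsmallest(k, matched, key=values.get), len(matched)


# One rare, heavily weighted term and one common, lightly weighted term.
postings = {
    "rare": [(d, 5.0) for d in range(1, 4)],
    "common": [(d, 0.5) for d in range(1, 1001)],
}
values = {d: 1000 - d for d in range(1, 1001)}   # hypothetical sort key

rel_hits, rel_examined = top_k_relevance(postings, 3)
val_hits, val_examined = top_k_by_value(postings, values, 3)
print("by relevance: examined", rel_examined, "documents")
print("by value:     examined", val_examined, "documents")
```

In this model the relevance run stops as soon as the rare term's list runs
out, because the common term alone cannot beat the current top three, while
the value-sorted run has to visit every match.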
Cheers,
Olly