[Xapian-discuss] Search performance issues and profiling/debugging
James Aylett
james-xapian at tartarus.org
Wed Oct 24 23:09:34 BST 2007
On Wed, Oct 24, 2007 at 11:04:01PM +0200, Ron Kass wrote:
> Sorry, seems I forgot to paste the statistics for the 100 consecutive
> runs we did on the 'no recip' search..
> Here it is
>
> Max : 40.845
> Min : 0.973
> Average : 1.739141414
> StDev : 4.161330613
Ewww, nasty. If this is Xapian's fault, there's something really crazy
going on.
Okay, some quick observations. (Without sitting down at your system
it's difficult to do more.)
* seems to take ten cycles to settle down; this feels really high to
me, but apart from needing a manual warm-up on your system after a
rebuild, it shouldn't actually matter all that much (I do wonder
what the VM usage for the non-leaf blocks in your b-trees comes to,
though).
* there is a vaguely cyclic issue; I'd be inclined to look for
sweeper 'scripts' in the OS, something screwy in the FS or VM
layer, or (again) maybe something in the virtualisation layer.
Something you could try is to put a 10 second sleep between each
run and do it again. If the period changes, it's most likely a
timed system thing. I don't think it is, though.
* the cycles, if that's what they are, seem to become longer,
suggesting (to me) that the VM system is getting better at
understanding your load.
* you're hit by a random spike which causes most of the damage about
halfway through the test; remove that and your sd becomes
reasonable (although not great, because of the warm-up period;
remove that as well and the sd becomes pretty sane).
* I don't like the idea of ~1 second for the mset to build, but
without knowing a lot more about your system I have no idea where
to attack this from (except that profiling will help - so annoying
it isn't working for you :-( ).
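
To make a couple of the points above concrete (the 10 second sleep between
runs, and what the sd looks like once the warm-up and the big spike are
removed), here is a rough Python sketch. The names time_runs and summarise
are made up for illustration, search stands in for whatever callable runs
your query, and I'm assuming a sample standard deviation, which may not be
what produced the StDev figure quoted above.

```python
import statistics
import time


def time_runs(search, runs=100, pause=0.0):
    """Call search() `runs` times, sleeping `pause` seconds between calls.

    Returns the wall-clock duration of each call, in seconds.
    """
    timings = []
    for i in range(runs):
        start = time.perf_counter()
        search()
        timings.append(time.perf_counter() - start)
        # No need to sleep after the final run.
        if pause and i < runs - 1:
            time.sleep(pause)
    return timings


def summarise(timings, warmup=0, drop_spikes=0):
    """Max/min/mean/sample SD of the timings.

    The first `warmup` runs are discarded, then the `drop_spikes`
    slowest of the remaining runs.
    """
    kept = sorted(timings[warmup:])
    if drop_spikes:
        kept = kept[:-drop_spikes]
    return {
        "max": max(kept),
        "min": min(kept),
        "mean": statistics.mean(kept),
        "stdev": statistics.stdev(kept),
    }
```

Something like time_runs(run_query, runs=100, pause=10.0) gives the
"sleep between runs" experiment (run_query being your own function), and
comparing summarise(timings) with summarise(timings, warmup=10,
drop_spikes=1) shows how much of the sd comes from the warm-up and the
single worst run.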
It'd be interesting to see various system usage stats against this. If
you could get something like cacti polling with a subsecond gap there
might be some interesting stuff you could learn about what's causing
the spikes. I've seen semi-regular yet initially inexplicable spikes
in other applications (under real rather than simulated load) that
generally came down to the interactions with the FS layer (which would
make sense here, since to a first approximation that's all you're
actually using in your OS during the tests).
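
If setting up cacti is too much bother, a crude stand-in (not cacti, just
a hand-rolled sampler) is sketched below in Python. It assumes a Linux
box where /proc/vmstat exposes the pgpgin, pgpgout and pgmajfault
counters; the function names are mine, and the 0.25 second interval is
an arbitrary choice.

```python
import time


def read_vmstat(path="/proc/vmstat",
                fields=("pgpgin", "pgpgout", "pgmajfault")):
    """Return the selected cumulative counters from /proc/vmstat as ints."""
    wanted = set(fields)
    counters = {}
    with open(path) as fh:
        for line in fh:
            parts = line.split()
            if len(parts) == 2 and parts[0] in wanted:
                counters[parts[0]] = int(parts[1])
    return counters


def sample(reader, duration, interval=0.25):
    """Call reader() every `interval` seconds for `duration` seconds.

    Returns a list of (seconds_since_start, reading) tuples.
    """
    samples = []
    start = time.monotonic()
    next_tick = start
    while True:
        now = time.monotonic()
        if now - start >= duration:
            break
        samples.append((now - start, reader()))
        # Schedule against the start time so the ticks don't drift.
        next_tick += interval
        time.sleep(max(0.0, next_tick - time.monotonic()))
    return samples


def deltas(samples):
    """Turn cumulative counter readings into per-interval increases."""
    out = []
    for (_, prev), (t, cur) in zip(samples, samples[1:]):
        out.append((t, {k: cur[k] - prev[k] for k in cur if k in prev}))
    return out
```

Run deltas(sample(read_vmstat, duration=...)) in a second process while
the searches are going, and line the timestamps up against the slow
runs; a jump in page-ins or major faults at the same moment as a spike
would point at the FS/VM layer.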
J
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org