[Xapian-discuss] Search performance issues and profiling/debugging search?
Ron Kass
ron at pidgintech.com
Wed Oct 24 13:49:05 BST 2007
Hi Olly.
We've installed both valgrind and oprofile, and managed to get valgrind
working. oprofile is not working yet (see the error below); we're still
working on it, but any insight you have would be helpful.
Anyway, we have actually run xapian-compress on the databases to see if
it helps. It appears to have got rid of the segmentation fault on
database 10, but the slowness and the variations in estimates still exist.
I'm linking to two outputs, one from valgrind and one from valgrind --leak-check=full:
http://www.pidgintech.com/other/fts/test/valgrind_2-4-6-7-8-9-10.txt
http://www.pidgintech.com/other/fts/test/valgrind-full_2-4-6-7-8-9-10.txt
And here is the error we are getting when trying to start the oprofile daemon:
Oct 24 01:21:27 fts1 kernel: BUG: soft lockup detected on CPU#3!
Oct 24 01:21:27 fts1 kernel: [<c043ea0f>] softlockup_tick+0x98/0xa6
Oct 24 01:21:27 fts1 kernel: [<c0408b7d>] timer_interrupt+0x504/0x557
Oct 24 01:21:27 fts1 kernel: [<c043ec43>] handle_IRQ_event+0x27/0x51
Oct 24 01:21:27 fts1 kernel: [<c043ed00>] __do_IRQ+0x93/0xe8
Oct 24 01:21:27 fts1 kernel: [<c040672b>] do_IRQ+0x93/0xae
Oct 24 01:21:27 fts1 kernel: [<c053a04d>] evtchn_do_upcall+0x64/0x9b
Oct 24 01:21:27 fts1 kernel: [<c0404ec5>] hypervisor_callback+0x3d/0x48
Oct 24 01:21:27 fts1 kernel: [<c0407fd1>] raw_safe_halt+0x8c/0xaf
Oct 24 01:21:27 fts1 kernel: [<c0402bca>] xen_idle+0x22/0x2e
Oct 24 01:21:27 fts1 kernel: [<c0402ce9>] cpu_idle+0x91/0xab
Oct 24 01:21:27 fts1 kernel: =======================
Best regards,
Ron
Olly Betts wrote:
> On Wed, Oct 24, 2007 at 04:04:07AM +0200, Ron Kass wrote:
>
>> I don't know anything about oprofile yet, will have to dig deeper there.
>> (any pointers would be handy)
>>
>
> I've written a quick guide (something I've been meaning to do for a
> while):
>
> http://wiki.xapian.org/ProfilingXapian
>
>
>> What was not clear from your answers is whether it makes sense that
>> second (and third) time searches take that long.
>>
>
> It suggests that the speed isn't limited by I/O but rather by CPU.
> Profiling data would show us why and hopefully enable such cases to be
> improved.
>
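One rough way to check the I/O-versus-CPU point above is to compare wall-clock time with CPU time over repeated runs of the same search. What follows is only a minimal Python sketch, assuming a Unix-like system and a test that can be launched as a single command; `time_command` and `report` are illustrative names and not part of Xapian.

```python
import resource
import subprocess
import time


def time_command(argv):
    """Run argv once and return (wall_seconds, cpu_seconds) for the child."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    # The child's stdout is discarded; only the timings are of interest here.
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=False)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User plus system CPU time consumed by the child we just waited for.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu


def report(argv, runs=3):
    """Time several consecutive runs and return one summary line per run."""
    lines = []
    for run in range(1, runs + 1):
        wall, cpu = time_command(argv)
        lines.append(f"run {run}: wall={wall:.2f}s cpu={cpu:.2f}s")
    return lines
```

If CPU time stays close to wall-clock time on the second and third runs, the search is spending its time computing rather than waiting on disk. A first run with wall-clock time far above CPU time would instead point at I/O.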
>
>> First I ran a test on all 8 databases (1, 2, 4, 6, 7, 8, 9 and 10)
>>
>> http://www.pidgintech.com/other/fts/test/test_1-2-4-6-7-8-9-10.txt
>>
>> This resulted in Segmentation fault.
>>
>
> Where does it seg fault? You can find out by running under gdb like so:
>
> gdb --args <the command to run your test>
>
> And then the seg fault should drop you back to the gdb prompt, where
> you can type "bt" to show a backtrace.
>
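The gdb steps above can also be run non-interactively, which makes it easier to save the backtrace and post it to the list. This is a sketch only: it uses gdb's `--batch`, `-ex` and `--args` options, while the helper names and the `./search_test` command in the usage note are hypothetical.

```python
import subprocess


def gdb_backtrace_argv(test_argv):
    """Build a gdb command line that runs test_argv, then prints a backtrace."""
    # --batch exits gdb after the -ex commands; --args passes the rest through.
    return ["gdb", "--batch", "-ex", "run", "-ex", "bt", "--args", *test_argv]


def capture_backtrace(test_argv, out_path):
    """Run the test under gdb and save gdb's combined output to out_path."""
    with open(out_path, "w") as out:
        result = subprocess.run(
            gdb_backtrace_argv(test_argv),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=False,
        )
    return result.returncode
```

For example, `capture_backtrace(["./search_test", "some query"], "backtrace.txt")` would leave the output of `bt` at the end of `backtrace.txt` if the program crashes under gdb.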
>
>> 1. Regarding your stable-sort theory, if it's something we can test, let
>> me know how.
>>
>
> I could provide a patch, but you'd need to be using SVN HEAD because the
> code in this area has changed a lot since 1.0.3.
>
> SVN HEAD is also a lot faster in some cases (more than twice as fast in
> tests replaying tweakers.net query logs against their database). So if
> you're going to profile, it would be a lot more useful to profile that.
>
>
>> 2. What undefined value might anything depend on?
>>
>
> An uninitialised variable somewhere perhaps.
>
>
>> 3. I'm not familiar with the testsuite or valgrind, or with what you
>> are referring to there.
>>
>
> Don't worry about the testsuite. Just try running your test code under
> valgrind:
>
> valgrind <the command to run your test>
>
> It'll run rather more slowly and use quite a bit more memory than
> normal. Any problems valgrind spots are reported to stderr by default.
>
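Since valgrind reports to stderr by default, a long run is easier to review if stderr is written to a file. A minimal Python sketch of that redirection follows; `run_with_stderr_log` is an illustrative helper, and the command and file name in the usage note are hypothetical.

```python
import subprocess


def run_with_stderr_log(argv, log_path):
    """Run argv, writing everything it prints on stderr to log_path."""
    with open(log_path, "w") as log:
        # stdout is left alone, so the test's normal output still appears.
        result = subprocess.run(argv, stderr=log, check=False)
    return result.returncode
```

For example, `run_with_stderr_log(["valgrind", "./search_test", "some query"], "valgrind.txt")` would collect valgrind's messages in `valgrind.txt`. Any stderr output from the test program itself lands in the same file.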
> Cheers,
> Olly
>