[Xapian-discuss] xapian performance
Fernando Nemec
fernando.nemec at folha.com.br
Tue Nov 14 16:55:27 GMT 2006
Hi Olly,
> On Mon, Nov 13, 2006 at 12:12:51PM -0200, Fernando Nemec wrote:
>> Hi all, does anyone know of tweaks for compiling Xapian under
>> Intel/Linux that give even a small performance increase?
> I'm not really aware of any compile time magic that can be worked.
> If there was a magic GCC switch we'd be turning it on by default!
> For a large search system, it's the I/O which dominates so getting
> the compiler to generate better code is probably less important than it
> is for something like a video codec.
Well, thanks anyway. I'm going crazy trying to figure out how to make
things in Xapian faster.
At the moment I am building a large index on a 3 GHz Pentium Dual Core
with 1 GB of RAM and a SATA2 hard disk. The index has about 1000k
documents, without stemming, with an average size of about 6k.
When I try a search with several phrased words, like this:
"ronaldo jogou somente trinta minutos ontem", the search time goes very
high. This example takes about 30 seconds on the first try. The very
same search took < 700 ms on the first run and < 100 ms thereafter.
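Since the numbers above separate the first run from later runs, here is
a minimal sketch of how such timings could be collected so that the
cold first run and the warm repeat runs are reported separately. The
names time_runs and summarize are hypothetical helpers, not part of
Xapian, and the search argument stands for whatever callable actually
runs the phrase query.

```python
import time


def time_runs(search, runs=3):
    """Call search() `runs` times and return the elapsed seconds per call."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        search()  # hypothetical callable that executes the query
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Split timings into the first (likely cold-cache) run and the rest."""
    first, rest = timings[0], timings[1:]
    warm_avg = sum(rest) / len(rest) if rest else None
    return {"first_run_s": first, "warm_avg_s": warm_avg}
```

A large gap between first_run_s and warm_avg_s would suggest that the
first run is dominated by disk reads rather than by CPU time.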
I did all the tweaks you taught me, plus the ones I read in the list
archive, but the execution time is still high when using phrased words.
Well, I'm going to try the ones you listed in this mail. Thanks!
Fernando
> There will undoubtedly be some hotspots still, so you could try using
> GCC's -fprofile-generate and -fprofile-use. But I wouldn't expect
> miracles.
> If you have an x86_64 chip, compiling the code as 64 bit rather than 32
> bit will probably be faster.
> At runtime, if you've plenty of memory, setting the environment
> variable XAPIAN_FLUSH_THRESHOLD higher than the default of 10000 can
> speed up indexing a lot.
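As a rough illustration of the XAPIAN_FLUSH_THRESHOLD suggestion, the
sketch below builds an environment for an indexing process with the
variable raised. The helper name indexer_env and the value 100000 are
assumptions chosen for the example; only the variable name and its
default of 10000 come from the text above.

```python
import os


def indexer_env(flush_threshold, base_env=None):
    """Return a copy of the environment with XAPIAN_FLUSH_THRESHOLD set.

    flush_threshold is the number of documents to buffer before a flush;
    a higher value uses more memory during indexing.
    """
    threshold = int(flush_threshold)
    if threshold <= 0:
        raise ValueError("flush_threshold must be a positive integer")
    # Copy so that the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["XAPIAN_FLUSH_THRESHOLD"] = str(threshold)
    return env


# Hypothetical usage; "./my-indexer" is a placeholder command:
#   import subprocess
#   subprocess.run(["./my-indexer", "data/"], env=indexer_env(100000))
```

Passing the variable to a child process means it is already set when
the indexer starts, which avoids any question of when it is read.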
> Once built, running the database through quartzcompact or xapian-compact
> (for flint) will make searches faster.
> You should also make sure that your disk subsystem is set up to be fast.
> Make sure DMA is enabled (if supported), etc.
> There's also still plenty of scope for improving code to speed up
> indexing, and some scope for faster searching too. I know some places
> where better algorithms can be used, but I suspect there are bottlenecks
> in unobvious places too. Profiling to identify these places would be
> a useful activity.
> Cheers,
> Olly
--
[]s
Fernando Nemec
fernando.nemec at folha.com.br
http://www.folha.com.br/