[Xapian-discuss] xapian performance

Fernando Nemec fernando.nemec at folha.com.br
Tue Nov 14 16:55:27 GMT 2006


Hi Olly,

> On Mon, Nov 13, 2006 at 12:12:51PM -0200, Fernando Nemec wrote:
>> Hi all, does anyone know of any tweaks for compiling Xapian under
>> Intel/Linux that give even a small performance increase?

> I'm not really aware of any compile time magic that can be worked.
> If there was a magic GCC switch we'd be turning it on by default!

> For a large search system, it's the I/O which dominates so getting
> the compiler to generate better code is probably less important than it
> is for something like a video codec.

Well, thanks anyway. I'm going crazy trying to figure out how to make
things faster in Xapian.

At the moment I am building a large index on a 3 GHz dual-core Pentium
with 1 GB RAM and a SATA2 hard disk. The index holds about 1000k
documents, without stemming, with an average document size of about 6k.

When I try a search with several phrased words, like this:
"ronaldo jogou somente trinta minutos ontem", the search time goes very
high. This example takes about 30 seconds on the first try. The very
same search took < 700 ms on the first run and < 100 ms on later runs.
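
To keep first-run and later-run timings like the ones above comparable,
they can be collected with a small harness. This is only a minimal
sketch: `run_query` is a hypothetical stand-in for whatever actually
performs the search, and `time_runs` and `summarize` are illustrative
names, not part of Xapian.

```python
import time
from typing import Callable, List


def time_runs(run_query: Callable[[], object], runs: int = 5) -> List[float]:
    """Return the wall-clock time of each run in milliseconds, in order."""
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()  # hypothetical stand-in for the real search call
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


def summarize(timings_ms: List[float]) -> dict:
    """Separate the first run from the average of the later runs.

    Expects at least one timing.
    """
    first, rest = timings_ms[0], timings_ms[1:]
    return {
        "first_ms": first,
        "later_avg_ms": sum(rest) / len(rest) if rest else None,
    }
```

Repeating a query inside one process does not empty the operating
system's disk cache, so only the first run after a fresh start is likely
to include the full I/O cost.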

I applied all the tweaks you taught me, plus the ones I found in the
list archive, but the execution time is still high when using phrased
words.

Well, I'm going to try the ones you list in this mail. Thanks!

Fernando



> There will undoubtedly be some hotspots still, so you could try using
> GCC's -fprofile-generate and -fprofile-use.  But I wouldn't expect
> miracles.

> If you have an x86_64 chip, compiling the code as 64 bit rather than 32
> bit will probably be faster.

> At runtime, if you've plenty of memory, setting env. variable
> XAPIAN_FLUSH_THRESHOLD higher than the default of 10000 can speed up
> indexing a lot.
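
One way to apply the XAPIAN_FLUSH_THRESHOLD advice above without changing
the whole shell session is to set the variable only for the indexing
process. The following is a minimal sketch under assumptions: the helper
name `indexer_env` is hypothetical, and the right threshold depends on
how much memory is available.

```python
import os
from typing import Dict, Optional


def indexer_env(flush_threshold: int,
                base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with XAPIAN_FLUSH_THRESHOLD set.

    The base environment (os.environ by default) is left unmodified.
    """
    if flush_threshold <= 0:
        raise ValueError("flush_threshold must be a positive integer")
    env = dict(os.environ if base_env is None else base_env)
    env["XAPIAN_FLUSH_THRESHOLD"] = str(flush_threshold)
    return env
```

The returned mapping can be passed as the `env` argument of
`subprocess.run` when launching the indexing program, for example
`indexer_env(50000)`; the value 50000 is purely illustrative.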

> Once built, running the database through quartzcompact or xapian-compact
> (for flint) will make searches faster.

> You should also make sure that your disk subsystem is set up to be fast.
> Make sure DMA is enabled (if supported), etc.

> There's also still plenty of scope for improving code to speed up
> indexing, and some scope for faster searching too.  I know some places
> where better algorithms can be used, but I suspect there are bottlenecks
> in unobvious places too.  Profiling to identify these places would be
> a useful activity.

> Cheers,
>     Olly

--
[]s
Fernando Nemec
fernando.nemec at folha.com.br
http://www.folha.com.br/




