[Xapian-discuss] xapian performance
Fernando Nemec
fernando.nemec at folha.com.br
Wed Nov 22 21:49:59 GMT 2006
Hi Olly,
I don't know if this is relevant, but it may be. For this query:
<!--Xapian::Query((presidente PHRASE 2 lula))-->
the cache does not seem to have any effect at all. Even if I run the
exact same query seconds later, the search time is still high and almost
the same.
This behavior doesn't occur with this query:
<!--Xapian::Query((governo PHRASE 6 do PHRASE 6 estado PHRASE 6 de PHRASE 6 sao PHRASE 6 paulo))-->
When I run the same query seconds later, the search time is greatly
reduced (to just a few seconds).
Both queries return almost 190000 documents, in a database with
1050000 documents.
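In case it helps to make the cold/warm comparison repeatable, here is a minimal sketch of how the repeated runs could be timed. It is only an illustration: `time_repeated` and `warm_speedup` are hypothetical helper names, not part of Xapian, and the query is passed in as a plain callable so the sketch does not depend on any particular binding.

```python
import time
from typing import Callable, List


def time_repeated(run_query: Callable[[], object], repeats: int = 3) -> List[float]:
    """Run the same query several times; return wall-clock seconds per run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # the first call is the "cold" run, later calls are "warm"
        timings.append(time.perf_counter() - start)
    return timings


def warm_speedup(timings: List[float]) -> float:
    """Ratio of the first (cold) run to the fastest later (warm) run."""
    if len(timings) < 2:
        raise ValueError("need at least two runs to compare cold and warm")
    fastest_warm = min(timings[1:])
    if fastest_warm == 0:
        return float("inf")  # warm run too fast to measure
    return timings[0] / fastest_warm
```

With the Xapian Python bindings, and assuming `enquire` is an already configured `xapian.Enquire` object, the callable could be something like `lambda: enquire.get_mset(0, 10)`. A ratio close to 1 would match what I see for the 2-term phrase (the cache is not helping), while a ratio well above 1 would match the 6-term phrase.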
Thanks again,
Nemec
Wednesday, November 22, 2006, 6:55:21 PM, you wrote:
> Hi Olly,
>> Could you compare the speed of phrase searches with this patch:
> Certainly. I logged Query::get_description for each query, along with
> the time taken to get the result set. I ran just three different
> queries: a single term, a 2-word phrase and a 6-word phrase.
> Would it be better to have a larger set of queries, or will this do?
> These timings were taken *without* the experimental phrase optimization patch:
> <!--Xapian::Query(lula)-->
> 0m0.412s
> <!--Xapian::Query((presidente PHRASE 2 lula))-->
> 1m5.062s
> <!--Xapian::Query((governo PHRASE 6 do PHRASE 6 estado PHRASE 6 de PHRASE 6 sao PHRASE 6 paulo))-->
> 1m14.193s
> These timings were taken *with* the phrase optimization patch:
> <!--Xapian::Query(lula)-->
> 0m0.379s
> <!--Xapian::Query((presidente PHRASE 2 lula))-->
> 0m58.514s
> <!--Xapian::Query((governo PHRASE 6 do PHRASE 6 estado PHRASE 6 de PHRASE 6 sao PHRASE 6 paulo))-->
> 1m2.503s
> Thanks for your help, Olly. If there's anything else I can do to help
> fix this issue, please let me know.
> Nemec
> Wednesday, November 22, 2006, 5:19:45 PM, you wrote:
>> On Tue, Nov 21, 2006 at 07:16:52PM -0200, Fernando Nemec wrote:
>>> After so many patches I opted to get a fresh copy of the source from
>>> svn. As far as I can see, you have committed almost all the patches
>>> you produced in the last few days.
>> So far I've only committed the changes to use "my_fls" instead of the
>> floating point log calculation. The changes to open positionlists
>> lazily aren't in yet (I was waiting to check that the latest patch
>> fixed the slowdown for 2 term phrases).
>>> Sadly I didn't notice any improvement. I made a simple list with a
>>> variety of queries, and all of them return in more or less the same
>>> time (a few tens of seconds).
>> The "my_fls" changes should reduce CPU use, so you won't see much
>> improvement if you're heavily I/O bound (which you must be if a search
>> takes tens of seconds).
>>> Is there any information I can supply to help find out what's going
>>> on with phrase searches?
>> Could you compare the speed of phrase searches with this patch:
>> http://www.oligarchy.co.uk/xapian/patches/xapian-experimental-phrase-optimisation-v2.patch
>> with not using it (either on SVN trunk or 0.9.9). Ideally it should
>> speed up phrases with 3 or more terms, but should be just as fast for
>> 2 term phrases.
>> I'm going to look at creating a simple patch to count the number of
>> blocks read from each table during the query, which should help to get a
>> handle on how much I/O we're actually doing in an easily repeatable way.
>> Cheers,
>> Olly
--
[]s
Fernando Nemec
fernando.nemec at folha.com.br
http://www.folha.com.br/