[Xapian-discuss] xapian performance
Fernando Nemec
fernando.nemec at folha.com.br
Thu Nov 23 14:25:27 GMT 2006
Olly,
> And how are you timing?
> If this is "wall-clock" time from the "time" utility/built-in, what are
> the user and system times?
I'm using the time utility. As the user and system times are so low, I
removed them from the message I sent you. At the end of this message I'll
put the queries and all the times.
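Low user and system times next to a long wall-clock time usually mean the process spent most of its life waiting (typically on disk) rather than computing. A minimal sketch of that check, using the CASE 2 figures reported further down in this message; the helper names `parse_time` and `cpu_share` are made up for illustration:

```python
import re

def parse_time(text):
    """Convert a shell `time` value such as '1m33.191s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))

def cpu_share(real, user, sys):
    """Fraction of wall-clock time spent on the CPU (user + sys)."""
    return (parse_time(user) + parse_time(sys)) / parse_time(real)

# CASE 2 figures (real / user / sys) from the report in this message.
share = cpu_share("1m33.191s", "0m3.300s", "0m3.624s")
print(f"CPU share of wall-clock time: {share:.1%}")
```

A share this small suggests the phrase query is dominated by I/O wait, not by CPU work.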
> It's interesting that the first case is sped up (by 8% which is a
> little high to be noise) - the patch shouldn't change non-phrase
> queries at all.
In this particular case the time difference is sometimes up to 12%. I
didn't worry about that because search times < 0.5s are fine for my
application.
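The percentages can be checked against the timings quoted later in this message. The sketch below transcribes those wall-clock times into seconds by hand and computes the relative change; these are single runs, so the results are only indicative, and the dictionary labels are shorthand rather than the exact query strings:

```python
# Wall-clock times in seconds, transcribed from the quoted timings
# (for example 1m5.062s = 65.062 s). Each pair is (without patch, with patch).
timings = {
    "lula": (0.412, 0.379),
    "presidente PHRASE 2 lula": (65.062, 58.514),
    "six-term phrase (governo do estado de sao paulo)": (74.193, 62.503),
}

def speedup_percent(before, after):
    """Relative reduction in run time, as a percentage of the 'before' time."""
    return (before - after) / before * 100.0

for query, (before, after) in timings.items():
    print(f"{query}: {speedup_percent(before, after):.1f}% faster")
```

The single-term query comes out at roughly 8%, matching the figure Olly mentions.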
> Is this SVN HEAD with and without this patch?
> http://www.oligarchy.co.uk/xapian/patches/xapian-experimental-phrase-optimisation-v2.patch
Both. The first block is without the patch and the second block is with the
patch. At the end of this message I'll put just the report made with svn
head _with_ the patch above.
> I think this must mean that we need to read so many disk blocks for
> this query that not many end up cached. I think you said you had 1GB
> of RAM, so there might not be all that much left for caching.
Yes, that's correct.
> What does the "free" command report?
Here is the debug info for each query. It was made with svn head and
with the xapian-experimental-phrase-optimisation-v2 patch. For each case I
added the free command report.
== CASE 1
<!--Xapian::Query(lula)-->
1 blocks read from /local/xapian/newdb/record.
4369 blocks read from /local/xapian/newdb/value.
3 blocks read from /local/xapian/newdb/termlist.
1 blocks read from /local/xapian/newdb/position.
104 blocks read from /local/xapian/newdb/postlist.
real 0m0.429s
user 0m0.396s
sys 0m0.036s
             total       used       free     shared    buffers     cached
Mem:       1034764    1019508      15256          0       3556     980372
-/+ buffers/cache:      35580     999184
Swap:      2097144      13308    2083836
== CASE 2
<!--Xapian::Query((presidente PHRASE 2 lula))-->
1 blocks read from /local/xapian/newdb/record.
3023 blocks read from /local/xapian/newdb/value.
3 blocks read from /local/xapian/newdb/termlist.
153036 blocks read from /local/xapian/newdb/position.
380 blocks read from /local/xapian/newdb/postlist.
real 1m33.191s
user 0m3.300s
sys 0m3.624s
             total       used       free     shared    buffers     cached
Mem:       1034764    1021384      13380          0       3492     982248
-/+ buffers/cache:      35644     999120
Swap:      2097144      13308    2083836
== CASE 3
<!--Xapian::Query((governo PHRASE 6 do PHRASE 6 estado PHRASE 6 de PHRASE 6 sao PHRASE 6 paulo))-->
1 blocks read from /local/xapian/newdb/record.
1712 blocks read from /local/xapian/newdb/value.
3 blocks read from /local/xapian/newdb/termlist.
58136 blocks read from /local/xapian/newdb/position.
4141 blocks read from /local/xapian/newdb/postlist.
real 1m7.275s
user 0m1.556s
sys 0m2.484s
             total       used       free     shared    buffers     cached
Mem:       1034764    1020136      14628          0       4336     980640
-/+ buffers/cache:      35160     999604
Swap:      2097144      13308    2083836
== CASE 4
<!--Xapian::Query((presidente PHRASE 2 luiz))-->
1 blocks read from /local/xapian/newdb/record.
3628 blocks read from /local/xapian/newdb/value.
3 blocks read from /local/xapian/newdb/termlist.
143663 blocks read from /local/xapian/newdb/position.
407 blocks read from /local/xapian/newdb/postlist.
real 1m16.068s
user 0m2.820s
sys 0m3.580s
             total       used       free     shared    buffers     cached
Mem:       1034764    1019752      15012          0       4016     980608
-/+ buffers/cache:      35128     999636
Swap:      2097144      13308    2083836
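To relate the block counts to the amount of RAM, the report lines can be parsed and converted to a data volume. This is only a rough sketch: it assumes 8 KiB blocks (believed to be the flint default, but the block size of this database is not stated in the message), and it treats every reported read as a distinct block, which may overcount if the same block is read more than once. The function name `blocks_by_table` is made up for illustration.

```python
import re

# ASSUMPTION: 8 KiB per block (believed to be flint's default block size);
# the message does not state the block size used for this database.
BLOCK_SIZE = 8192
# "Mem: total" from the free output above, in KiB.
RAM_KIB = 1034764

# Matches lines such as "153036 blocks read from /path/to/db/position."
LINE = re.compile(r"(\d+) blocks read from (\S+?)\.?$")

def blocks_by_table(report):
    """Map table name (last path component) to the number of blocks read."""
    counts = {}
    for line in report.splitlines():
        match = LINE.match(line.strip())
        if match:
            table = match.group(2).rsplit("/", 1)[-1]
            counts[table] = int(match.group(1))
    return counts

# The CASE 2 report, copied from above.
case2 = """\
1 blocks read from /local/xapian/newdb/record.
3023 blocks read from /local/xapian/newdb/value.
3 blocks read from /local/xapian/newdb/termlist.
153036 blocks read from /local/xapian/newdb/position.
380 blocks read from /local/xapian/newdb/postlist.
"""

counts = blocks_by_table(case2)
total_kib = sum(counts.values()) * BLOCK_SIZE // 1024
print(counts)
print(f"about {total_kib} KiB read, {total_kib / RAM_KIB:.0%} of total RAM")
```

Under these assumptions the CASE 2 query reads more data than the machine has RAM, almost all of it from the position table, which would be consistent with a repeated run not benefiting from the disk cache.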
Thanks again for your help, Olly,
Nemec
Wednesday, November 22, 2006, 9:31:35 PM, you wrote:
> On Wed, Nov 22, 2006 at 06:55:21PM -0200, Fernando Nemec wrote:
>> Do you think it's better to have a large set of queries, or will this do
>> fine?
> The effects will depend on the queries, but Arjen has already tested a
> larger set so I was mostly hoping you could confirm there was no
> regression for the two term case.
>> This was made *without* the experimental phrase optimization patch:
>>
>> <!--Xapian::Query(lula)-->
>> 0m0.412s
>> <!--Xapian::Query((presidente PHRASE 2 lula))-->
>> 1m5.062s
>> <!--Xapian::Query((governo PHRASE 6 do PHRASE 6 estado PHRASE 6 de PHRASE 6 sao PHRASE 6 paulo))-->
>> 1m14.193s
>>
>> This was made *with* the phrase optimization patch:
>>
>> <!--Xapian::Query(lula)-->
>> 0m0.379s
>> <!--Xapian::Query((presidente PHRASE 2 lula))-->
>> 0m58.514s
>> <!--Xapian::Query((governo PHRASE 6 do PHRASE 6 estado PHRASE 6 de PHRASE 6 sao PHRASE 6 paulo))-->
>> 1m2.503s
> It's interesting that the first case is sped up (by 8% which is a little
> high to be noise) - the patch shouldn't change non-phrase queries at
> all. Is this SVN HEAD with and without this patch?
> http://www.oligarchy.co.uk/xapian/patches/xapian-experimental-phrase-optimisation-v2.patch
> Are you timing Omega? If so, did you try removing $topterms from your
> query template?
> And how are you timing?
> If this is "wall-clock" time from the "time" utility/built-in, what are
> the user and system times?
>> I don't know if this is relevant, but maybe it is. On this query
>>
>> <!--Xapian::Query((presidente PHRASE 2 lula))-->
>>
>> the cache does not seem to affect the search time at all. Even if I run
>> the exact same query seconds later, the search time is high and almost
>> the same.
> I think this must mean that we need to read so many disk blocks for
> this query that not many end up cached. I think you said you had 1GB
> of RAM, so there might not be all that much left for caching.
> What does the "free" command report?
>> If there's anything else I can do to help to fix this issue, please
>> let me know.
> It would be interesting to try measuring just how many blocks we
> actually read - this will be a repeatable measure, whereas timings
> from cold disk cache are much harder to exactly repeat. Try applying
> this patch:
> http://www.oligarchy.co.uk/xapian/patches/flint-count-read-blocks.patch
> This reports the number of blocks read from each table of each flint
> database to stderr (the report happens whenever a database is closed).
> Cheers,
> Olly
--
[]s
Fernando Nemec
fernando.nemec at folha.com.br
http://www.folha.com.br/