[Xapian-discuss] Flint Backend
Arjen van der Meijden
acmmailing at tweakers.net
Tue Jun 28 20:32:46 BST 2005
Here is the output of quartzcompact:
postlist: Reduced by 0.537767% 7504K (1395400K -> 1387896K)
record: Reduced by 0.311498% 544K (174640K -> 174096K)
termlist: Reduced by 0.423684% 5168K (1219776K -> 1214608K)
position: Reduced by 0.0208535% 1512K (7250576K -> 7249064K)
value: Reduced by 0.0307314% 16K (52064K -> 52048K)
I started from the "quartzcompact 0.8.4 + zlib" database; the output above
is from quartzcompact 0.9.1-svn run with -F, also with zlib.
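
For anyone who wants to check the figures, the percentages above can be recomputed from the before/after sizes. This is only a small sketch: the sizes are copied from the quartzcompact output in this mail, and the function name and output format are my own choices, not anything quartzcompact itself provides.

```python
# Table sizes in K as printed by quartzcompact above: (before, after).
tables = {
    "postlist": (1395400, 1387896),
    "record": (174640, 174096),
    "termlist": (1219776, 1214608),
    "position": (7250576, 7249064),
    "value": (52064, 52048),
}


def reduction(before_k, after_k):
    """Return (K saved, percentage saved relative to the original size)."""
    saved = before_k - after_k
    return saved, 100.0 * saved / before_k


for name, (before, after) in tables.items():
    saved, pct = reduction(before, after)
    # The layout only imitates the quartzcompact lines; it is not its real code.
    print(f"{name}: Reduced by {pct:.6g}% {saved}K ({before}K -> {after}K)")
```

The savings are all well under 1%, which is what one would expect when the input has already been compacted once.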
At the moment I'm running benchmarks on all the databases I created
earlier. When I have the results, I'll send them to the list as well.
Best regards,
Arjen
On 27-6-2005 0:45, Olly Betts wrote:
> On Sun, Jun 26, 2005 at 10:43:32AM +0200, Arjen van der Meijden wrote:
>
>>          Qz          Qz 084 gz   Qz -nF gz
>>Position  7424589824  7424589824  7432200192
>>Postlist  1708957696  1428889600  1535426560
>>Record     254222336   178831360   179888128
>>Termlist  1770250240  1249050624  1395597312
>>Value       61317120    53313536    53313536
>
>
> I think "-nF" is probably larger because of the "-n". Can you try with
> just "-F"?
>
>
>>Here are the xapian-compact results for the flint database. Here -n -F
>>and -F produced exactly the same table sizes, but they were smaller than
>>the original compaction attempt. Please note that the position table is
>>larger than in the quartz compacted cases.
>>
>>          Flint       Flint -nF/-F
>>Position  7452794880  7451574272
>>Postlist  1644240896  1634279424
>>Record     255377408   254418944
>>Termlist  1772339200  1764106240
>>Value       62177280    62177280
>
>
> OK, so comparing against the non-zlib, we're a bit better for postlist,
> and a bit worse for record/termlist/value. I suspect that's mostly
> down to the longer keys, which will be resolved when I replace the Btree
> manager (I'm going to make the key compare a virtual function which can
> be different for each table, rather than having to encode the keys in
> such a way that the byte contents compare in the desired order).
>
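
To make the trade-off in the paragraph above concrete, here is a rough sketch. It is an illustration only and not the actual Quartz or Flint key format: the key layouts, the NUL separator, and all function names are assumptions made up for this example. It shows that a fixed-width big-endian docid sorts correctly under plain byte comparison but costs extra bytes, while a shorter variable-length docid needs a table-specific compare function to sort correctly.

```python
import functools
import struct


def encode_sortable(term: bytes, docid: int) -> bytes:
    """Hypothetical key: term, NUL, docid as fixed 4-byte big-endian.

    Byte-wise comparison of these keys agrees with (term, docid) order,
    assuming terms contain no NUL byte.
    """
    return term + b"\x00" + struct.pack(">I", docid)


def encode_compact(term: bytes, docid: int) -> bytes:
    """Hypothetical shorter key: docid stored in as few bytes as possible."""
    length = max(1, (docid.bit_length() + 7) // 8)
    return term + b"\x00" + docid.to_bytes(length, "big")


def decode_compact(key: bytes):
    """Split a compact key back into (term, docid)."""
    term, raw = key.split(b"\x00", 1)
    return term, int.from_bytes(raw, "big")


def compare_compact(a: bytes, b: bytes) -> int:
    """Per-table comparison: decode both keys and compare the values."""
    ka, kb = decode_compact(a), decode_compact(b)
    return (ka > kb) - (ka < kb)


docids = [2, 256, 1, 70000]

# Fixed-width keys: an ordinary byte-wise sort gives numeric docid order.
by_bytes = sorted(encode_sortable(b"cat", d) for d in docids)
sortable_order = [struct.unpack(">I", k[-4:])[0] for k in by_bytes]

# Compact keys: a byte-wise sort puts 256 before 2, which is wrong.
compact_keys = [encode_compact(b"cat", d) for d in docids]
bytewise_order = [decode_compact(k)[1] for k in sorted(compact_keys)]

# Compact keys with the custom compare function: correct order again.
custom_order = [
    decode_compact(k)[1]
    for k in sorted(compact_keys, key=functools.cmp_to_key(compare_compact))
]

print(sortable_order, bytewise_order, custom_order)
print(len(encode_sortable(b"cat", 1)), len(encode_compact(b"cat", 1)))
```

In this toy layout the compact key for a small docid is a few bytes shorter, which is the kind of saving a per-table compare function would allow.
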
> It's a shame that the new position table encoding isn't smaller for you.
> I think I might need to look at your data at some point, but I'll try
> some more examples locally first in case it's the one I've been using
> which is atypical.
I had some very unexpected results with the position tables of the
various quartz databases. The uncompacted version was 50% *faster* than
the compacted ones. I've changed the benchmarking, in case it was some
issue with how the data was laid out on disk.
I'll probably have to investigate this a bit more, depending on the
outcome of the current benchmarks.
>>Is reading from the working database, instead of the compacted one, a
>>cause?
>
> Almost certainly - there's probably less to read (though bear in mind
> that the working database will have a number of blocks which aren't
> in use in the current version and these don't need to be read to copy
> it), but more to the point a database which is compact with revision
> 1 (like the ones quartzcompact and xapian-compact produce) is more
> efficient to read and iterate over.
I'll test this on the unloaded machine as well sometime soon.
Best regards,
Arjen