[Xapian-discuss] Compressed Btrees

Olly Betts olly at survex.com
Sun Dec 12 22:44:54 GMT 2004


On Sun, Dec 12, 2004 at 07:55:32PM +0000, Olly Betts wrote:
> Interpolative coding will work much better for the positions anyway.
> I've written code to calculate the compression achievable for a given
> position list, though I've not written actual compression and
> decompression code yet.  I'll try to sort out something to let you see
> how well they'll compress at least.

OK, replace bin/quartzdump.cc in your xapian-core tree with the attached
file and rebuild quartzdump (this is an easy way to get the right files
included and linked!)

Then run it with the database directory as the argument.

It'll report the current total tag size and what the new total tag size
would be.  I get just over 30% compression in most cases.

It also reports a theoretical max compression, which in practice we
usually exceed by a little, because one of the assumptions behind the
theoretical value doesn't hold.
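
To give a feel for how a size can be worked out without writing a real
encoder, here's a minimal sketch of counting the bits binary interpolative
coding would need for one sorted position list.  This is NOT the code in
the attached quartzdump.cc.  The function names are made up for
illustration, and it assumes the list is strictly increasing, that the
bounds lo and hi are known to the decoder, and that each value is written
with a plain ceil(log2(range)) bit binary code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bits for one value out of n equally likely choices with a plain binary
// code: ceil(log2(n)), which is 0 when only one value is possible.
static unsigned bits_for_choices(std::uint64_t n) {
    assert(n >= 1);
    unsigned bits = 0;
    for (std::uint64_t v = n - 1; v != 0; v >>= 1) ++bits;
    return bits;
}

// Cost in bits of pos[first..last), given every entry lies in [lo, hi].
// (Hypothetical helper, not part of Xapian.)
static std::uint64_t interp_bits(const std::vector<std::uint32_t>& pos,
                                 std::size_t first, std::size_t last,
                                 std::int64_t lo, std::int64_t hi) {
    if (first >= last) return 0;
    const std::size_t mid = first + (last - first) / 2;
    const std::int64_t value = pos[mid];
    // The middle entry must leave room for the entries on either side of
    // it, which narrows the range it can possibly fall in.
    const std::int64_t min_v = lo + static_cast<std::int64_t>(mid - first);
    const std::int64_t max_v = hi - static_cast<std::int64_t>(last - 1 - mid);
    assert(min_v <= value && value <= max_v);
    std::uint64_t cost =
        bits_for_choices(static_cast<std::uint64_t>(max_v - min_v) + 1);
    // Each half is then coded inside the range the middle entry leaves it.
    cost += interp_bits(pos, first, mid, lo, value - 1);
    cost += interp_bits(pos, mid + 1, last, value + 1, hi);
    return cost;
}

// Total bits for a whole position list whose entries lie in [lo, hi].
// (Hypothetical entry point, not part of Xapian.)
std::uint64_t interpolative_bits(const std::vector<std::uint32_t>& pos,
                                 std::uint32_t lo, std::uint32_t hi) {
    return interp_bits(pos, 0, pos.size(), lo, hi);
}
```

With this scheme a completely dense run, such as positions 1,2,3,4 known
to lie in 1..4, costs no bits at all, since every entry is forced by its
neighbours.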

Cheers,
    Olly
-------------- next part --------------
A non-text attachment was scrubbed...
Name: quartzdump.cc
Type: text/x-c++src
Size: 4617 bytes
Desc: not available
Url : http://lists.tartarus.org/pipermail/xapian-discuss/attachments/20041212/399ccc41/quartzdump.bin