[Xapian-discuss] Compressed Btrees

Arjen van der Meijden arjen at glas.its.tudelft.nl
Thu Dec 9 09:43:39 GMT 2004


On 9-12-2004 2:05, Olly Betts wrote:
> I've been working on allowing Btree tags to be compressed using zlib,
> and it's now working sufficiently well that some external testing would
> be useful (a patch to xapian-core 0.8.4.0 is attached).

I'll see if I can test it today.

> So you can experiment, the patch looks for two files per Btree to decide
> whether to compress and how to compress.  For the record btree (which
> currently uses record_DB, record_baseA, and record_baseB) these files
> are record_compress and record_compress_strategy.
> 
> If record_compress exists, then quartz will use zlib to try to compress
> any tag added to the record table which is more than 4 bytes long.
> 
> If record_compress isn't empty, then the contents are used as a dictionary
> to seed compression.  So for the record table, putting in a typical record
> will improve the compression achieved.

How does one find a 'typical record' and in what format should it be 
entered?
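
For reference, here is my rough understanding of how a zlib preset
dictionary behaves, sketched in Python rather than taken from the patch.
The sample record, the tag contents, and the helper names compress_tag
and decompress_tag are all made up for illustration. I'm assuming the
file would simply hold the raw bytes of one representative record.

```python
import zlib

# Hypothetical "typical record" used as the preset dictionary. The field
# layout is an assumption for illustration, not a format the patch requires.
dictionary = b"url=http://www.example.com/ title= author= date=2004-12-09"


def compress_tag(tag: bytes, zdict: bytes = b"") -> bytes:
    """Deflate a tag, optionally seeding zlib with a preset dictionary."""
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(tag) + comp.flush()


def decompress_tag(data: bytes, zdict: bytes = b"") -> bytes:
    """Inflate a tag; the same dictionary must be supplied if one was used."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(data) + decomp.flush()


# A made-up tag that shares substrings with the dictionary.
tag = b"url=http://www.example.com/page title=Hello author=me date=2004-12-09"

plain = compress_tag(tag)                # no dictionary
seeded = compress_tag(tag, dictionary)   # seeded with the sample record

print(len(tag), len(plain), len(seeded))
```

If that is right, the dictionary only helps when real records share
substrings with it, and the same bytes must be available when reading
the table back.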

> If you've got any large databases, I'd be very interested to see what results
> you can get, and also speed tests for building and searching with compression
> on.

I'll test it with our database, using your hybrid settings. Perhaps 
position_DB is another good candidate to run in filtered mode?

Best regards,

Arjen van der Meijden


