[Xapian-discuss] Incremental updates and disk space ...

Olly Betts olly at survex.com
Tue Aug 30 14:15:22 BST 2011


On Mon, Aug 29, 2011 at 02:19:34PM +0200, Marinos Yannikos wrote:
> we've been using Xapian in production for several months now and update  
> our (chert) databases continuously. A freshly generated index occupies  
> only around ~35% of the disk space compared to what it becomes after a  
> few days. This is not a huge concern (we use SSDs), but I've been  
> wondering whether there is a way to fine-tune this (other than  
> recreating the index frequently), so that less disk space is wasted or  
> it degrades a little slower.

You can use xapian-compact to make a copy of a database with free space
reclaimed.
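
If you want to script that, here is a minimal Python sketch which wraps the
command-line tool and reports the on-disk size before and after.  It only
relies on the basic "xapian-compact SOURCE DESTINATION" invocation; the
helper names (compact_command, dir_size, compact) are made up for
illustration, and it assumes xapian-compact is on your PATH.

```python
import shutil
import subprocess
from pathlib import Path


def compact_command(source, destination):
    """Build the argument list for: xapian-compact SOURCE DESTINATION."""
    return ["xapian-compact", str(source), str(destination)]


def dir_size(path):
    """Total size in bytes of all regular files under path."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def compact(source, destination):
    """Run xapian-compact and return (size_before, size_after) in bytes.

    The source database is left untouched; the compacted copy is written
    to destination.
    """
    if shutil.which("xapian-compact") is None:
        raise RuntimeError("xapian-compact not found on PATH")
    subprocess.run(compact_command(source, destination), check=True)
    return dir_size(source), dir_size(destination)
```

Calling compact("db", "db.compact") would leave "db" as it is and write the
compacted copy to "db.compact", so you need to switch over to the copy
yourself once it is complete.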

But your size difference sounds unusual.  In normal use, you should get
~75% block utilisation for random insertions, and close to 100%
utilisation for linear updates.  That doesn't take into account blocks
which were used in the previous revision and are now awaiting reuse,
but unless your updates between (automatic or explicit) commits are
changing most of the database, that shouldn't lead to only ~35% of the
space actually being used.
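
To put rough numbers on that, here is a small Python sketch of the
arithmetic.  It assumes a freshly built index is close to 100% utilised
(that is my assumption for the sake of the sum, not something I've
measured on your data), and it ignores blocks awaiting reuse.  The
function names are just for illustration.

```python
def implied_utilisation(fresh_size, current_size, fresh_utilisation=1.0):
    """Block utilisation implied by how much the database has grown.

    fresh_size and current_size can be in any unit, as long as both use
    the same one.  fresh_utilisation is the assumed utilisation of the
    freshly built index (1.0 means fully packed blocks).
    """
    return fresh_utilisation * fresh_size / current_size


def expected_size(fresh_size, utilisation, fresh_utilisation=1.0):
    """Size the same data should occupy at the given block utilisation."""
    return fresh_size * fresh_utilisation / utilisation


# A fresh index that is 35% of the size of the updated one:
observed = implied_utilisation(35, 100)

# What the updated database should occupy at ~75% utilisation,
# in the same units (here, percent of the observed updated size):
typical = expected_size(35, 0.75)

print(f"implied utilisation: {observed:.0%}")
print(f"expected size at 75% utilisation: {typical:.1f} (observed: 100)")
```

So under those assumptions the updated database would be a bit under half
its observed size if you were getting typical utilisation for random
insertions.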

Are you deleting a lot of documents?

Or is there something else which might be unusual about your update
patterns?

Cheers,
    Olly


