[Xapian-discuss] Improving indexing speed

James Aylett james-xapian at tartarus.org
Wed Jul 2 01:49:28 BST 2008

On Tue, Jul 01, 2008 at 02:47:12PM -0700, Robert Kaye wrote:

> > Yes, it's always interesting to hear performance reports.
> Ok, I've tinkered with the setup a bit. I've found that if I give  
> xapian loads and loads of RAM, it doesn't even get around to using all  
> the RAM I give it -- at most each process used 5% of 8G of RAM.
> I measured disk access with:
>      iostat -x 10 (10 second disk usage average window)
> And CPU util with top. I've found:
> 3 processes: 95% - 96% CPU usage for each process, 40%-60% disk usage
> 4 processes: 95% - 96% CPU usage for each process, 60%-90% disk usage
> 5 processes: 92% - 94% CPU usage for each process, 80%-100% disk usage
> 6 processes: 91% - 93% CPU usage for each process, 100% disk usage sustained
> It looks like 4 processes is the sweet spot that doesn't utterly slam  
> the machine. This is much better than I had anticipated -- well done  
> Xapian team!

Couple of detail questions:

 * what processor?

 * what OS?

 * how many spindles behind the FS volume?

 * what hard disks?

All hard data is good data, but obviously it's even better if there's
context as well -- apologies if you've already given any of these
details, but I didn't notice them recently in the thread.

(By the way, slamming the disks during indexing is what you want to do
unless you're also searching off the same database. A breakdown of the
type of CPU usage will help analysis here -- iowait versus sys/user
will tell you when you're starting to become IO bound. 4-6 processes
to max out your storage is pretty good :-)
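
A rough sketch of how that breakdown could be collected, assuming a
Linux box (the OS hasn't been confirmed in this thread) where
/proc/stat is readable. The function names are made up for
illustration and none of this is part of Xapian; it just takes two
samples of the aggregate CPU counters and reports the share of time in
each state over the interval:

```python
import time

# Order of the first seven counters on the aggregate "cpu" line of
# /proc/stat (Linux). Later fields such as steal are ignored here.
STATES = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]


def read_cpu_times(path="/proc/stat"):
    """Return the aggregate CPU counters (in clock ticks) as a dict."""
    with open(path) as f:
        fields = f.readline().split()
    # fields[0] is the label "cpu"; the counters follow it.
    return dict(zip(STATES, map(int, fields[1:8])))


def cpu_breakdown(before, after):
    """Percentage of elapsed CPU time per state between two samples."""
    deltas = {state: after[state] - before[state] for state in before}
    total = sum(deltas.values())
    if total == 0:
        return {state: 0.0 for state in deltas}
    return {state: 100.0 * ticks / total for state, ticks in deltas.items()}


def sample_breakdown(interval=10.0):
    """Sample /proc/stat twice, `interval` seconds apart, and compare."""
    first = read_cpu_times()
    time.sleep(interval)
    second = read_cpu_times()
    return cpu_breakdown(first, second)
```

Calling sample_breakdown(10) while the indexers are running gives
numbers over the same 10 second window as the iostat run. If iowait
climbs as processes are added while user and system stay flat or drop,
that suggests the extra processes are mostly waiting on the disks.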


  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org
