[Xapian-discuss] File Descriptors

James Aylett james-xapian at tartarus.org
Mon Jan 31 11:37:11 GMT 2005


On Mon, Jan 31, 2005 at 10:23:43AM +0000, Richard Boulton wrote:

> On Linux I believe there are two file descriptor limits - one is a hard
> limit set by the operating system (but is quite a high value).  The
> other is a lower, per-process limit, which can be changed.  Under bash,
> "ulimit -n" displays the current per-process limit, which may be
> modified if you have sufficient permissions (usually root).

This is a Unix thing, not Linux-specific.

The soft limit can be changed by any process to a value less than or
equal to its hard limit.

Any process can lower its hard limit to a value greater than or equal
to its soft limit. This lowering of the hard limit is irreversible for
normal users.

Only a superuser process can raise a hard limit.

(See Stevens, Advanced Programming in the Unix Environment, 7.11
pp180-184 and getrlimit(2)).
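
For what it's worth, here is a minimal sketch of those rules from inside an unprivileged process, using Python's standard resource module (a thin wrapper around getrlimit(2)/setrlimit(2)). The helper name raise_nofile_soft_limit is made up for illustration, and 4096 is just an example target.

```python
import resource


def raise_nofile_soft_limit(target):
    """Try to raise the soft RLIMIT_NOFILE to `target`, capped at the hard limit.

    Hypothetical helper for illustration. Returns the (soft, hard) pair in
    effect afterwards. The hard limit is never touched, since lowering it
    cannot be undone by a normal user.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # An unlimited soft limit cannot be raised any further.
    if soft == resource.RLIM_INFINITY:
        return soft, hard

    # An unprivileged process may only move the soft limit up to the hard one.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    # Never lower the soft limit here.
    if target <= soft:
        return soft, hard

    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # Some systems impose further caps; keep whatever is in effect.
        pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print(raise_nofile_soft_limit(4096))
```

The new limit applies to the calling process and is inherited by anything it starts afterwards, which is the same effect "ulimit -n" has in a shell.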

On top of this there are system limits, which are things like the
total number of open file descriptors the entire system can handle. I
wouldn't worry about them.

On Debian Linux, I notice that the defaults for RLIMIT_NOFILE are
1024/1024 (hard/soft); on Solaris 9 on SPARC they're 65535/256. (Solaris
is also more lenient with RLIMIT_NPROC by default.)

We regularly raise our soft limit for RLIMIT_NOFILE to 4096 or more
for processes that don't have to do much work. For something that does
a lot of work in a short time, you'll run into other resource problems
first (primarily CPU and disk, I suspect), as someone else has pointed out.

> All file descriptors used by a Xapian database should be released when
> the database is closed, so as long as there isn't a bug in PHP-SWIG
> which is failing to delete databases after use, there shouldn't be a
> problem that raising the limit won't solve.  I haven't looked at the PHP
> bindings in detail, but it looks like you are meant to explicitly delete
> database (and other Xapian objects) after use, so it would be worth
> making sure that you're doing that.

An alternative if this doesn't help would be to federate Xapian
access through an application server (this can actually just be
Apache/PHP again, or similar, although it needs a bit of work to get
Xapian db connections persistent in PHP) so that a fixed number of
queries can be run simultaneously, with others being queued behind for
processing later. If, however, you're operating significantly beyond
the capacity of your hardware this will actually hurt your user
experience, because all queries will take ages to complete, rather
than some failing straight away due to resource starvation. (You could
always build a pending queue into the app server, similar to the
"backlog" parameter of listen(2).)
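
To make the queueing idea a bit more concrete, here is a rough sketch of a pool that runs a fixed number of queries at once, holds a bounded number pending, and refuses anything beyond that straight away. It is plain Python threading, not Xapian or PHP code; the class name BoundedQueryPool and the parameters workers and backlog are invented for illustration, and the callable you submit stands in for whatever actually runs the query.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedQueryPool:
    """Run at most `workers` queries at once and queue at most `backlog` more.

    Hypothetical illustration: submissions beyond workers + backlog are
    rejected immediately instead of waiting indefinitely.
    """

    def __init__(self, workers, backlog):
        self._executor = ThreadPoolExecutor(max_workers=workers)
        # One slot per running query plus one per pending query.
        self._slots = threading.BoundedSemaphore(workers + backlog)

    def submit(self, fn, *args):
        # Fail fast when every running and pending slot is taken.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("query pool full, try again later")
        try:
            future = self._executor.submit(fn, *args)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot once the query finishes, fails, or is cancelled.
        future.add_done_callback(lambda _future: self._slots.release())
        return future

    def shutdown(self):
        self._executor.shutdown(wait=True)
```

With workers=1 and backlog=1, a third submission made while the first is still running raises RuntimeError, which the front end can turn into a "busy, try again" response. Setting backlog to 0 gives the fail-immediately behaviour; a large backlog gives the "everything takes ages" behaviour described above.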

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


