[Xapian-discuss] Python bindings - xapian.Database.reopen

Cedric Jeanneret cedric.jeanneret at camptocamp.com
Tue Apr 14 16:19:24 BST 2009


Oh, thanks a lot for your quick answer.

In fact, I'll do it another way: my "clients" will push their indexes at the end of the process. As it's a simple script called by a cron job, I can add the right line.

Thanks again for your answer.

Regards,

C. Jeanneret

On Tue, 14 Apr 2009 15:35:29 +0100
Richard Boulton <richard at lemurconsulting.com> wrote:

> On Tue, Apr 14, 2009 at 04:15:10PM +0200, Cedric Jeanneret wrote:
> > Hello,
> > 
> > I'm using Xapian in a Pylons application, with the Python bindings...
> >
> > My indexes are created on other servers, then rsync-ed to my search
> > engine... It seems that this process sometimes makes a mess, as my Pylons
> > app returns a big error:
> 
> This is the problem - if you rsync a database which is being modified,
> you'll get half the old database and half the new database.  It is not safe
> to rsync a database which is in the process of being modified, because
> rsync is not an atomic copy operation.  In fact, even if the database isn't
> being modified, you'll get errors like the one you report if you try and
> search while the rsync is happening (though at least in that case, once the
> rsync is finished, the database should be valid again).
> 
> This is why the 1.1.0 release will have support for replication, in a safe
> way.   See
> http://trac.xapian.org/browser/trunk/xapian-core/docs/replication.rst for
> details (it has a long section on alternative approaches to replication,
> including rsync, which may interest you).  If you want to try this out, use
> SVN trunk (which is very close to release, though no promises that we won't
> need to change something at the last minute).
> 
> If you must rsync, you need to stop the indexer, take a copy of the
> database on the client, rsync to update the copy, and then swap that copy
> in place of the old database on the client.  Preferably, use a stub
> database to control which database is live on the client, and use "rename"
> to update that stub database file (rename, when used to move a file to
> replace another file, is an atomic operation, unless you're on Windows).
> 
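A minimal sketch of the stub-swap step described above, in Python. It assumes the stub file holds a single line of the form "auto <path>" and that the new database copy is already complete on disk. The function name switch_stub and the ".stub-" temp prefix are made up for illustration. It uses os.replace, which overwrites the target in one step on POSIX; the Windows caveat above still applies to readers that have the stub open.

```python
import os
import tempfile


def switch_stub(stub_path, new_db_dir):
    """Point the stub file at new_db_dir via write-to-temp plus rename.

    Hypothetical helper: the "auto <path>" stub line is an assumption
    about the stub file format, not something stated in this thread.
    """
    stub_dir = os.path.dirname(os.path.abspath(stub_path))
    # Create the temp file in the same directory as the stub, so the final
    # rename stays on one filesystem (a cross-device rename is not atomic).
    fd, tmp_path = tempfile.mkstemp(dir=stub_dir, prefix=".stub-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write("auto %s\n" % os.path.abspath(new_db_dir))
            f.flush()
            os.fsync(f.fileno())  # make sure the content is on disk first
        # mkstemp creates the file readable by its owner only; widen the
        # mode so a search process under another user can read the stub.
        os.chmod(tmp_path, 0o644)
        # Readers see either the old stub or the new one, never a mix.
        os.replace(tmp_path, stub_path)
    except Exception:
        # The rename did not happen, so remove the leftover temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The search application would then open the stub path rather than a database directory, so each swap takes effect the next time a database is opened.
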
> > Error - xapian.DatabaseModifiedError: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation
> > [snip useless trace]
> > DatabaseModifiedError: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation
> 
> This error is slightly misleading in this situation - in fact, due to the
> rsync, your copy of the database is corrupt.
> 
> > OK... so I'm trying to call xapian.Database.reopen()... but how?
> > 
> > Trying to do so:
> > try:
> >   d = xapian.Database('my/db')
> > except xapian.DatabaseModifiedError:
> >   d = xapian.Database()
> >   d.reopen('my/db')
> 
> Just to note; if the error had occurred due to local modifications, you'd
> only need to call reopen() if you were re-using a database handle.  Here,
> you're making a new database, so you just need to retry the operation.
> It's academic, though, because the use of rsync has left you with an
> invalid database which no amount of calling reopen() will fix.
> 
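For the case mentioned above where a long-lived database handle is re-used and the error comes from ordinary local modifications, the retry could look roughly like the sketch below. The helper name search_with_retry and its parameters are hypothetical; the exception class is passed in so the helper does not depend on the bindings being importable. As noted above, this does nothing for a database left invalid by rsync.

```python
def search_with_retry(db, run_query, modified_error, max_retries=3):
    """Run run_query(db), calling db.reopen() and retrying on modified_error.

    db             -- a handle with a reopen() method (e.g. xapian.Database)
    run_query      -- callable taking the handle and returning the results
    modified_error -- exception class meaning "revision discarded"
    """
    for attempt in range(max_retries + 1):
        try:
            return run_query(db)
        except modified_error:
            if attempt == max_retries:
                raise  # give up: the database keeps changing under us
            # Move the existing handle to the latest revision and try again.
            db.reopen()
```

With the real bindings this would presumably be called as search_with_retry(db, run_query, xapian.DatabaseModifiedError), where db is a handle the application keeps around between requests.
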


-- 
Cédric Jeanneret                 |  System Administrator
021 619 10 32                    |  Camptocamp SA
cedric.jeanneret at camptocamp.com  |  PSE-A / EPFL