[Xapian-discuss] Deleted documents not deleted

Jean-Francois Dockes jean-francois.dockes at wanadoo.fr
Tue Jun 19 21:53:51 BST 2007


Olly Betts writes:
 > On Tue, Jun 19, 2007 at 07:20:18PM +0200, Jean-Francois Dockes wrote:
 > > I seem to be seeing cases where I call db.delete_document(somedocid) with
 > > no error, then flush() and delete the database object, but the document is
 > > still there after process exit. The write lock is normally deleted, so it
 > > appears that the database close finished normally.
 > > 
 > > If I then call delete_document(somedocid) from another
 > > command/process, this time it goes away.
 > > 
 > 
 > Do you see this in 1.0.0 or later?

Yes, I am seeing this on an index created with at least 1.0.0, and I am
currently working on it with 1.0.1. It's a flint index, but I'm quite
certain that I've already seen it in the past with 0.x and quartz. The
problem is that I have nothing more precise to report than "deletes
sometimes don't work".

It is reproducible though: if I repeatedly launch the indexer, it tells me
every time that it deleted document xxx, which is still alive and well when
the indexing is done. I am reasonably certain that delete_document() has
been called when the message is printed. If I then delete document xxx from
a small test driver, it does go away, so the problem is apparently caused by
the mix of read/write accesses performed during indexing.

There is a slight weirdness in the indexer, in that it uses two separate
objects, a Database for reading and a WritableDatabase for writing (this is
because of performance issues I had in the past when performing some kinds
of accesses through a WritableDatabase). I don't know whether this is
relevant at all, but I could try changing it if there is a chance that it is
related to the problem.
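
To make the two-object setup concrete, here is a minimal sketch in Python.
The Fake* classes and DocNotFound are hypothetical stand-ins, not Xapian
code: they only mirror the method names delete_document(), flush(),
reopen() and get_document(), and they assume a reader that keeps the
snapshot it opened until reopen() is called. The sketch shows one way a
deletion could look ineffective when checked through the read-side object;
it does not establish that this is what happens in the indexer.

```python
class DocNotFound(LookupError):
    """Stand-in for the error a reader raises for a missing document."""


class FakeStore:
    """Hypothetical committed on-disk state shared by reader and writer."""

    def __init__(self, docids):
        self.committed = set(docids)


class FakeWriter:
    """Stand-in for the write-side object: deletions apply on flush()."""

    def __init__(self, store):
        self.store = store
        self.pending = set()

    def delete_document(self, docid):
        self.pending.add(docid)

    def flush(self):
        self.store.committed -= self.pending
        self.pending.clear()


class FakeReader:
    """Stand-in for the read-side object: sees a snapshot until reopen()."""

    def __init__(self, store):
        self.store = store
        self.snapshot = set(store.committed)

    def reopen(self):
        self.snapshot = set(self.store.committed)

    def get_document(self, docid):
        if docid not in self.snapshot:
            raise DocNotFound(docid)
        return docid


def delete_and_check(writer, reader, docid, not_found=DocNotFound):
    """Delete via the writer, then report whether the reader still sees
    the document before and after reopen(), as (stale, fresh)."""
    writer.delete_document(docid)
    writer.flush()

    def visible():
        try:
            reader.get_document(docid)
            return True
        except not_found:
            return False

    stale = visible()   # checked through the old snapshot
    reader.reopen()
    fresh = visible()   # checked after refreshing the reader
    return stale, fresh


if __name__ == "__main__":
    store = FakeStore({1, 2, 3})
    print(delete_and_check(FakeWriter(store), FakeReader(store), 2))
```

In this model the document is still visible through the reader until
reopen() is called, even though the writer has flushed the deletion. The
same check could be run against real database objects by passing them in
place of the stand-ins, together with the matching not-found exception
type.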

Regards,
JF

 > This fix went in before 1.0.0, but I can't recall exactly what the symptoms
 > were, nor find a relevant mail or bug report...
 > 
 > Mon Apr 09 14:50:40 BST 2007  Olly Betts <olly at survex.com>
 > 
 >         * backends/flint/flint_database.cc: Delete the corresponding entry
 >           (if any) from doclens in delete_document().  Add assertion to
 >           add_document_() that the corresponding entry in doclens isn't
 >           already set, but in a non-debug build overwrite any existing
 >           entry as that's more likely to be correct.
 >         * backends/quartz/quartz_database.cc: Ditto.
 > 
 > Cheers,
 >     Olly


