[Xapian-discuss] Most efficient update of already existing document?

Do. do1 at yandex.ru
Tue May 31 14:26:39 BST 2011


Thank you Olly.

So, as I understand it, a term is the most efficient thing to update, then a value, and then the data (which is said to be an "expensive operation"). Or maybe it would be more efficient to use the metadata key-value functionality.

The search will give me docids, which I will then need to resolve to IDs. I feared that updating a document with one changed term actually deletes and re-inserts the document, rebuilding its postings and so on, which sounds like quite an expensive operation.
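
If I read the explanation quoted below correctly, the idea is roughly the following. This is only a plain C++ illustration of comparing an old and a new term list so that nothing but the differences gets written; it is not Xapian's actual code, and `TermDiff` and `diff_terms` are names made up for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical illustration only: this is not how Xapian is implemented.
struct TermDiff {
    std::vector<std::string> to_add;     // terms present only in the new version
    std::vector<std::string> to_remove;  // terms present only in the old version
};

// Compare the old and new term sets and report just the differences, so
// that terms common to both versions need no posting list work at all.
TermDiff diff_terms(const std::set<std::string>& old_terms,
                    const std::set<std::string>& new_terms) {
    TermDiff d;
    // std::set iterates in sorted order, which set_difference requires.
    std::set_difference(new_terms.begin(), new_terms.end(),
                        old_terms.begin(), old_terms.end(),
                        std::back_inserter(d.to_add));
    std::set_difference(old_terms.begin(), old_terms.end(),
                        new_terms.begin(), new_terms.end(),
                        std::back_inserter(d.to_remove));
    return d;
}
```

With a document whose only change is its ID term, such a comparison yields one term to add and one to remove, however many other terms the document has.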

Best regards.



31.05.2011, 17:10, "Olly Betts" <olly at survex.com>:
> On Tue, May 31, 2011 at 01:50:29AM +0400, Do. wrote:
>
>>  What is the most efficient way to update some content of a document with
>>  new info gathered later, after it is first indexed?
>
>     Xapian::Document doc = db.get_document(did);
>     // make changes to doc.
>     db.replace_document(did, doc);
>
> The Document and Database objects together keep track of which classes
> of things you changed, and then compare within those classes to make
> minimal changes to the database, so it's pretty efficient.
>
> The most recent improvements to this were in 1.1.4, backported to
> 1.0.18, so you can probably rely on having them these days.
>
>>  As I understand the API, the only way to update a document is to replace
>>  it. But I don't know how efficient/fast that replacement is. What is the
>>  best way to store a new ID in a document (that will be the single change)
>>  so that it is updated most efficiently by replace: term, data, or
>>  value?
>
> How to store it should really depend on how you intend to use it.  If
> you want to be able to find a particular document by its id efficiently
> then you really need to index it as a term.
>
> Cheers,
>     Olly


