[Xapian-discuss] Xapian Terms vs. Document Partition.

Kevin Duraj kevin.softdev at gmail.com
Wed May 7 00:48:01 BST 2008


Xapian Terms vs. Document Partition.

In December 2007, Diego Puppin from Google gave an interesting talk
about a parallel architecture that distributes the index by terms
rather than by documents.
Reference:
http://youtube.com/watch?v=KpZpsu2wM1s

It describes an idea similar to the one we discussed in May 2007,
seven months before Diego's presentation, in the following Xapian
discussion thread.
Reference:
http://lists.tartarus.org/pipermail/xapian-discuss/2007-May/003889.html

My index at http://myhealthcare.com is growing to 100 million
documents, and I need to implement some kind of parallel
architecture, because it takes too long to update the index and add
new documents. I would again like to encourage the Xapian community
to start looking into distributing the index based on terms rather
than documents. Making each server responsible for a set of terms
rather than a set of documents would enable us to scale our search
engine to Google's level.
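
To make the distinction concrete, here is a minimal sketch in plain
Python. It is not Xapian code and not the design from Diego's talk;
the server count, the hash-based term assignment, and every function
name are assumptions made up for illustration. It builds the same
tiny index both ways and counts how many servers one AND query has to
contact.

```python
from collections import defaultdict
from zlib import crc32

NUM_SERVERS = 3  # hypothetical cluster size, for illustration only


def tokenize(text):
    """Naive whitespace tokenizer (no stemming, no stopwords)."""
    return text.lower().split()


def term_server(term, num_servers=NUM_SERVERS):
    """Assumed rule: a stable hash of the term picks its owning server."""
    return crc32(term.encode("utf-8")) % num_servers


def build_document_partitioned(docs, num_servers=NUM_SERVERS):
    """Each server indexes all terms of its own subset of documents."""
    shards = [defaultdict(set) for _ in range(num_servers)]
    for doc_id, text in docs.items():
        shard = shards[doc_id % num_servers]
        for term in tokenize(text):
            shard[term].add(doc_id)
    return shards


def build_term_partitioned(docs, num_servers=NUM_SERVERS):
    """Each server holds the complete posting lists of its own terms."""
    shards = [defaultdict(set) for _ in range(num_servers)]
    for doc_id, text in docs.items():
        for term in tokenize(text):
            shards[term_server(term, num_servers)][term].add(doc_id)
    return shards


def search_document_partitioned(shards, query):
    """AND query: every server is asked, and the partial hits are merged."""
    terms = tokenize(query)
    hits = set()
    for shard in shards:
        postings = [shard.get(t, set()) for t in terms]
        if postings:
            hits |= set.intersection(*postings)
    return hits, len(shards)


def search_term_partitioned(shards, query):
    """AND query: only the servers owning the query terms are asked."""
    terms = tokenize(query)
    owners = {term_server(t, len(shards)) for t in terms}
    postings = [shards[term_server(t, len(shards))].get(t, set()) for t in terms]
    hits = set.intersection(*postings) if postings else set()
    return hits, len(owners)


docs = {
    1: "heart disease treatment",
    2: "heart surgery",
    3: "disease prevention",
    4: "heart disease risk",
}
doc_hits, doc_contacted = search_document_partitioned(
    build_document_partitioned(docs), "heart disease")
term_hits, term_contacted = search_term_partitioned(
    build_term_partitioned(docs), "heart disease")
print(sorted(doc_hits), doc_contacted)
print(sorted(term_hits), term_contacted)
```

Both layouts return the same documents. In this sketch the
document-partitioned search always contacts every server, while the
term-partitioned search contacts at most one server per query term.
The other side of the trade is visible in build_term_partitioned:
adding a single document touches every server that owns one of its
terms.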

Thank you,

Kevin Duraj
http://myhealthcare.com


