[Xapian-discuss] Xapian Index 253 million documents = 704G

Kevin Duraj kevinduraj at gmail.com
Fri May 13 17:55:46 BST 2011


Xapian Index 253 million documents = 704G

I just built my largest single Xapian index, with 253 million unique
documents, on a single server using a single hard disk, less than 8G RAM
and a single 2.0 GHz processor. I do not see any decrease in search
performance between my 100 million and 250 million document indexes,
which indicates that Xapian scales well, and it looks like I can
easily push it towards 300 million documents in a single index.

You can check it yourself at: http://myhealthcare.com/

number of documents = 253717716
average document length = 35670.3
document length lower bound = 1
document length upper bound = 181656
highest document id ever used = 253717716
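
For anyone who wants to pull the same figures from their own index, here is a minimal sketch using the Xapian Python bindings. It assumes the bindings are installed and that the index path is passed on the command line; the helper names `describe` and `open_and_describe` are made up for illustration, and the output layout just imitates the lines above.

```python
import sys


def describe(db):
    """Build summary lines from any object that offers Xapian's
    Database statistics methods (get_doccount, get_avlength, ...)."""
    return [
        f"number of documents = {db.get_doccount()}",
        f"average document length = {db.get_avlength():g}",
        f"document length lower bound = {db.get_doclength_lower_bound()}",
        f"document length upper bound = {db.get_doclength_upper_bound()}",
        f"highest document id ever used = {db.get_lastdocid()}",
    ]


def open_and_describe(path):
    """Open an index read-only and summarise it (hypothetical helper)."""
    import xapian  # imported lazily; needs the Xapian Python bindings

    return describe(xapian.Database(path))


if __name__ == "__main__" and len(sys.argv) > 1:
    # The index directory is an assumption supplied by the caller.
    print("\n".join(open_and_describe(sys.argv[1])))
```

Opening the database this way is read-only, so it should be safe to run against a live index.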

total 704G
-rw-r--r-- 1 kevin kevin   28 2011-05-13 08:30 iamchert
-rw-r--r-- 1 kevin kevin   14 2011-05-13 03:28 position.baseA
-rw-r--r-- 1 kevin kevin 718K 2011-05-13 08:30 position.baseB
-rw-r--r-- 1 kevin kevin 359G 2011-05-13 08:30 position.DB
-rw-r--r-- 1 kevin kevin   14 2011-05-12 17:22 postlist.baseA
-rw-r--r-- 1 kevin kevin 167K 2011-05-13 02:26 postlist.baseB
-rw-r--r-- 1 kevin kevin  84G 2011-05-13 02:26 postlist.DB
-rw-r--r-- 1 kevin kevin   14 2011-05-13 02:26 record.baseA
-rw-r--r-- 1 kevin kevin 301K 2011-05-13 03:02 record.baseB
-rw-r--r-- 1 kevin kevin 151G 2011-05-13 03:02 record.DB
-rw-r--r-- 1 kevin kevin   14 2011-05-13 03:02 termlist.baseA
-rw-r--r-- 1 kevin kevin 224K 2011-05-13 03:28 termlist.baseB
-rw-r--r-- 1 kevin kevin 112G 2011-05-13 03:28 termlist.DB
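
To put the listing in perspective, here is a small back-of-the-envelope sketch that works out each table's share of the index and the average on-disk size per document. The numbers are copied from the listing and the statistics above; the function names are made up for illustration, and "G" is assumed to mean GiB as printed by ls.

```python
# Table sizes in GiB as shown in the listing (each is rounded for display).
TABLE_SIZES_GIB = {
    "position": 359,
    "postlist": 84,
    "record": 151,
    "termlist": 112,
}
TOTAL_GIB = 704          # "total 704G" from the listing
DOC_COUNT = 253_717_716  # "number of documents" from the statistics


def table_shares(sizes):
    """Return each table's fraction of the summed table sizes."""
    total = sum(sizes.values())
    return {name: size / total for name, size in sizes.items()}


def bytes_per_document(total_gib, doc_count):
    """Average on-disk bytes per document, treating 1G as 2**30 bytes."""
    return total_gib * 1024**3 / doc_count


if __name__ == "__main__":
    shares = table_shares(TABLE_SIZES_GIB)
    # Print the largest table first.
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:9s} {share:6.1%}")
    print(f"{bytes_per_document(TOTAL_GIB, DOC_COUNT):.0f} bytes per document")
```

On these figures the positional table accounts for roughly half of the index, and the whole index works out to a little under 3 KB per document. The four table sizes add up to slightly more than the 704G total because each one is rounded for display.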

Thanks,
Kevin Duraj
http://myhealthcare.com


