[Xapian-discuss] Re: Re: get_docid over multi-database search
Olly Betts
olly at survex.com
Tue Dec 18 11:49:24 GMT 2007
On Fri, Dec 14, 2007 at 11:18:12AM -0800, Andrey wrote:
> from my own experience, breaking up into dbs will not cause a big
> performance loss, like from 1 sec to 2 secs, it just works like querying 1 db
> once everything is cached
I would be surprised if there was a large overhead - there's a bit of
extra work from opening the databases, and a small amount from having
a "MultiPostList". The total size of the split databases is usually
a little larger than that of the single combined database, which may
increase VM pressure a bit.
If you do profile and find there's a significant difference, it would
be interesting to see comparable profiles for the two cases to see where
the extra time is spent.
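As a rough sketch of how comparable profiles for the two cases might be
collected from Python, the helper below runs a callable repeatedly under
the standard library's cProfile and returns a text report. The name
profile_call and its parameters are made up for illustration; you would
pass in your own function that runs the query against the split or the
combined database. cProfile only sees Python-level calls and the calls
into the bindings, so it shows which Xapian methods the time goes to, but
not where inside the C++ library it is spent.

```python
import cProfile
import io
import pstats


def profile_call(fn, repeats=100, top=10):
    """Run fn() `repeats` times under cProfile and return a text report.

    `fn` is any zero-argument callable, e.g. a function that runs one
    search against either the split or the combined database.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(repeats):
        fn()
    profiler.disable()

    # Render the `top` most expensive entries, sorted by cumulative time.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()
```

Running this once with a search over the split databases and once with
the same search over the combined one gives two reports that can be
compared side by side.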
> maybe you can try to duplicate another copy of your db and search over them
> together, it's very easy with just 1 extra line
> db.add_database(xapian.Database("db"))
You'd also need to generate the equivalent combined database (e.g. by
using xapian-compact with the same input twice).
Just duplicating the data isn't an accurate recreation of searching a
real database split in two, though. I don't know if it actually would
make a difference, but it might.
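For reference, a minimal sketch of the two cases in the Python bindings
might look like the following. The helper names open_split and
count_matches are invented for this example, and the database paths are
up to the caller; the combined database is assumed to have been built
separately with xapian-compact. The import is guarded only so that the
sketch loads on a machine without the bindings installed.

```python
try:
    import xapian  # Xapian Python bindings; not part of the standard library
except ImportError:
    xapian = None


def open_split(paths):
    """Open several databases and return one object that searches them all."""
    if xapian is None:
        raise RuntimeError("the xapian Python bindings are not installed")
    db = xapian.Database()  # starts with no sub-databases
    for path in paths:
        # add_database() modifies db in place, so its result is not assigned.
        db.add_database(xapian.Database(path))
    return db


def count_matches(db, term, limit=10):
    """Run a single-term query and return the estimated number of matches."""
    enquire = xapian.Enquire(db)
    enquire.set_query(xapian.Query(term))
    mset = enquire.get_mset(0, limit)
    return mset.get_matches_estimated()
```

The same count_matches call can then be timed against
open_split([...]) for the pieces and against xapian.Database(...) for
the combined database.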
Cheers,
Olly