[Xapian-discuss] Using Omega and/or Xapian and how to get started

James Aylett james-xapian at tartarus.org
Fri Aug 5 11:14:12 BST 2005


On Thu, Aug 04, 2005 at 10:06:49PM +0200, F. Bos wrote:

> So far I've learned that I need to get all the data I want to search in a
> Xapian database (in documents), indexed in a way that I can optimize my
> search demands. I've looked at the PHP examples that are included with the
> Xapian bindings and I do roughly understand how these work. However I don't
> understand when the index script will have to run. For example: do I need to
> run an index script directly after I inserted a new forum post in order to
> insert the search terms that are in the post in the Xapian database and for
> the post to become searchable? Won't this slow down the posting procedure a
> lot?

Hi, Floris. You have a choice: either you want to update the Xapian
database immediately on a new forum post (or similar), or you want to
batch up the changes and do them later.

The former will slow down the posting procedure a little, but won't
necessarily slow it down a lot - it depends on a whole range of
things. It's probably easier to implement than batched processing,
however. So unless you're concerned that your site gets enough volume
that you can't sustain an 'online' approach (updating the database as
part of the post operation), you might want to try this first. Much of
the code could be reused in a batch approach if that became necessary
later.
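
To illustrate how the same indexing code can serve both approaches,
here is a minimal sketch in Python (chosen only so it runs without
Xapian installed; the same shape carries over to PHP). Everything in
it is hypothetical: FakeIndex stands in for a Xapian database, and
the function names and post fields are made up for the example.

```python
from collections import deque


class FakeIndex:
    """Stand-in for a Xapian database so the sketch runs on its own."""

    def __init__(self):
        self.docs = {}

    def replace(self, post_id, text):
        # A real implementation would build a Xapian document here and
        # add or replace it in a writable database.
        self.docs[post_id] = text


def index_post(index, post):
    """The shared indexing step, used by both approaches below."""
    index.replace(post["id"], post["title"] + "\n" + post["body"])


def save_post_online(index, post):
    """'Online' approach: index as part of the post operation."""
    # (storing the post in the forum's own database would happen here)
    index_post(index, post)


pending = deque()  # posts waiting to be indexed by the batch job


def save_post_batched(post):
    """Batch approach: just remember the post; posting stays fast."""
    pending.append(post)


def flush_pending(index):
    """Run later (e.g. from cron): index everything that was queued."""
    count = 0
    while pending:
        index_post(index, pending.popleft())
        count += 1
    return count
```

Note that an in-memory queue like this would not survive between web
requests; in practice the pending list would more likely be a table
in the forum's database or a file on disk.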

> Another thing I don't understand (this probably sounds stupid) is that when
> I use the Xapian PHP bindings/functions to index and search, I don't see
> where Omega comes in?! Or can I also use Omega from within PHP? I think I
> don't really got the hang of how Omega interacts with Xapian and how Omega
> interacts with PHP (if the latter is the case at all). 

You don't have to use Omega if you're using Xapian directly from PHP -
indeed, there isn't much point. However you can quite happily use
Omega from PHP without directly accessing Xapian; this has a number of
advantages, mostly that Omega has a whole load of features you'd
otherwise have to implement for yourself.

If you want to go down that route, PHP would call Omega to generate a
file that contained the results (perhaps as XML), and would then
process it into the form to be displayed to the user.
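
As a rough sketch of the processing half of that, here is some Python
(stdlib only; the equivalent in PHP would look much the same). The
XML layout is entirely an assumption: Omega's output is whatever your
template produces, so the element and attribute names below are
hypothetical and would need to match the template you write.

```python
import xml.etree.ElementTree as ET

# Hypothetical output from an Omega template written to emit XML.
SAMPLE = """<results total="2">
  <hit id="41" percent="97"><title>Installing Xapian</title></hit>
  <hit id="58" percent="64"><title>Omega templates</title></hit>
</results>"""


def parse_results(xml_text):
    """Turn the (assumed) XML results into plain Python data."""
    root = ET.fromstring(xml_text)
    hits = [
        {
            "id": int(hit.get("id")),
            "percent": int(hit.get("percent")),
            "title": hit.findtext("title", default=""),
        }
        for hit in root.iter("hit")
    ]
    return int(root.get("total", "0")), hits
```

The returned list can then be rendered into whatever page layout the
forum already uses.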

Indexing in this scenario would probably be best done by creating
scriptindex input files and running scriptindex on them, either from
PHP, or in a batch fashion from a cronjob or similar.
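
A scriptindex input file is plain text: one field=value line per
field, continuation lines for multi-line values starting with "=",
and a blank line between records. Here is a minimal sketch in Python
of writing posts out in that form; the field names (id, title, body)
are assumptions and would have to match the ones your index script
refers to.

```python
def format_record(fields):
    """Format one record as field=value lines.

    Lines after the first in a multi-line value are written as
    continuation lines beginning with '='.
    """
    lines = []
    for name, value in fields.items():
        text = str(value).replace("\r\n", "\n")
        first, *rest = text.split("\n")
        lines.append(f"{name}={first}")
        lines.extend("=" + cont for cont in rest)
    return "\n".join(lines)


def format_dump(posts):
    """Join records with a blank line between them."""
    return "\n\n".join(format_record(p) for p in posts) + "\n"


if __name__ == "__main__":
    posts = [
        {"id": 7, "title": "Hello", "body": "line one\nline two"},
        {"id": 8, "title": "Re: Hello", "body": "a reply"},
    ]
    print(format_dump(posts), end="")
```

The resulting file would then be passed to scriptindex together with
the database and an index script saying how each field should be
indexed.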

> Last question: does Xapian need a dedicated server or can I also get
> reasonable performance when I have the Mysql database, the Xapian database
> and the web server software on the same server? 

Again, depends on a lot of factors. You should be able to get pretty
far with everything on the one server, but it really depends on the
machine. If it's a recent Intel or AMD-based server, even at the lower
end, you should do fine - as with many performance situations, you
have to be careful not to optimise too soon.

Often, upgrading a single system will give you better performance than
splitting onto multiple systems, in the short term; and if your
machine is powerful enough to cope with initial demand all by itself,
splitting across multiple machines will slow things down until you get
quite a bit more demand.

There's a document on Xapian scalability that may prove helpful:
<http://xapian.org/docs/scalability.html>.

Hope this helps,
James

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


