[Xapian-discuss] search queries with fewer than 3 characters, memory goes nuts - not Xapian's fault

chris chris at s-4-u.net
Sat Aug 15 15:28:00 BST 2009


For the archives:

Did a quick test with the Ruby bindings and everything works fine, so
the error lies in a layer above the Ruby bindings.

Thanks for the pointer, Olly.

Greets, Chris


On Sat, 15 Aug 2009 16:04:48 +0200,
chris <chris at s-4-u.net> wrote:

> > On Sat, Aug 15, 2009 at 12:58:53PM +0200, chris wrote:
> > > So my questions are:
> > > - why does Xapian use countless gigabytes of RAM if I feed it such
> > > a query?
> > 
> > I've never seen it do so before.
> 
> I guessed so, just didn't know where to turn first.
> 
> > 
> > > - is there a need to sanitise the query beforehand? I mean, could
> > > someone do something nasty with it (apart from the usual HTML
> > > security issues, which we take care of by escaping the query
> > > before display)?
> > 
> > There shouldn't be a need.
> 
> Very well
> 
> > 
> > > - what can I do to prevent this?
> > 
> > My guess is that acts_as_xapian is asking Xapian to return all
> > possible matches, is getting a few million, and is storing them in a
> > space-inefficient way.
> > 
> > The code here seems to show @limit defaults to "-1" which I assume
> > means "maximum unsigned integer" by the time Xapian sees it:
> > 
> > http://github.com/Overbryd/acts_as_xapian/blob/dc3517c66b18dbf66733aac3ba436c7bf4ffcab8/lib/acts_as_xapian.rb
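
To illustrate the guess above: a signed -1 reinterpreted as an unsigned
integer becomes the largest value that type can hold, which would
effectively ask for every match. The following is only a sketch. The
32-bit width is an assumption (the actual type seen by Xapian is not
confirmed in this thread), and as_unsigned_32 is a hypothetical helper,
not part of Xapian or acts_as_xapian.

```ruby
# Hypothetical helper: reinterpret a signed integer as an unsigned
# 32-bit value, roughly what happens when -1 is handed to a C/C++
# parameter of unsigned type. The 32-bit width is assumed here.
def as_unsigned_32(n)
  n & 0xFFFF_FFFF
end

limit = -1                       # the default mentioned above
puts as_unsigned_32(limit)       # a very large match count, not "no matches"
```

Under that assumption, a default of -1 would translate into a request
for roughly four billion matches.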
> 
> I'm overriding it with my own per-page limit, and judging by the logs
> it seems to accept this, but I'll investigate further.
> 
> > 
> > It would be useful to narrow down which layer is causing this.  Can
> > you try running some of these "bad" queries without the Ruby layers
> > involved?  (examples/quest in xapian-core provides an easy way to
> > run a query against a database.)
> > 
> > If that works OK, try it from just using the Ruby bindings (without
> > acts_as_xapian) - you may find examples/simplesearch.rb useful for
> > that. 
> 
> Ok, will do.
> 
> > If the problem is in acts_as_xapian, you'll need to talk to its
> > developers, or just pass a sane limit giving the number of matches
> > you actually want.  It's a good idea to do that anyway since asking
> > for all possible matches will disable various matcher optimisations
> > and slow down searches.
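
The advice to pass a sane limit could be applied in the application
before the value reaches any search layer. A minimal sketch follows.
sane_limit, DEFAULT_LIMIT and MAX_LIMIT are hypothetical names with
assumed values, and nothing here is part of acts_as_xapian or the
Xapian bindings.

```ruby
# Hypothetical helper: turn a user-supplied per-page value into a
# bounded, positive match limit.
DEFAULT_LIMIT = 20    # assumed page size
MAX_LIMIT     = 100   # assumed upper bound

def sane_limit(requested)
  n = requested.to_i            # nil and non-numeric strings become 0
  return DEFAULT_LIMIT if n <= 0
  [n, MAX_LIMIT].min            # never ask for more than MAX_LIMIT
end

puts sane_limit(-1)
puts sane_limit("25")
puts sane_limit(5_000_000)
```

The returned value would then be passed as the limit to whichever
search call is in use, so that a missing or negative value never
reaches the matcher.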
> 
> After reading your answer I also suspect that acts_as_xapian is the
> problem. I just wanted to be sure that I had not made a stupid
> mistake. Thanks for the help.
> 
> Greets, Chris
> 
> 