[Xapian-discuss] Phrase search performance

Alex Deucher alexdeucher at gmail.com
Mon Feb 20 21:14:50 GMT 2006


On 2/20/06, Olly Betts <olly at survex.com> wrote:
> On Mon, Feb 20, 2006 at 02:27:53PM -0500, Alex Deucher wrote:
> > Is there any way to speed up phrase searches?  What sort of
> > performance should I expect?  Currently when I search against a 5.3 GB
> > flint database it takes 4.5 minutes for a simple 2 word phrase.  Is
> > that reasonable performance?
>
> No.
>
> Phrase searches involving two common terms can be slow, especially
> where an AND query matches many documents but the two terms don't
> often occur as a phrase, but 4.5 minutes is clearly ludicrous.
>
> A more concrete example would be useful - what's the query, and
> what are the term frequencies for the two terms involved?

My query was: Xapian::Query((FTEXT:main PHRASE 2 FTEXT:channel))
FTEXT:main term frequency: 37983
FTEXT:channel term frequency: 16106
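
To illustrate the point quoted above about why a phrase of two common terms can be expensive, here is a toy model in Python. It is only a sketch and not how Xapian implements PHRASE. The index layout (term -> docid -> list of positions) and the function name phrase_match are assumptions made up for this example. The idea is that every document matching the AND has to have its position lists examined, even if very few of them contain the actual phrase.

```python
def phrase_match(index, term_a, term_b, window=2):
    """Toy two-term phrase match.

    index is a hypothetical in-memory structure:
        {term: {docid: [positions, ...]}}
    Returns (docids containing the phrase, number of AND candidates checked).
    """
    postings_a = index.get(term_a, {})
    postings_b = index.get(term_b, {})

    # Step 1: the AND part - documents that contain both terms.
    candidates = sorted(postings_a.keys() & postings_b.keys())

    hits = []
    for doc in candidates:
        # Step 2: the positional check, done for every AND candidate.
        # With window=2, term_b must directly follow term_a.
        if any(0 < pb - pa < window
               for pa in postings_a[doc]
               for pb in postings_b[doc]):
            hits.append(doc)
    return hits, len(candidates)


index = {
    "main":    {1: [0, 7], 2: [3], 3: [5]},
    "channel": {1: [1],    2: [9], 4: [2]},
}
hits, checked = phrase_match(index, "main", "channel")
print(hits, checked)
```

In this toy data two documents contain both terms, so two position checks are done, but only one document contains the phrase. The work scales with the number of AND candidates rather than with the number of phrase hits.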


>
> > I'm using the perl interface to xapian 0.9.2.  I'm building my own
> > Query objects rather than using QueryParser since we use ':' as part
> > of our field prefix.
>
> Can't you just set the prefix map to include the ":"?  i.e.
>
>     queryparser.add_prefix("field", "FIELD:");

I'll give that a try.  Thanks for the heads up.
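
A rough picture of what such a prefix map does, written as a toy Python function. This is not the QueryParser code; apply_prefix_map is a hypothetical helper, and it assumes the mapped prefix is simply prepended to the term. The real parser does other processing as well, which is not modelled here.

```python
def apply_prefix_map(token, prefix_map):
    """Map a user-typed 'field:term' token to a prefixed index term.

    prefix_map is a hypothetical dict such as {"field": "FIELD:"}.
    Tokens without a known field name are returned unchanged.
    """
    field, sep, term = token.partition(":")
    if sep and field in prefix_map:
        # The ':' is part of the mapped prefix itself.
        return prefix_map[field] + term
    return token


prefix_map = {"field": "FIELD:"}
print(apply_prefix_map("field:term1", prefix_map))
print(apply_prefix_map("term2", prefix_map))
```

Because the ':' lives inside the mapped prefix string, the generated term has the same shape as one built by hand.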

>
> > Xapian::Query((FIELD:term1 PHRASE 2 FIELD:term2))
> >
> > while QueryParser's looks like:
> >
> > Xapian::Query((term1(pos=1) PHRASE 2 term2(pos=2)))
> >
> > where is the position information coming from and how do I add it to
> > my query?
>
> There are optional parameters on the Query constructor which takes a
> term name, one of which sets the query position.

Ah, ok.  I didn't know there was a query from term constructor.

>
> > Will it help or is it irrelevant?
>
> I think it's only used to sort the query terms which match a particular
> document back into query order (the terms may need to be reordered to
> build the query - e.g. 'hello +world' -> 'world AND_MAYBE hello').
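
A small Python sketch of the reordering described above. It is a toy model, not Xapian code: the function name and the (term, pos) tuples are assumptions for illustration. Each term carries the position it had in the user's original query, so the terms matching a document can be listed in that order even though the query tree holds them in a different one.

```python
def matching_terms_in_query_order(query_terms, doc_terms):
    """Return the query terms present in a document, in original query order.

    query_terms: list of (term, pos) pairs in whatever order the query
                 tree stores them (hypothetical representation).
    doc_terms:   set of terms occurring in the document.
    """
    matched = [(pos, term) for term, pos in query_terms if term in doc_terms]
    return [term for pos, term in sorted(matched)]


# 'hello +world' is rebuilt as 'world AND_MAYBE hello', so the tree
# holds "world" first, but the positions remember the typed order.
query_terms = [("world", 2), ("hello", 1)]
print(matching_terms_in_query_order(query_terms, {"hello", "world", "foo"}))
```

With no position set on the terms there would be nothing to sort by, which is consistent with the position only affecting how matching terms are reported.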
>
> > The query object (at least the perl interface) only allows me to build
> > queries of the form:
>
> For Perl, see the "new_term" method of "Search::Xapian::Query" - added
> in 0.9.2.3.

Is there any documentation for that Perl code anywhere?  The stuff on
CPAN is pretty limited.

Thanks,

Alex


>
> Cheers,
>     Olly
>
