[Xapian-discuss] Phrase search performance
Alex Deucher
alexdeucher at gmail.com
Mon Feb 20 21:14:50 GMT 2006
On 2/20/06, Olly Betts <olly at survex.com> wrote:
> On Mon, Feb 20, 2006 at 02:27:53PM -0500, Alex Deucher wrote:
> > Is there any way to speed up phrase searches? What sort of
> > performance should I expect? Currently when I search against a 5.3 GB
> > flint database it takes 4.5 minutes for a simple 2 word phrase. Is
> > that reasonable performance?
>
> No.
>
> Phrase searches involving two common terms can be slow, especially
> where an AND query matches many documents but the two terms don't
> often occur as a phrase, but 4.5 minutes is clearly ludicrous.
>
> A more concrete example would be useful - what's the query, and
> what are the term frequencies for the two terms involved?
My query was: Xapian::Query((FTEXT:main PHRASE 2 FTEXT:channel))
FTEXT:main term frequency: 37983
FTEXT:channel term frequency: 16106
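
To make the cost Olly describes concrete, here is a toy sketch in plain Python. It is not how Xapian implements PHRASE; it only assumes a simple positional index and treats a two-word phrase as "second term directly follows first term". The names build_index and phrase_search and the sample documents are made up for illustration. The point is that every document matching the AND has to have its position lists checked, even when very few of them contain the phrase.

```python
from collections import defaultdict


def build_index(docs):
    """Map term -> {doc_id: [positions]} for a list of token lists."""
    index = defaultdict(dict)
    for doc_id, tokens in enumerate(docs):
        for pos, term in enumerate(tokens):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, first, second):
    """Return (matching doc ids, number of AND candidates checked).

    Assumption: a phrase match means `second` occurs at the position
    immediately after `first` in the same document.
    """
    a = index.get(first, {})
    b = index.get(second, {})
    # Step 1: the AND of the two terms gives the candidate documents.
    candidates = sorted(set(a) & set(b))
    hits = []
    # Step 2: each candidate needs its position lists compared.
    for doc_id in candidates:
        following = set(b[doc_id])
        if any(p + 1 in following for p in a[doc_id]):
            hits.append(doc_id)
    return hits, len(candidates)


docs = [
    "the main channel is open".split(),
    "channel two is the main one".split(),
    "main menu and channel list".split(),
]
index = build_index(docs)
hits, checked = phrase_search(index, "main", "channel")
print(hits, checked)
```

Here all three documents match the AND and must be checked, but only the first contains the phrase.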
>
> > I'm using the perl interface to xapian 0.9.2. I'm building my own
> > Query objects rather than using QueryParser since we use ':' as part
> > of our field prefix.
>
> Can't you just set the prefix map to include the ":"? i.e.
>
> queryparser.add_prefix("field", "FIELD:");
I'll give that a try. Thanks for the heads up.
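
For anyone following along, a rough sketch of what a prefix map does, in plain Python rather than the Xapian API. The function apply_prefix_map is hypothetical, and the real QueryParser does much more (stemming, case handling, operators); this only shows the field name typed by the user being swapped for the term prefix stored in the database, colon included.

```python
def apply_prefix_map(token, prefix_map):
    """Rewrite 'field:term' using the prefix registered for 'field'.

    Tokens without a known field name are returned unchanged.
    """
    field, sep, term = token.partition(":")
    if sep and field in prefix_map:
        return prefix_map[field] + term
    return token


# Mirrors the suggested add_prefix("field", "FIELD:") mapping.
prefix_map = {"field": "FIELD:"}
mapped = apply_prefix_map("field:term1", prefix_map)
unmapped = apply_prefix_map("term2", prefix_map)
print(mapped, unmapped)
```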
>
> > Xapian::Query((FIELD:term1 PHRASE 2 FIELD:term2))
> >
> > while QueryParser's looks like:
> >
> > Xapian::Query((term1(pos=1) PHRASE 2 term2(pos=2)))
> >
> > where is the position information coming from and how do I add it to
> > my query?
>
> There are optional parameters on the Query from term name constructor,
> one of which sets the query position.
Ah, OK. I didn't know there was a query-from-term constructor.
>
> > Will it help or is it irrelevant?
>
> I think it's only used to sort the query terms which match a particular
> document into order (they may need to be reordered to build the query -
> e.g. 'hello +world' -> 'world AND_MAYBE hello').
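
A tiny plain-Python sketch of the reordering described above, assuming each query term simply carries the position at which the user typed it. The tuples and variable names are illustrative only and are not Xapian data structures.

```python
# User typed: hello +world  (hello at position 1, world at position 2).
# The built query is 'world AND_MAYBE hello', so tree order is reversed.
tree_order = [("world", 2), ("hello", 1)]

# Sorting by the stored query position restores the order as typed.
as_typed = [term for term, pos in sorted(tree_order, key=lambda t: t[1])]
print(as_typed)
```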
>
> > The query object (at least the perl interface) only allows me to build
> > queries of the form:
>
> For Perl, see the "new_term" method of "Search::Xapian::Query" - added
> in 0.9.2.3.
Is there any documentation on that Perl code anywhere? The stuff on
CPAN is pretty limited.
Thanks,
Alex
>
> Cheers,
> Olly
>