[Xapian-discuss] UTF8 support plans (without stemming)
James Aylett
james-xapian at tartarus.org
Thu Apr 28 13:59:09 BST 2005
On Thu, Apr 28, 2005 at 01:51:39PM +0100, Craig Macdonald wrote:
> Many submissions to last year's TREC Terabyte track didn't use
> stemming at all.
>
> It would also appear to be a similar approach to what Google is doing.
> The first two steps [of Porter] only drop plurals and tense suffixes.
Does Google even do that much? My experience is that it doesn't do
more than dropping plural suffixes, if that. It's one of the reasons I
found the Gmail search less than completely helpful (although
synonym expansion helped a bit).
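
For reference, plural-only suffix stripping can be sketched roughly as
below. It follows the rules of step 1a of the original Porter (1980)
algorithm (SSES -> SS, IES -> I, SS -> SS, S -> nothing). It is only an
illustration: the function name porter_step_1a is made up here, and
nothing in it is a claim about what Google or Xapian actually implement.

```python
def porter_step_1a(word: str) -> str:
    """Apply step 1a of the original Porter stemmer (plural suffixes only).

    Hypothetical helper for illustration; expects a lowercase word.
    Rules are tried in order and only the first match is applied.
    """
    if word.endswith("sses"):
        return word[:-2]  # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]  # ponies -> poni
    if word.endswith("ss"):
        return word       # caress -> caress (left unchanged)
    if word.endswith("s"):
        return word[:-1]  # cats -> cat
    return word           # no plural suffix found


if __name__ == "__main__":
    for w in ["caresses", "ponies", "caress", "cats", "stemming"]:
        print(w, "->", porter_step_1a(w))
```

Tense suffixes such as -ed and -ing are handled by the next step (1b),
which this sketch deliberately leaves out.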
J
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org