[Xapian-discuss] UTF8 support plans (without stemming)

James Aylett james-xapian at tartarus.org
Thu Apr 28 13:59:09 BST 2005


On Thu, Apr 28, 2005 at 01:51:39PM +0100, Craig Macdonald wrote:

> Many submissions to last year's TREC Terabyte track didn't use
> stemming at all.
>
> It would also appear to be a similar approach to what Google is doing. 
> The first two steps [of Porter] only drop plurals and tense suffixes.

Does Google even do that much? My experience is that it doesn't do
more than dropping plural suffixes, if that. It's one of the reasons I
found the Gmail search less than completely helpful (although
synonym expansion helped a bit).

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


