Re[2]: [Snowball-discuss] question about russian stemmer

"Yuri" "Yuri"
Fri Feb 13 20:22:26 2004


Hello

> The English stemmer gives a scheme for including exceptions, which you might
> try and adapt to the Russian stemmer if the "Kiev" case was sufficiently
> important.

I'm not a linguist, I'm just a programmer. The SBL definitions look very
unfamiliar to me, but I will try to find out where to put exceptions. Maybe you
can help me with how to add just one exception: stem Kiev => Kiev.

Or, if that is hard, as a workaround I can make my own stemmer subclass which
looks the word up in a list of exceptions first and, if the word is not listed
there, calls Snowball.
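
Roughly, the workaround could look like the sketch below (Python, only to show
the idea). The names ExceptionStemmer and fake_snowball_stem are made up for
this example; fake_snowball_stem is not the real Snowball call, it only imitates
the two results reported in this message, and lowercasing the input is my own
assumption.

```python
from typing import Callable, Dict


class ExceptionStemmer:
    """Wraps any stem function and checks an exception table first."""

    def __init__(self, stem: Callable[[str], str], exceptions: Dict[str, str]):
        self._stem = stem
        # Keys are stored lowercased so the lookup is case-insensitive.
        self._exceptions = {w.lower(): s for w, s in exceptions.items()}

    def stem(self, word: str) -> str:
        key = word.lower()
        if key in self._exceptions:
            # Listed exception: return the fixed stem, skip the real stemmer.
            return self._exceptions[key]
        # Not listed: fall back to the wrapped stemmer.
        return self._stem(key)


def fake_snowball_stem(word: str) -> str:
    # Stand-in for the real Snowball call (hypothetical). It only reproduces
    # the two observed results and returns every other word unchanged.
    observed = {"kiev": "ki", "kieva": "kiev"}
    return observed.get(word, word)


# The single exception asked about: stem Kiev => Kiev.
stemmer = ExceptionStemmer(fake_snowball_stem, {"Kiev": "kiev"})
```

With this wrapper, stemmer.stem("Kiev") and stemmer.stem("Kieva") both give
"kiev", so the two forms meet in the index without touching the SBL source.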

> You must of course realise that the stemmers are not 100% accurate, and a
> certain rate of error is inevitable. These errors do not necessarily degrade
> retrieval performance however (see the Introduction to Snowball).
> 
> Are there many other words that mis-stem in a similar way?

No, this was the first and only problem (at least for now).
I'm writing a search engine which indexes all the words in a text.
I noticed that when I search for "Kieva" (Kiev's or 'of Kiev' in English),
my search engine does not find text containing the word "Kiev".

When I started to look for where the error is, I found that the stemmer
stems 'Kiev' as 'Ki', so stem('Kiev') != stem('Kieva')
('Ki' != 'Kiev').
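
To show why the search misses, here is a tiny sketch of an index keyed by stems
(Python, only an illustration). observed_stem is a hypothetical stand-in that
reproduces just the two stems reported above; build_index and the sample
document are also made up and are not my real engine.

```python
from collections import defaultdict


def observed_stem(word):
    # Hypothetical stand-in: only the two observed results are modelled,
    # every other word is returned lowercased and otherwise unchanged.
    table = {"kiev": "ki", "kieva": "kiev"}
    w = word.lower()
    return table.get(w, w)


def build_index(docs, stem):
    # Map each stem to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[stem(word)].add(doc_id)
    return index


docs = {1: "trains to Kiev"}
index = build_index(docs, observed_stem)

# The document was indexed under 'ki', but the query is looked up under 'kiev'.
hits = index.get(observed_stem("Kieva"), set())
print(hits)  # set()
```

The document is stored under the stem of 'Kiev', the query is looked up under
the stem of 'Kieva', and because the two stems differ nothing is found.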

Thank you

PS. I'm sorry for my English.