Re[2]: [Snowball-discuss] question about russian stemmer
"Yuri"
Fri Feb 13 20:22:26 2004
Hello
> The English stemmer gives a scheme for including exceptions, which you might
> try and adapt to the Russian stemmer if the "Kiev" case was sufficiently
> important.
I'm not a linguist, just a programmer. The SBL definitions look very unfamiliar
to me, but I will try to find out where to put exceptions. Maybe you can
help me add just one exception: stem Kiev => Kiev.
Or, if that is hard, as a workaround I can write my own stemmer subclass that looks
the word up in a list of exceptions and uses the listed stem, or calls Snowball if the word is not listed.
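A minimal sketch of that workaround, assuming the stemmer can be passed in as a plain function from word to stem. The names `ExceptionStemmer` and `toy_stem` are hypothetical; `toy_stem` is only a stand-in that reproduces the two results reported in this message and is not the Snowball Russian stemmer. Lookups are exact-match, so words are assumed to be normalised the same way before indexing and searching.

```python
from typing import Callable, Dict


class ExceptionStemmer:
    """Check a table of exceptions first, otherwise defer to a wrapped stemmer."""

    def __init__(self, stem_func: Callable[[str], str],
                 exceptions: Dict[str, str]) -> None:
        self._stem_func = stem_func          # e.g. a real stemmer's stem function
        self._exceptions = dict(exceptions)  # word -> stem that must be returned

    def stem(self, word: str) -> str:
        # A listed word bypasses the wrapped stemmer entirely.
        if word in self._exceptions:
            return self._exceptions[word]
        return self._stem_func(word)


def toy_stem(word: str) -> str:
    """Hypothetical stand-in stemmer: strips a final 'ev' or a final 'a'."""
    if word.endswith("ev"):
        return word[:-2]
    if word.endswith("a"):
        return word[:-1]
    return word


stemmer = ExceptionStemmer(toy_stem, {"Kiev": "Kiev"})
print(stemmer.stem("Kiev"), stemmer.stem("Kieva"))
```

With the single exception in place, both forms reduce to the same stem, so a search for either one can match the other.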
> You must of course realise that the stemmers are not 100% accurate, and a
> certain rate of error is inevitable. These errors do not necessarily degrade
> retrieval performance however (see the Introduction to Snowball).
>
> Are there many other words that mis-stem in a similar way?
No, this is the first and only problem (at least for now).
I'm writing a search engine which indexes every word in the text.
I noticed that when I search for "Kieva" (Kiev's, or 'of Kiev' in English),
my search engine does not find text containing the word "Kiev".
When I started looking for the cause of the error, I found that the stemmer
stems 'Kiev' as 'Ki', so stem('Kiev') != stem('Kieva')
('Ki' != 'Kiev').
Thank you
PS. I'm sorry for my English.