[Snowball-discuss] Stemming 'communing' and 'communed'

Martin Porter martin.porter at grapeshot.co.uk
Thu Mar 29 09:15:00 BST 2007


> ... my algorithm stems it to "commun". I have run through the spec
> 'by-hand' many times and cannot figure out how to get to the proper
> stemming. 
> 

Michael,

The reason is that the prefix 'commun' is handled specially by Porter2
(see the 'mark_regions' routine), so that in effect it is treated as one
syllable rather than two. So 'communing' behaves like 'tuning' etc.
Similarly, Porter2 stems 'communism' to 'communism', while Porter stems
'communism' to 'commun'.

Were you thinking of contributing your PHP version to 

http://snowball.tartarus.org/otherlangs/index.html

?

Martin





