[Snowball-discuss] -IZE versus -ISE

Steve Jones steve.jones@isdduk.com
Thu Oct 2 12:48:01 2003


Hi,

I was wondering if anyone had overcome (or tried to!) the -IZE/-ISE
problem: where 'categorize' and 'categorise' are both valid spellings of
the same word. I noticed that the algorithm only takes into account the
-IZE version.

I am aware the English language is a complex beastie and you would have
to differentiate 'prize' from 'prise', but I thought someone might have
had some success in this area...?
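
To make the question concrete, here is a rough sketch of one possible
approach: rewrite -ISE spellings to -IZE before the word reaches the
stemmer, and skip words on an exception list. This is an assumption about
how it could be done, not anything the Snowball stemmer provides; the
names normalize_ise, ISE_EXCEPTIONS and SUFFIX_PAIRS are hypothetical,
and the exception list is illustrative and far from complete.

```python
# Hypothetical pre-processing step, applied before stemming.
# It is NOT part of the Snowball English algorithm.

# Words whose -ISE ending is not a variant spelling of -IZE.
# Illustrative only; a real list would need to be much longer.
ISE_EXCEPTIONS = {
    "prise", "rise", "wise", "raise", "praise",
    "advise", "promise", "exercise", "surprise",
}

# (-ISE form, -IZE form), longest suffix first so that
# "isation" is tried before "ise".
SUFFIX_PAIRS = [
    ("isations", "izations"),
    ("isation", "ization"),
    ("ising", "izing"),
    ("ised", "ized"),
    ("ises", "izes"),
    ("ise", "ize"),
]


def normalize_ise(word, exceptions=ISE_EXCEPTIONS):
    """Return the lowercased word with an -ISE style suffix rewritten to -IZE.

    Words whose base form (stem + "ise") is in `exceptions` are returned
    lowercased but otherwise unchanged.
    """
    lower = word.lower()
    for ise_suffix, ize_suffix in SUFFIX_PAIRS:
        if lower.endswith(ise_suffix):
            stem = lower[: -len(ise_suffix)]
            if stem + "ise" in exceptions:
                return lower
            return stem + ize_suffix
    return lower


if __name__ == "__main__":
    for w in ["categorise", "categorised", "categorize", "prise", "prize"]:
        print(w, "->", normalize_ise(w))
```

With this, 'categorise' and 'categorize' both reach the stemmer as
'categorize', while 'prise' and 'prize' stay distinct. The weak point is
the exception list: any word that happens to end in -ISE but is missing
from it (for example 'crises') would be rewritten wrongly.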

Regards,

/steve
