[Snowball-discuss] Doubt about the portuguese stem
Leonardo Borges
leonardoborges.rj at gmail.com
Wed Aug 19 21:31:42 BST 2009
Hello guys,
I am currently evaluating Sphinx as an option for my projects and, since I
am Brazilian, I wanted to try the Portuguese stemmer you guys provide.
So I compiled Sphinx with the libstemmer option, and everything went
well.
Take the following phrase from one of my documents: "Então, vamos começar a
usar libstemmer"
The following searches return the correct document:
"Então", "Entã", "então", "entã"
That is great, but if I search for:
"Entao"
it returns nothing.
I haven't dug into the algorithm, so is this the expected behavior? If
so, is removing the accents myself the way to accomplish what I'm trying
to do? Or perhaps you guys have other suggestions?
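To make the "removing accents myself" idea concrete, here is a minimal
sketch in Python. It assumes the text is Unicode and that the same folding
would be applied both to the documents before indexing and to the search
terms. strip_accents is a hypothetical helper name, not part of Sphinx or
libstemmer.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining accent marks removed (hypothetical helper)."""
    # NFD splits each accented character into a base letter followed by
    # one or more combining marks, e.g. "ã" becomes "a" + U+0303.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(strip_accents("Então, vamos começar a usar libstemmer"))
    # prints: Entao, vamos comecar a usar libstemmer
```

With this kind of folding, "Então" and "Entao" would both reach the
stemmer as "Entao", at the cost of no longer distinguishing words that
differ only by an accent.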
Thanks a lot,
Leonardo Borges
www.leonardoborges.com