[Xapian-discuss] UTF8 support plans (without stemming)
rm at fabula.de
Wed Apr 27 21:17:20 BST 2005
On Thu, Apr 28, 2005 at 12:09:26AM +0400, Alexandre wrote:
> On Apr 27, 2005, at 23:47, rm at fabula.de wrote:
>
> >On Wed, Apr 27, 2005 at 11:32:30PM +0400, Alexandre wrote:
> >>Good day,
> >>
> >>are there any plans to support UTF-8 (I mean the library
> >>core, not stemming)?
> >
> >What exactly do you mean by UTF-8 support? You can pretty much stuff
> >anything into a xapian database (see some recent posts in this list).
> >But -- without stemming statistical information retrieval doesn't really
> >work as expected in most western languages :-/
>
> Ralf, do you mean this post
> (http://lists.tartarus.org/pipermail/xapian-discuss/2005-April/000821.html)?
Yes, that's the last one. The bug report mentioned in this post gives
more information.
> If so, "query parser ... currently assume latin1" - that's not very
> good, is it?
Hmm. Depends on what you want/need to do. I personally can't see why there
even _is_ a query parser in Xapian core. After all, the query language really
depends on the application ...
> Hm, and can you please tell me more about the influence of stemming on IR
> in western languages? Does it apply only to probabilistic IR or to vector
> search too?
>
> And one more question (not exactly on topic): why does Xapian stick
> to the probabilistic approach? Are there perhaps some historical links/docs?
Well, these two questions relate to each other: Xapian is strong in
'probabilistic IR' and that approach kind of needs some sort of stemming.
I can't speak for the Xapian developers (nor for the library's ancestry
in the guts of Muscat) - from your question I infer that you seem to think
that 'probabilistic IR' is kind of outdated?
cheers RalfD
>
> Thank you in advance,
> Regards,
> /Alexandre.