[Xapian-discuss] UTF8 support plans (without stemming)

rm at fabula.de rm at fabula.de
Wed Apr 27 21:17:20 BST 2005


On Thu, Apr 28, 2005 at 12:09:26AM +0400, Alexandre wrote:
> On Apr 27, 2005, at 23:47, rm at fabula.de wrote:
> 
> >On Wed, Apr 27, 2005 at 11:32:30PM +0400, Alexandre wrote:
> >>Good day,
> >>
> >>are there any plans to support UTF-8 (I mean the library
> >>core, not stemming)?
> >
> >What exactly do you mean by UTF-8 support? You can pretty much stuff
> >anything into a Xapian database (see some recent posts on this list).
> >But -- without stemming, statistical information retrieval doesn't
> >really work as expected in most western languages :-/
> 
> Ralf, do you mean this post
> (http://lists.tartarus.org/pipermail/xapian-discuss/2005-April/000821.html)?

Yes, that's the last one. The bug report mentioned in this post gives
more information.

> If so, "query parser ... currently assume latin1" - that's not very
> good, is it?

Hmm. Depends on what you want/need to do. I personally can't see why there
even _is_ a query parser in Xapian core. After all, the query language
really depends on the application ...
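
For illustration only, here is a rough Python sketch of what "assumes
latin1" can mean for UTF-8 input. This is not Xapian code: toy_tokenize
is a made-up helper that treats any run of letters as a word, and the
sample word is arbitrary. It shows that the same bytes produce a
different term depending on which encoding the reader assumes.

```python
import re

def toy_tokenize(text):
    # Hypothetical tokenizer: a "word" is a run of alphabetic characters.
    # Real query parsers are more involved; this only shows the effect.
    return re.findall(r"[^\W\d_]+", text)

# The bytes an application would hand over if it works in UTF-8.
raw = "café".encode("utf-8")

# The same bytes, read under two different encoding assumptions.
as_latin1 = raw.decode("latin-1")
as_utf8 = raw.decode("utf-8")

print(toy_tokenize(as_latin1))  # two-byte 'é' is seen as two characters
print(toy_tokenize(as_utf8))    # 'é' is seen as one letter
```

As long as indexing and querying mangle the text in the same way, terms
may still match, but anything that depends on character classes (word
boundaries, case folding) can go wrong for non-ASCII text.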


> Hm, and can you tell me, please, more about the influence of stemming
> on IR in western languages? Is it only about probabilistic IR or about
> vector search too?
> 
> And one more question (not exactly on the subject): why does Xapian
> stick to the probabilistic approach? Probably some historical links/docs?

Well, these two questions relate to each other: Xapian is strong in
'probabilistic IR' and that approach kind of needs some sort of stemming.
I can't speak for the Xapian developers (nor for the library's ancestry
in the guts of Muscat). From your question I infer that you seem to think
that 'probabilistic IR' is kind of outdated?
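
As a rough illustration of why term statistics benefit from stemming,
here is a small Python sketch. It is only a toy: toy_stem and its suffix
list are invented for this example and are far cruder than a real
stemming algorithm. The point is that without stemming each word form is
counted as a separate term, while with stemming the counts are pooled.

```python
from collections import Counter

# Invented suffix list, longest first; a real stemmer has many more rules.
SUFFIXES = ("ions", "ion", "ing", "ed", "s")

def toy_stem(word):
    # Strip the first matching suffix, keeping at least three characters.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

words = "connect connected connecting connection connections".split()

raw_counts = Counter(words)                           # one entry per form
stemmed_counts = Counter(toy_stem(w) for w in words)  # forms conflated

print(raw_counts)
print(stemmed_counts)
```

A weighting scheme that relies on how often a term occurs sees five
unrelated rare terms in the first case and one more frequent term in the
second.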

 cheers RalfD
> 
> Thank you in advance,
> Regards,
> /Alexandre.


