[Xapian-discuss] queryparser thinks ø is o
olly at survex.com
Tue Sep 13 05:21:12 BST 2005
On Sun, Aug 28, 2005 at 02:49:23PM +0200, R. Mattes wrote:
> On Mon, 2005-08-29 at 14:18 +0200, Marcus Ramberg wrote:
> > Thanks for the tips. However, disabling the action in the normalizer
> > makes the queryparser tokenize on æøå instead of including them in
> > the term. Where can I modify the tokenizer in queryparser to include
> > high-ASCII chars (or at least the ones I need)?
You'd need to tweak it to treat accented letters as part of a word.
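
As a rough illustration of that idea (a standalone Python sketch, not
Xapian's actual tokenizer code; the names is_word_char and tokenize are
made up for this example), a tokenizer can classify characters by Unicode
category rather than by ASCII range, so that æ, ø and å stay inside the
term instead of splitting it:

```python
import unicodedata


def is_word_char(ch):
    # Hypothetical helper: treat any Unicode letter (category L*) or
    # number (category N*) as part of a word. This covers accented
    # letters such as æ, ø and å as well as plain ASCII.
    return unicodedata.category(ch)[0] in ("L", "N")


def tokenize(text):
    # Split text into lowercased terms, breaking only on characters
    # that are not letters or digits.
    terms = []
    current = []
    for ch in text:
        if is_word_char(ch):
            current.append(ch)
        elif current:
            terms.append("".join(current).lower())
            current = []
    if current:
        terms.append("".join(current).lower())
    return terms


if __name__ == "__main__":
    print(tokenize("Rød grød med fløde"))
```

The same test (is this character a letter?) is what would need to change
inside the query parser; the sketch assumes the input has already been
decoded to Unicode text.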
> I'm using some extensions/patches from Olly Betts that enable
> Unicode - either you have to wait until Olly Betts is back or
> you have to nag him personally ;-}
> I'm not sure about the status of his patches and I'd hate to
> release code that's considered non-public.
It's public (I've already posted the patches to the mailing list!)
> Anyway, I had to tweak the patches to apply them to 0.9.2 (and had to
> change some signatures to get them to compile ...).
I'll hopefully get this cleaned up and merged in soon.