[Xapian-discuss] Search relevance

Matthew Somerville matthew at mysociety.org
Thu Jul 24 13:24:16 BST 2008


Ben Phillips wrote:
> I'm getting some surprising (to me) ordering of results.

If you look at the parsed query, it's looking for the term "IV" when it
should be looking for "iv". Xapian terms are case sensitive, and a
capitalised first letter means something special (it marks a prefix, like
the "Z" and "X" terms in your lists). Result 5 includes the terms "iv" and
"Ziv", but not "IV", so that synonym never matches. You need your synonyms
to be the lowercase versions of the words.
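
As a minimal sketch of that last point: the synonym table could be
lowercased before it is loaded into the database. The helper name
lowercase_synonyms and the dict layout are assumptions made for
illustration, not anything from your setup; the resulting pairs could then
be passed to WritableDatabase.add_synonym(term, synonym).

```python
def lowercase_synonyms(synonyms):
    """Return a copy of the synonym table with keys and synonyms lowercased.

    `synonyms` maps a term to a list of alternatives (hypothetical layout).
    Duplicates created by lowercasing, and synonyms equal to the key itself,
    are dropped.
    """
    normalised = {}
    for term, alternatives in synonyms.items():
        key = term.lower()
        bucket = normalised.setdefault(key, [])
        for alt in alternatives:
            alt = alt.lower()
            if alt != key and alt not in bucket:
                bucket.append(alt)
    return normalised


table = lowercase_synonyms({"four": ["4", "IV"]})
print(table)  # {'four': ['4', 'iv']}
```

With the table in that form, "four" would expand to four OR 4 OR iv, and
the "iv" term indexed for result 5 can match.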

ATB,
Matthew

> An example below:
> 
> Query string is: grand theft auto four
> Parsed query is: Xapian::Query((Zgrand:(pos=1) OR Ztheft:(pos=2) OR
> Zauto:(pos=3) OR Zfour:(pos=4) OR 4:(pos=4) OR IV:(pos=4)))
> 528 results found.
> Results 1-10:
> 1: 46% ID=51009 TITLE= docid=36694 [Grand Theft Auto Compilation]
> ['XID51009', 'XPLATFORM16', 'Zauto', 'Zcollector', 'Zcompil', 'Zedit',
> 'Zgrand', 'Ztheft', 'auto', "collector's", 'compilation', 'edition',
> 'grand', 'theft']
> 2: 40% ID=6609 TITLE= docid=6569 [Grand Theft Auto Advance]
> ['XID6609', 'XPLATFORM4', 'Zadvanc', 'Zauto', 'Zgrand', 'Ztheft',
> 'advance', 'auto', 'grand', 'theft']
> 3: 40% ID=6614 TITLE= docid=6574 [Grand Theft Auto London 1961]
> ['1961', 'XID6614', 'XPLATFORM15', 'Zauto', 'Zgrand', 'Zlondon',
> 'Ztheft', 'auto', 'grand', 'london', 'theft']
> 4: 39% ID=6607 TITLE= docid=6567 [Grand Theft Auto]
> ['XID6607', 'XPLATFORM15', 'XPLATFORM16', 'XPLATFORM22', 'Zauto',
> 'Zgrand', 'Ztheft', 'auto', 'grand', 'theft']
> 5: 39% ID=24402 TITLE= docid=21668 [Grand Theft Auto IV]
> ['XID24402', 'XPLATFORM34', 'XPLATFORM69', 'Zauto', 'Zgrand', 'Ziv',
> 'Ztheft', 'auto', 'grand', 'iv', 'theft']
> 
> four has synonyms so expands to four OR 4 OR IV. I'd expect result 5
> 'Grand Theft Auto IV' to be result number 1 as it's exactly the search
> term. If I search for 'grand theft auto iv' then it is result 1.
> 
> Cheers,
> Ben.


