[Xapian-discuss] add_posting(): term position significance - line or offset?
Henry
henka at cityweb.co.za
Tue Nov 18 17:18:46 GMT 2008
> The usual use is to store the "word number" at which a word appears,
> and this is probably what you want. However, you could store the
> line number if you wanted: phrase searches (with a window of
> phrase-size) would then match when the words were fairly spread out
> (i.e., up to one per line).
>
> I recommend using word number, anyway, unless you have a very odd
> situation I've not thought of.
Thanks - I hadn't even thought of word number.
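To make the difference concrete, here is a minimal sketch in plain Python (it does not use the Xapian bindings). It builds (term, position) pairs of the kind one would pass to add_posting(), once with word numbers and once with line numbers. The helper names and the within_window() check are hypothetical: the check is a simplified stand-in for a positional window test, not Xapian's actual matcher.

```python
import re


def word_positions(text):
    """Return (term, position) pairs where position is the 1-based word number."""
    words = re.findall(r"\w+", text.lower())
    return [(word, number) for number, word in enumerate(words, start=1)]


def line_positions(text):
    """Return (term, position) pairs where position is the 1-based line number."""
    pairs = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"\w+", line.lower()):
            pairs.append((word, lineno))
    return pairs


def within_window(pairs, a, b, window):
    """Simplified window test (an assumption, not Xapian's matcher):
    True if some occurrence of a and of b are fewer than `window` positions apart."""
    pos_a = [p for term, p in pairs if term == a]
    pos_b = [p for term, p in pairs if term == b]
    return any(abs(x - y) < window for x in pos_a for y in pos_b)


text = "the candle was lit\nthen a stick fell"

# Word numbers: candle=2, stick=7, so a window of 2 does not cover both.
by_word = within_window(word_positions(text), "candle", "stick", 2)

# Line numbers: candle=1, stick=2, so the same window does cover both.
by_line = within_window(line_positions(text), "candle", "stick", 2)

print(by_word, by_line)
```

With line numbers as positions, two words on adjacent lines look "adjacent" even though several words separate them, which is the spread-out matching described in the quoted text.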
> Note that Xapian currently doesn't modify the weight of a phrase
> based on how close together the terms are ...
Sorry, I wasn't very clear: I was thinking in terms of normal
non-phrase searches, i.e. searching for [ +candle +stick ] in:
"...the candle stick was made of gold..."
would score higher (because of the proximity of the words, posting
weights aside) than:
"...the boy decided to use a stick made of wood to break the candle..."
where 'stick' and 'candle' are further apart.
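As an illustration of the proximity idea only, the sketch below measures the smallest gap between the two terms in each example sentence, using word numbers as positions. It is plain Python with hypothetical helper names; it says nothing about how Xapian computes weights, and any use of such a gap for scoring would have to be done by the application.

```python
import re


def positions(text, term):
    """Return the 1-based word numbers at which `term` occurs in `text`."""
    words = re.findall(r"\w+", text.lower())
    return [number for number, word in enumerate(words, start=1) if word == term]


def min_distance(text, a, b):
    """Smallest difference in word number between any occurrence of a and of b,
    or None if either term is absent."""
    pos_a, pos_b = positions(text, a), positions(text, b)
    if not pos_a or not pos_b:
        return None
    return min(abs(x - y) for x in pos_a for y in pos_b)


near = "the candle stick was made of gold"
far = "the boy decided to use a stick made of wood to break the candle"

near_gap = min_distance(near, "candle", "stick")  # candle=2, stick=3
far_gap = min_distance(far, "candle", "stick")    # stick=7, candle=14

print(near_gap, far_gap)  # prints "1 7"
```

A smaller gap corresponds to the first sentence, where the two words sit side by side.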
Anyway, you've answered my question, thanks!
Regards
Henry