GSOC 2018: Diversification of Search Results

James Aylett james at tartarus.org
Mon Feb 26 10:41:50 GMT 2018


On 25 Feb 2018, at 20:44, Uppinder Chugh <uppinderchugh at gmail.com> wrote:

> a) Is wrapping Xapian::MSet matcher::get_set(..) suitable in this scenario and with the API? Also, how can I let the user manually enable diversification while configuring the result set via the Matcher API?

Hi, Uppinder. You should probably look at how our letor and clustering APIs work, since both of them need an MSet to get started. (Neither is yet part of a release, so you'll need to check the source code rather than online API documentation.)

> b) Should I include the LC clustering algorithm in xapian-core/cluster (as there's the base class Cluster which can be inherited) or make it part of the diversification implementation?

You'll need to make a recommendation in your proposal. Will LC be useful for clustering? Will the other clustering algorithms be useful for diversification?

> c) Apart from the proposed methods, I'd be writing automated tests, examples and documenting the new feature. Some tips here are appreciated as I've never written tests for code.

I'd read some of the tests we have written already, as well as a basic introduction to testing. (This is a good start; it's based on Python, which is fairly readable even if you haven't worked with it before: http://www.diveintopython3.net/unit-testing.html)
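
To give a rough idea of the shape of an automated test, here is a minimal sketch using Python's standard unittest module. The diversify() function and its max_per_topic parameter are hypothetical stand-ins invented for illustration; they are not part of Xapian's API, and Xapian's own tests are written differently, so treat this only as an example of the general pattern (small input, call the code, assert on the result).

```python
import unittest


def diversify(ranked, max_per_topic=1):
    """Hypothetical helper, not Xapian code.

    Takes (doc_id, topic) pairs in rank order and keeps at most
    max_per_topic documents per topic, preserving rank order.
    """
    seen = {}   # topic -> number of documents kept so far
    kept = []
    for doc_id, topic in ranked:
        if seen.get(topic, 0) < max_per_topic:
            kept.append(doc_id)
            seen[topic] = seen.get(topic, 0) + 1
    return kept


class DiversifyTest(unittest.TestCase):
    def test_drops_repeated_topic(self):
        # Document 2 shares topic "a" with the higher-ranked document 1.
        ranked = [(1, "a"), (2, "a"), (3, "b")]
        self.assertEqual(diversify(ranked), [1, 3])

    def test_empty_input(self):
        # Edge case: nothing in, nothing out.
        self.assertEqual(diversify([]), [])

    def test_limit_is_respected(self):
        ranked = [(1, "a"), (2, "a"), (3, "a")]
        self.assertEqual(diversify(ranked, max_per_topic=2), [1, 2])
```

Saved in a file, this can be run with "python -m unittest <filename>". Each test method checks one behaviour, including an edge case, which is the habit worth carrying over to whatever test framework the project itself uses.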

> Also, for documenting, I believe only getting-started-with-xapian should be updated with examples for using the new feature.

That's true for examples; API documentation is also important.

J

-- 
 James Aylett
 devfort.com — spacelog.org — tartarus.org/james/



