[Xapian-discuss] Omega Collapse option

Sam Liddicott sam at liddicott.com
Thu Sep 8 20:24:26 BST 2005


Christiano Anderson wrote:

>Hello,
>
>I am trying to figure out how to use the COLLAPSE option on Omega. 
>
>I have some websites indexed and I want to group or 'collapse'
>duplicated documents based on their URLs. For example, I have the
>following URLs:
>
>http://www.test.com/blahblah
>http://www.test.com/blahblah2
>http://www.test.com/blahblah3
>
>http://www.acme.com/page01
>http://www.acme.com/page02
>http://www.acme.com/page03
>
>I want to make Omega show only the first occurrence of each URL. Is
>this possible using the COLLAPSE option?
>
>  
>
Yes. When you index data you will need to set one of the numbered values
to the part of the URL that you want to collapse on.
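
As a rough illustration of what "the part of the URL that you want to
collapse on" could be, here is a minimal Python sketch that uses the
hostname as the collapse key and simulates the effect of collapsing on
a ranked list of hits. It does not call Xapian; the names collapse_key
and collapse are made up for this example, and choosing the hostname
as the key is an assumption based on the URLs in the question.

```python
from urllib.parse import urlsplit


def collapse_key(url):
    """Return the part of the URL to collapse on (assumed: the hostname)."""
    # urlsplit() already lowercases the hostname; fall back to "" if absent.
    return urlsplit(url).hostname or ""


def collapse(ranked_urls):
    """Keep the first hit per key and count the later hits it hides."""
    kept = {}  # key -> [first_url, hidden_count]; dicts keep insertion order
    for url in ranked_urls:
        key = collapse_key(url)
        if key in kept:
            kept[key][1] += 1
        else:
            kept[key] = [url, 0]
    return [(url, hidden) for url, hidden in kept.values()]


hits = [
    "http://www.test.com/blahblah",
    "http://www.test.com/blahblah2",
    "http://www.acme.com/page01",
    "http://www.test.com/blahblah3",
    "http://www.acme.com/page02",
    "http://www.acme.com/page03",
]
results = collapse(hits)
```

At index time, the string returned by collapse_key is what would go
into the chosen value slot for each document.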

Then you can pass the value number as the COLLAPSE option to Omega.

If you are not sure what values are, read the scriptindex notes. Unlike
fields, which are stored as LABEL=VALUE, values are stored as
INTEGER=VALUE (a numbered slot), which permits quicker lookup at search
time.
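
If you index with scriptindex, one way to get the URL fragment into a
value is to emit it as its own input field. The sketch below writes
records in the scriptindex input format (field=value lines, records
separated by a blank line). The field name "host", the use of slot 0
and the Q and H prefixes in the commented index script are assumptions
for illustration, so check them against the scriptindex documentation
for your version.

```python
from urllib.parse import urlsplit

# Assumed index script to pair with this input (not verified here):
#   url  : field=url boolean=Q unique=Q
#   host : field=host value=0 boolean=H


def to_record(url):
    """Format one document as a scriptindex input record."""
    host = urlsplit(url).hostname or ""
    return f"url={url}\nhost={host}\n"


def to_dump(urls):
    """Join records with a blank line between them."""
    return "\n".join(to_record(u) for u in urls)
```

With an index script along those lines, the same "host" field ends up
as a value (to collapse on), a field (to read back from results) and a
boolean term (to filter on).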

You may want to alter your Omega template to show the number of
collapsed hits. If this is *more* than 1 (or 0?), you can show a link
that repeats the same search without collapsing, additionally
restricted by a boolean clause that selects only documents that include
that URL fragment as one of their terms. In other words, the link
expands THAT result to show the results that were collapsed. Orange do
this for their WAP search to avoid showing multiple hits from the same
site.

All this requires storing the "unique"-ified URL fragment three ways:
as a value, to collapse on; as a field, so you can extract it from the
results to generate the sub-search URL; and as a term, so that you can
restrict the sub-search.
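
To make the two kinds of link concrete, here is a small Python sketch
that builds the collapsed search URL and the "expand this result"
sub-search URL. P, B and COLLAPSE are Omega CGI parameters; the base
URL, slot 0 and the H term prefix are assumptions for illustration and
must match whatever you used at index time.

```python
from urllib.parse import urlencode

OMEGA = "http://search.example.org/cgi-bin/omega"  # hypothetical location
COLLAPSE_SLOT = 0   # assumed value slot holding the URL fragment
HOST_PREFIX = "H"   # assumed boolean term prefix for the URL fragment


def collapsed_search_url(query):
    """Search URL that collapses hits on the value in COLLAPSE_SLOT."""
    return OMEGA + "?" + urlencode({"P": query, "COLLAPSE": COLLAPSE_SLOT})


def expand_url(query, fragment):
    """Same query, no collapsing, filtered to one URL fragment."""
    return OMEGA + "?" + urlencode({"P": query, "B": HOST_PREFIX + fragment})
```

In the template, the fragment passed to expand_url would come from the
field stored with each hit.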

Sam


>Thanks again,
>
>Christiano
>
>
