[Xapian-discuss] How to really make use of omega/xapian? (for omega with PHP Mysql)

Olly Betts olly at survex.com
Thu Oct 5 09:17:20 BST 2006


On Wed, Oct 04, 2006 at 01:35:09PM -0700, ath wrote:
> first-post: unhtml  truncate=300 field=sample
> first-post: unhtml weight=3 index field=body

It would be more efficient to only "unhtml" once:

first-post: unhtml weight=3 index field=body truncate=300 field=sample

Do you actually want to store the whole field in Xapian?  You can, but
it's not required in order to index it, and it's potentially large...

> In the end I want to be able to search on topictitle, author and forum. Is
> this indexscript suitable for that?

It looks plausible, though I don't know exactly what's in each field.

> 2) How can I search on the indexes with the given indexscheme?
> 
> If, let's say, I want to search for topics started by a certain author
> (testwriter), I'd assume I only have to do such a search in omega.
> omega?A=testwriter&DEFAULTOP=or&DB=default&FMT=query&xDB=default&xFILTERS=--O
> However, I'm not getting any results back if I do so.

Term prefixes aren't the same as CGI parameters.

See the Omega documentation "docs/termprefixes.txt", in particular the
last section on "Probabilistic Fields".

If you want separate form fields for "author" and "body" queries, you
can't quite achieve this using Omega unmodified at present.  That really
should be possible - file a wishlist bug and I'll take a look when I'm
not in the middle of a release.  Or if you want to work on a patch, I
can point you in the right direction.

> 3) How can I safely integrate omega on my site?
> 
> I have a group permission scheme on my site. You need to be in a
> certain group to search for content in certain forums.
> I found this post
> http://thread.gmane.org/gmane.comp.search.xapian.devel/112/focus=113 but it
> didn't really help. How can I put these (<QUERY>) AND (XWORLD:yes OR
> XUSER:bill OR XGROUP:users OR XGROUP:wheel) into use with omega.
> Where do I put the XWORLD, XUSER, XGROUP-things in the index?
> And doesn't that mean that a user only has to put XGROUP:wheel in the query
> and still gets to see everything?

You'll need to modify Omega for this.

The query string is passed to the Xapian::QueryParser object which
returns a Xapian::Query object (function set_probabilistic in query.cc).
You then just need to combine this with your permissions filter,
something like this:

	Xapian::Query permissions("XGROUP:squirrels"); // Or whatever
	query = qp.parse_query(query_string);
	query = Xapian::Query(Xapian::Query::OP_FILTER, query, permissions);
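
For a user who belongs to several groups, the filter would need one term
per permission. A minimal sketch of building that list follows; the helper
name permission_terms is hypothetical, and the "XWORLD:yes" / "XUSER:" /
"XGROUP:" spellings are taken from the question and assumed to match
whatever terms your index script actually generates:

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Omega or Xapian): collects the boolean
// filter terms that one logged-in user is allowed to match.
// Assumption: documents were indexed with terms spelled exactly like this.
std::vector<std::string> permission_terms(const std::string& user,
                                          const std::vector<std::string>& groups) {
    std::vector<std::string> terms;
    terms.push_back("XWORLD:yes");       // world-readable documents
    terms.push_back("XUSER:" + user);    // documents owned by this user
    for (const std::string& group : groups) {
        terms.push_back("XGROUP:" + group);  // one term per group membership
    }
    return terms;
}

// The list could then be OR-ed together and used as the filter, e.g.:
//   Xapian::Query permissions(Xapian::Query::OP_OR, terms.begin(), terms.end());
//   query = Xapian::Query(Xapian::Query::OP_FILTER, query, permissions);
```

In this sketch the user name and groups would come from your site's own
login session, not from the text typed into the search box, so the filter
does not depend on anything the user enters as a query.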

> 4) How can I make sure that illegal characters are filtered out in omega?
> I sometimes have multilingual characters in the content, and these have
> caused the xml-output of omega to go haywire. How can I make sure that these
> kinds of characters are filtered out? I've already used unhtml, what else can
> I do?

Bear in mind that the released versions of Omega assume iso-8859-1
(utf-8 support will be in the 1.0 release) so wide and multibyte
characters won't currently be handled correctly.

Using $html{} in your query template should escape characters which
are problematic in HTML and XML.  If you're not using it, you really
should, to avoid potential cross-site scripting attacks.

If you're already using that, can you give an example of this problem?
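
To illustrate the kind of escaping meant by $html{} above, here is a rough
standalone sketch. It is not Omega's actual implementation; the function
name html_escape and the exact set of characters handled are assumptions
made for the example:

```cpp
#include <string>

// Illustrative only: replaces the characters that are special in HTML and
// XML markup with entity references, so user-supplied text cannot inject
// tags or break out of attribute values.
std::string html_escape(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char ch : in) {
        switch (ch) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += ch;       // everything else passes through
        }
    }
    return out;
}
```

Note that escaping like this only protects the markup. It does nothing
about bytes which are invalid in the character encoding declared by the
output, which is a separate problem.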

Cheers,
    Olly


