[Xapian-discuss] zip/rar support
James Aylett
james-xapian at tartarus.org
Mon Aug 7 14:59:24 BST 2006
On Mon, Aug 07, 2006 at 03:46:08PM +0200, Reini Urban wrote:
> Problems:
> * omindex support for zip,rar,outlook msg and excel xls.
>
> I hacked a preliminary custom filter for xls and msg into the omindex.cc,
> http://www.fileformat.info/format/outlookmsg/
> And added zip/rar support by decompressing into a root+"/tmp/"+file dir,
> indexing there and removing the root+"/tmp/"+file afterwards.
>
> Is this a good idea?
> Or should I prefer hacking scriptindex, which I will need sooner or
> later to support meta fields?
You won't want to hack scriptindex for this; you'll want to change the
way you generate the scriptindex data files. However, the plan (ages
ago) for omindex was to allow you to specify MIME -> generator
mappings, which still isn't a bad idea. More recently there has been
some discussion about whether the generator mechanism should perhaps
be related to the way scriptindex works, to save some code and provide
flexibility beyond what we currently provide for, say,
PDF. Unfortunately this means nothing has actually been done.
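
To make the MIME -> generator idea concrete, here is a minimal sketch of
what such a mapping could look like. It is not how omindex works today, and
nothing here is Xapian API: the GENERATORS table, the "{path}" placeholder
and both function names are hypothetical, and the commands (pdftotext, cat)
are only assumed to be installed.

```python
import mimetypes
import subprocess

# Hypothetical table: MIME type -> command that writes plain text to stdout.
# "{path}" is a placeholder invented for this sketch.
GENERATORS = {
    "application/pdf": ["pdftotext", "{path}", "-"],
    "text/plain": ["cat", "{path}"],
}


def generator_for(path):
    """Return the command to run for this file, or None if no mapping exists."""
    mime, _encoding = mimetypes.guess_type(path)
    template = GENERATORS.get(mime)
    if template is None:
        return None
    return [arg.replace("{path}", path) for arg in template]


def extract_text(path):
    """Run the mapped generator and return its output as text (None if unmapped)."""
    cmd = generator_for(path)
    if cmd is None:
        return None
    result = subprocess.run(cmd, capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

Supporting a new format would then mean adding one entry to the table
rather than touching the indexer itself.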
If you take the plunge and move to scriptindex now, you'll probably make
your life easier in the short to medium term, and considerably easier in
the long term. If you then want to provide indexing scripts, or
recipes for indexing from archives etc., then shove them up on the
wiki and they'll be available for everyone (and may make their way
into the official manual, if you give permission, assuming that at
some point I or someone else gets round to writing one :-).
You won't have to modify scriptindex at all unless you're doing
something pretty unusual.
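
As a rough illustration of generating the data files instead of changing
scriptindex, the sketch below turns the members of a zip archive into
scriptindex-style records: one name=value field per line, continuation
lines of a multi-line value prefixed with "=", and a blank line between
records. The field names (url, size, text), the function names and the URL
prefix are assumptions made for this example; a matching index script
would still have to say what to do with each field. It also assumes the
archive members are text, which real archives often are not.

```python
import zipfile


def escape_value(text):
    """Prefix continuation lines with "=" so a value can span several lines."""
    return text.replace("\r\n", "\n").rstrip("\n").replace("\n", "\n=")


def zip_to_records(archive, url_prefix):
    """Yield one scriptindex-style record per regular file in the archive."""
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Members are read in memory, so no temporary directory is needed.
            body = zf.read(info).decode("utf-8", errors="replace")
            fields = [
                ("url", url_prefix + info.filename),
                ("size", str(info.file_size)),
                ("text", body),
            ]
            yield "\n".join(f"{k}={escape_value(v)}" for k, v in fields)


def dump(archive, url_prefix):
    """Join all records, separated by a blank line, into one data file body."""
    return "\n\n".join(zip_to_records(archive, url_prefix)) + "\n"
```

The output of dump() can be written to a file and handed to scriptindex
along with an index script; RAR or other archive types would need a
different reader but could emit the same record format.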
We can't accept code to support RAR into any of the core Xapian
packages, because of patent restrictions. (At least, that's my
understanding; IANAL.)
James
--
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org