<div>Thanks, things work much better now that we pass the [--stemmer=none] option to scriptindex. All words are indexed, and can be searched for, exactly as they appear in the original content. Very good! This way we can easily avoid stemming words that are not of English origin.
</div>
<div>&nbsp;</div>
<div>#!/bin/sh<br>
#---------------------------------------------------------#<br>
# This script creates a Xapian index<br>
#---------------------------------------------------------#<br>
printf &quot;Xapian Index Start: &quot;<br>
date<br>
echo &quot;The pid of this process is $$&quot;<br>
#---------------------------------------------------------#<br>
# Dump the database table to a data file<br>
echo &quot;Retrieving data from database&quot;<br>
DBUSER=user DBPASSWORD=password /usr/local/bin/dbi2omega myDatabase myTable &gt; myData.dat<br>
# Remove the old index files, then rebuild without stemming<br>
rm -f /indexPath/*<br>
echo &quot;Indexing&quot;<br>
/usr/local/bin/scriptindex --stemmer=none /indexPath indexscript myData.dat<br>
#---------------------------------------------------------#<br>
printf &quot;Xapian Index End: &quot;<br>
date</div>
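<div>One possible refinement, as a rough sketch only: the script above removes the old index even if the database export produced nothing. The function below refuses to touch the index when the data file is missing or empty. The name rebuild_index and the INDEXER variable are made up for illustration and are not part of Xapian; the scriptindex arguments are the same ones used above.</div>
<pre>
```shell
#!/bin/sh
# Sketch: rebuild the index only when the export produced data.
# rebuild_index and INDEXER are illustrative names, not part of Xapian.

rebuild_index() {
    data_file=$1
    index_dir=$2
    script=$3

    # Keep the existing index if the export is missing or empty.
    if [ ! -s "$data_file" ]; then
        echo "No data in $data_file; keeping existing index"
        return 1
    fi

    # Remove the old index files, then index without stemming.
    rm -f "$index_dir"/*
    "${INDEXER:-/usr/local/bin/scriptindex}" --stemmer=none "$index_dir" "$script" "$data_file"
}
```
</pre>
<div>It would be called as: rebuild_index myData.dat /indexPath indexscript</div>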
<div><span class="gmail_quote">On 3/9/06, <b class="gmail_sendername">James Aylett</b> &lt;<a href="mailto:james-xapian@tartarus.org">james-xapian@tartarus.org</a>&gt; wrote:</span>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">On Thu, Mar 09, 2006 at 06:35:13PM +0000, Olly Betts wrote:<br><br>&gt; &gt; search for hiking does not return any results:
<br>&gt; &gt; <a href="http://nitra.net/cgi-bin/hladaj.cgi?a=q&amp;q=hiking&amp;c=sk">http://nitra.net/cgi-bin/hladaj.cgi?a=q&amp;q=hiking&amp;c=sk</a><br>&gt; &gt;<br>&gt; &gt; search for hike return result including hiking:
<br>&gt; &gt; <a href="http://nitra.net/cgi-bin/hladaj.cgi?a=q&amp;q=hike&amp;c=sk">http://nitra.net/cgi-bin/hladaj.cgi?a=q&amp;q=hike&amp;c=sk</a><br>&gt;<br>&gt; That hike matches hiking, but hiking doesn't strongly suggests that
<br>&gt; stemming is happening at index time.&nbsp;&nbsp;So you need to fix that.<br><br>The result matching 'hike' contains the word 'hike' in the page<br>text. I'd suggest that no stemming is happening, but that the HTML<br>&lt;title&gt;...&lt;/title&gt; isn't being indexed.
<br><br>J<br><br>--<br>/--------------------------------------------------------------------------\<br>James Aylett&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="http://xapian.org">xapian.org</a><br><a href="mailto:james@tartarus.org">
james@tartarus.org</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="http://uncertaintydivision.org">uncertaintydivision.org</a><br></blockquote></div><br>