[Xapian-discuss] scriptindex on an internet crawl

Olly Betts olly at survex.com
Thu Jun 23 18:53:43 BST 2005


OK, the multiple unique fields issue is a red herring.

The problem is there's no newline at the end of the last line of the
dump file.  This causes scriptindex to try to parse the last line
over and over again.

The attached patch fixes this, but there's the obvious easy workaround
of just doing:

echo >> raw.txt
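
To illustrate why the missing newline matters, here's a minimal sketch of the
stream behaviour involved.  It is not the scriptindex code: count_parsed(),
count_parsed_checked() and the limit parameter are hypothetical names made up
for this example, and the loop is only assumed to resemble a parser which
keeps going while the current line is non-empty.  The point is that once
std::getline() has read an unterminated last line, the stream is at EOF, and
a further std::getline() call fails without modifying the string.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a line-parsing loop (not scriptindex's code).
// Returns how many times a line was "parsed", capped at `limit` so that
// the runaway case terminates.
int count_parsed(const std::string& input, int limit) {
    assert(limit > 0);
    std::istringstream stream(input);
    std::string line;
    std::getline(stream, line);
    int parsed = 0;
    while (!line.empty() && parsed < limit) {
        ++parsed;  // Pretend to parse the current line.
        // If the previous read already hit EOF, this call fails and
        // leaves `line` holding the old contents.
        std::getline(stream, line);
    }
    return parsed;
}

// The same loop, but treating a failed read as "no more lines".
int count_parsed_checked(const std::string& input, int limit) {
    assert(limit > 0);
    std::istringstream stream(input);
    std::string line;
    std::getline(stream, line);
    int parsed = 0;
    while (!line.empty() && parsed < limit) {
        ++parsed;
        if (!std::getline(stream, line)) line.clear();
    }
    return parsed;
}
```

With a cap of 10, count_parsed("a=1\nb=2\n", 10) returns 2, but
count_parsed("a=1\nb=2", 10) runs until it hits the cap because the last line
never changes.  count_parsed_checked() returns 2 for both inputs.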

Cheers,
    Olly
-------------- next part --------------
Index: scriptindex.cc
===================================================================
--- scriptindex.cc	(revision 6303)
+++ scriptindex.cc	(working copy)
@@ -496,6 +496,7 @@
 		}
 	    }
 	    if (this_field_is_content) seen_content = true;
+	    if (stream.eof()) break;
 	}
 
 	// If we haven't seen any fields (other than unique identifiers)

