Zend_Search_Lucene index file formats are binary compatible with Java Lucene version 1.4 and later.
A detailed description of this format is available here: http://lucene.apache.org/java/docs/fileformats.html.
After index creation, the index directory will contain several files:
segments
A file listing the index segments.
*.cfs
Files containing the index segments. Note that an optimized index always has exactly one segment.
deletable
A file listing files that the index no longer uses but that could not be deleted.
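The file roles listed above can be checked against a real index directory. The following is a minimal sketch using only the Java standard library; the class name IndexFileRoles, its methods, and the role labels are hypothetical and are not part of Lucene or Zend_Search_Lucene. It assumes the file naming described above (segments, *.cfs, deletable) and reports anything else as "other".

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: labels files in an index directory by the roles listed above. */
final class IndexFileRoles {

    /** Maps a file name to its role; names not covered by the list are "other". */
    static String classify(String fileName) {
        if (fileName.equals("segments")) {
            return "segment list";
        }
        if (fileName.equals("deletable")) {
            return "pending deletions";
        }
        if (fileName.endsWith(".cfs")) {
            return "index segment";
        }
        return "other";
    }

    /** Returns file name -> role for every regular file in the given directory. */
    static Map<String, String> inspect(File indexDir) {
        Map<String, String> roles = new TreeMap<>();
        File[] files = indexDir.listFiles();
        if (files == null) {
            // Not a directory, or it could not be read: report nothing.
            return roles;
        }
        for (File f : files) {
            if (f.isFile()) {
                roles.put(f.getName(), classify(f.getName()));
            }
        }
        return roles;
    }

    public static void main(String[] args) {
        // The default path is the example index location used in this section.
        File indexDir = new File(args.length > 0 ? args[0] : "/data/my_index");
        for (Map.Entry<String, String> e : inspect(indexDir).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Running it against an optimized index should report a single "index segment" entry, since such an index has only one segment.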
The Java program listing below provides an example of how to index a file using Java Lucene:
/**
 * Index creation:
 */
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.document.*;
import java.io.*;
...
// Passing true as the third argument creates a new index at this path
IndexWriter indexWriter = new IndexWriter("/data/my_index",
                                          new SimpleAnalyzer(), true);
...
String filename = "/path/to/file-to-index.txt";
File f = new File(filename);

Document doc = new Document();
doc.add(Field.Text("path", filename));
doc.add(Field.Keyword("modified", DateField.timeToString(f.lastModified())));
doc.add(Field.Text("author", "unknown"));

// The file contents are read and tokenized through a Reader
FileInputStream is = new FileInputStream(f);
Reader reader = new BufferedReader(new InputStreamReader(is));
doc.add(Field.Text("contents", reader));

indexWriter.addDocument(doc);