Search inside Lucene in Action

Query parsed to: +field +size +limit

1 - 8 of 8 results (Page 1 of 1)

10.3.3 : Search results

starts on page 343 under section 10.3 (Using Lucene in SearchBlox) in chapter 10 (Case studies)

... 2003 07:40:15 EST 24 Nov 2003 07:40:15 EST <size>7408</size> SearchBlox<highlight>FAQs</highlight> limit on the number of collections? Why is there a limit on the number of collections? There is a limit on the number of collections due to performance reasons ... segment is generated from the Lucene Hits object. The data for the title and description fields...

2.7.3 : Limiting Field sizes: maxFieldLength

starts on page 54 under section 2.7 (Controlling the indexing process) in chapter 2 (Indexing)

... the ten-thousandth term isn't indexed. To limit Field sizes to 1,000 terms, an application sets ... example. Listing 2.6 Controlling field size with maxFieldLength public class FieldLengthTest ... Some applications index documents whose sizes aren't known in advance. To control the amount of RAM and hard-disk memory used, they need to limit the amount of input they index. Other applications deal with documents of known size but want to index only a portion of each document. For example...
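
The effect of a field-size limit described in this excerpt can be illustrated with a minimal sketch. It is not the book's Listing 2.6 and does not call Lucene: the class FieldLengthSketch and the method indexedTerms are hypothetical names, and whitespace splitting stands in for whatever analyzer an application would actually use. It only shows that terms past the limit are dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Lucene code.
class FieldLengthSketch {

    // Returns the terms that would survive a limit of maxFieldLength terms.
    // Assumption: whitespace splitting stands in for a real analyzer.
    static List<String> indexedTerms(String text, int maxFieldLength) {
        List<String> terms = new ArrayList<>();
        for (String term : text.trim().split("\\s+")) {
            if (terms.size() >= maxFieldLength) {
                break; // every term past the limit is ignored
            }
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // With a limit of 2, only the first two terms are kept.
        System.out.println(indexedTerms("alpha beta gamma delta", 2));
    }
}
```

A search for a term that appears only beyond the limit would find nothing in such a field, which is the trade-off the excerpt describes for applications that index only a portion of each document.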

2.7.1 : Tuning indexing performance

starts on page 42 under section 2.7 (Controlling the indexing process) in chapter 2 (Indexing)

... exposes several instance variables that allow you to adjust the size of this buffer and the frequency ... .lucene.mergeFactor 10 Controls segment merge frequency and size continued on next page Table ... variable value maxMergeDocs org.apache.lucene.maxMergeDocs Integer.MAX_VALUE Limits the number ... on the disk for every 10 Documents added to the index. When the tenth segment of size 10 is added, all 10 are merged into a single segment of size 100. When 10 such segments of size 100 have been added...
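
The merge arithmetic in this excerpt (a segment per 10 Documents, ten segments of size 10 merged into one of size 100, and so on) can be traced with a small simulation. This is a sketch under stated assumptions, not Lucene's merge code: MergeFactorSketch, segmentSizes, and mergeTail are hypothetical names, and the model ignores the other settings in the table, such as maxMergeDocs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the merge pattern described in the excerpt.
class MergeFactorSketch {

    // Returns the sizes of the segments on disk after docCount documents
    // have been added. Documents still sitting in the buffer are not reported.
    static List<Integer> segmentSizes(int docCount, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<Integer> segments = new ArrayList<>();
        int buffered = 0;
        for (int i = 0; i < docCount; i++) {
            buffered++;
            if (buffered == mergeFactor) {
                segments.add(buffered); // flush one segment of mergeFactor docs
                buffered = 0;
                mergeTail(segments, mergeFactor);
            }
        }
        return segments;
    }

    // While the last mergeFactor segments all have the same size,
    // replace them with a single segment holding their combined documents.
    static void mergeTail(List<Integer> segments, int mergeFactor) {
        while (segments.size() >= mergeFactor) {
            int n = segments.size();
            int size = segments.get(n - 1);
            for (int i = n - mergeFactor; i < n; i++) {
                if (segments.get(i) != size) {
                    return; // sizes differ, nothing to merge yet
                }
            }
            segments.subList(n - mergeFactor, n).clear();
            segments.add(size * mergeFactor);
        }
    }

    public static void main(String[] args) {
        System.out.println(segmentSizes(250, 10));
    }
}
```

With a merge factor of 10, adding 250 documents in this model leaves two segments of size 100 followed by five segments of size 10; the merge to size 1,000 happens only once ten segments of size 100 exist.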

4.8.3 : Analyzing Asian languages

starts on page 142 under section 4.8 (Language analysis issues) in chapter 4 (Analysis)

... for all tokenized fields in our index, which tokenizes each English word as expected (tao ... , String s) { int size = 0; for (int i = 0; i < s.length(); i++) { size += metrics.charWidth(s.charAt(i)); } return size; } } CJKAnalyzer and ChineseAnalyzer are analyzers found in the Lucene ... an AWT Label component to avoid any confusion that might arise from console output encoding or limited ... are likely to be kept together (as well as disconnected characters, increasing the index size...

4.2.2 : TokenStreams uncensored

starts on page 109 under section 4.2 (Analyzing the analyzer) in chapter 4 (Analysis)

... (for example, lowercase) characters. Tokens are limited to a maximum size of 255 characters ... that tokenizes the input from a Reader. When you're indexing a String through Field.Text(String, String) or Field.UnStored(String, String) (that is, the indexed field constructors which accept a String...

6.5.1 : Testing the speed of a search

starts on page 213 under section 6.5 (Performance testing) in chapter 6 (Extending search)

...(Calendar.DATE) - 1); b Yesterday for (int i = 0; i < size; i++) { timestamp.set(Calendar.SECOND ... document = new Document(); document.add(Field.Keyword("last-modified", now)); d As Date writer ... 2,000 OR'd TermQuerys nested in a BooleanQuery. This exceeds the default limit of 1,024 clauses ... : String today = Search.today(); for (int i = 0; i < size; i++) { Document document = new Document(); document.add(Field.Keyword("last-modified", today)); writer.addDocument(document); } Here, today...

7.3.2 : Built-in Lucene support

starts on page 239 under section 7.3 (Indexing a PDF document) in chapter 7 (Parsing common document formats)

... built into PDFBox. This approach can be quick, as you're about to see, but it also limits what is extracted from the PDF file, what Lucene Document Fields are created, and how they're analyzed ... Document with Fields named summary, producer, contents, modified, url, and path: $ java lia.handlingtypes ... in size to that of the largest supercomputer installations. Combining more than ... :java.io.InputStreamReader@1193779> org.apache.lucene.document.Field@8916a2 Keyword

Appendix B : Lucene index format

starts on page 394

... is their size, which reflects the number of Documents with that field. Now that you know this, you can ... and populate it with Fields that consist of name and value Figure B.1 The logical, black-box view of a Lucene ... file, so Lucene knows to look only for files with the _lfyc prefix. Lucene also limits itself to files ... that constitute a Lucene index and each segment varies from index to index and depends on the number of fields ... N is a number. These files correspond to the indexed fields present in the indexed Documents. Recall...