Search inside Lucene in Action

Query parsed to: +analyz +phrase

3.5.6 : Phrase queries

starts on page 98 under section 3.5 (Parsing query expressions: QueryParser) in chapter 3 (Adding search to your application)

...Terms enclosed in double quotes create a PhraseQuery. The text between the quotes is analyzed; thus the resultant PhraseQuery may not be exactly the phrase originally specified. This process has been the subject of some confusion. For example, the query "This is Some Phrase*", when analyzed ... *\"", "field", new StandardAnalyzer()); assertEquals("analyzed", "\"some phrase\"", q.toString("field ... phrase\"", "field", analyzer); assertEquals("zero slop", "\"exact phrase\"", q.toString("field...

3.4.5 : Searching by phrase: PhraseQuery

starts on page 87 under section 3.4 (Creating queries programmatically) in chapter 3 (Adding search to your application)

... and the slop factor in section 3.5.6. Phrases are analyzed by the analyzer passed to the QueryParser ... contained the phrase "the quick brown fox jumped over the lazy dog". Without knowing the exact phrase ... words; but in this case we only want documents that have phrases where the words are either exactly ... 4.8.3. reconstruct the phrase in order. Let's take the phrase just mentioned and see how the slop ... a phrase query matched the test document: public class PhraseQueryTest extends TestCase { private...

4.1 : Using analyzers

starts on page 104 in chapter 4 (Analysis)

.... Two phrases are analyzed, each by four of the built-in analyzers. The phrases are "The quick brown ... Before we get into the gory details of what lurks inside an analyzer, let's look at how an analyzer ... . In the following two sections, we detail how an analyzer is used in these scenarios. Before we begin ... ! Listing 4.1 Visualizing analyzer effects Analyzing "The quick brown fox jumped over the lazy dogs ... ] Analyzing "XY&Z Corporation - " WhitespaceAnalyzer: [XY&Z] [Corporation...

4.9 : Nutch analysis

starts on page 145 in chapter 4 (Analysis)

... to demonstrate the Nutch analysis trickery. Listing 4.10 first analyzes the phrase "The quick brown..." using ... of the Nutch architecture. There is another interesting facet to Nutch: how it analyzes text. Nutch does ... common terms are used in a query, but not within a phrase, such as the quick brown with no other ... words as a single token) technique with a query-time optimization of phrases. This results in a far ... public class NutchExample { Custom analyzer public static void main(String[] args) throws IOException...

4.7.3 : Hole lot of trouble

starts on page 138 under section 4.7 (Stemming analysis) in chapter 4 (Analysis)

... during searching. PhraseQuery and QueryParser are the two troublemakers. Exact phrase matches now fail ... ()); } } As shown, an exact phrase query didn't match. This is disturbing, of course. Unlike the synonym analyzer situation, using a different analyzer won't solve the problem. The difficulty lies ... in a PhraseQuery must be side by side, and in our test case, the phrase it's searching for is "over lazi" (stop ... showing a phrase query working. Setting the slop to 1 allows the query to effectively ignore the gap...

4.2.3 : Visualizing analyzers

starts on page 112 under section 4.2 (Analyzing the analyzer) in chapter 4 (Analysis)

.... AnalyzerDemo includes two predefined phrases and an array of the four analyzers we're focusing on in this section. Each phrase is analyzed by all the analyzers, with bracketed output to indicate ... into different types based on a grammar. Analyzing the phrase "I'll e-mail you at ... It's important to understand what various analyzers do with your text. Seeing the effect of an analyzer is a powerful and immediate aid to this understanding. Listing 4.2 provides a quick and easy...

4.0 : Analysis

starts on page 102

... analyzers Handling foreign languages 102 Analysis, in Lucene, is the process of converting field ... terms in sequence. An analyzer is an encapsulation of the analysis process. An analyzer tokenizes ... that text. In order for Lucene to know what "words" are, it analyzes the text during indexing, extracting ... analyzer is a crucial development decision with Lucene. One size doesn't fit all when it comes to choosing an analyzer. Language is one factor in choosing an analyzer, because each has its own unique...

4.2.1 : What's in a token?

starts on page 108 under section 4.2 (Analyzing the analyzer) in chapter 4 (Analysis)

... stream analyzing this phrase with the SimpleAnalyzer. Figure 4.1 Token stream with positional ... designated for tokenization are processed with the specified analyzer, and each token is written ... what forms a Token; we'll come back to how that translates into a term. For example, let's analyze ... into terms After text is analyzed during indexing, each token is posted to the index as a term ... and successive position in the field. Position increments factor directly into performing phrase queries...

6.3.4 : Allowing ordered phrase queries

starts on page 208 under section 6.3 (Extending QueryParser) in chapter 6 (Extending search)

... without the slop signature (slop makes sense only on multiterm phrase query); parsing a quoted phrase ... the analyzer.4 Given enough slop, PhraseQuery will match terms out of order in the original text ... a PhraseQuery with an ordered SpanNearQuery: protected Query getFieldQuery( String field, Analyzer analyzer, String queryText, int slop) throws ParseException { Query orig = super.getFieldQuery(field, analyzer, queryText); b Delegate to QueryParser's if (! (orig instanceof PhraseQuery...

4.3.2 : StandardAnalyzer

starts on page 120 under section 4.3 (Using the built-in analyzers) in chapter 4 (Analysis)

... analyzers on the phrase "XY&Z Corporation - ". StandardAnalyzer is the only one ... StandardAnalyzer holds the honor as the most generally useful built-in analyzer. A JavaCC-based2 grammar underlies it, tokenizing with cleverness for the following lexical types: alphanumerics, acronyms, company names, e-mail addresses, computer host names, numbers, words with an interior ... choice. Using StandardAnalyzer is no different than using any of the other analyzers, as you can see...

4.3.1 : StopAnalyzer

starts on page 119 under section 4.3 (Using the built-in analyzers) in chapter 4 (Analysis)

... analyzes phrases, and each of these reduces to "one enough" and matches the terms indexed. There is a "hole ... QueryParser along with StopAnalyzer, this document would match phrase queries for "one enough", "one ... , only the tokens emitted from the analyzer (or indexed as Field.Keyword) are available for searching....

4.4 : Dealing with keyword fields

starts on page 121 in chapter 4 (Analysis)

... b QueryParser analyzes each term and phrase of the query expression. Both Q36 and SPACE are analyzed ... (), true); Field not Document doc = new Document(); analyzed doc.add(Field.Keyword("partnum ... ("partnum:Q36 AND SPACE", "description", new SimpleAnalyzer()); b QueryParser analyzes each Hits hits = searcher.search(query); term and phrase assertEquals("note Q36 -> q", "+partnum:q +space", query ... q. But at indexing time, Q36 was left as is. Notice, also, that this is the same analyzer used during indexing...

6.3.1 : Customizing QueryParser's behavior

starts on page 203 under section 6.3 (Extending QueryParser) in chapter 6 (Extending search)

... to force ordered or phrase matches. getFieldQuery(String field, Analyzer analyzer, String queryText ... Although QueryParser has some quirks, such as the interactions with an analyzer, it does have ... ? getFieldQuery(String field, These methods are responsible for the construction of either a Analyzer analyzer, TermQuery or a PhraseQuery. If special analysis is needed, or a String queryText) unique type ... string handed to this method doesn't include the trailing asterisk and isn't analyzed. Override...

3.1.2 : Parsing a user-entered query expression: QueryParser

starts on page 72 under section 3.1 (Implementing a simple search feature) in chapter 3 (Adding search to your application)

... is the primary purpose of the QueryParser. QueryParser requires an analyzer to break pieces ... be parsed into something at a lower level that the database server can understand directly. analyzer. Querying through the API using TermQuery and the others discussed in section 3.4 doesn't use an analyzer ... signature is static public Query parse(String query, String field, Analyzer analyzer) throws ... on this in section 3.5.4). The final argument is an Analyzer instance. (We discuss analyzers in detail...

5.4 : Span queries: Lucene's new hidden gem

starts on page 161 in chapter 5 (Advanced search techniques)

... we've discussed thus far, it isn't possible to formulate such a query. Phrase queries could get close with something like "quick fox" AND "lazy dog", but these phrases may be too distant from one another ... methods to make our later code clearer, as shown in listing 5.3. We index two similar phrases in a field ... cat; private Analyzer analyzer; protected void setUp() throws Exception { directory = new RAMDirectory(); analyzer = new WhitespaceAnalyzer(); IndexWriter writer = new IndexWriter(directory, analyzer...

4.5 : "Sounds like" querying

starts on page 125 in chapter 4 (Analysis)

... to view what an analyzer is doing with text. Using our AnalyzerUtils, two phrases that sound similar ... , we'll implement an analyzer to convert words to a phonetic root using an implementation ... () throws Exception { RAMDirectory directory = new RAMDirectory(); Analyzer analyzer = new MetaphoneReplacementAnalyzer(); IndexWriter writer = new IndexWriter(directory, analyzer, true); Document doc ... = QueryParser.parse("kool kat", User typed in "contents", hip query analyzer); Hits hits = searcher.search...

4.6 : Synonyms, aliases, and words that mean the same

starts on page 128 in chapter 4 (Analysis)

....length()); Analyzer can't find document using phrase query = QueryParser.parse("\"fox jumps\"", from ... mean the same Our next custom analyzer injects synonyms of words into the outgoing token stream ... [] tokens = AnalyzerUtils.tokensFromAnalysis(synonymAnalyzer4, "jumps"); Analyze one word AnalyzerUtils ... Analyzer { private SynonymEngine engine; public SynonymAnalyzer(SynonymEngine engine) { this.engine ... (reader))), StandardAnalyzer.STOP_WORDS), engine ); return result; } } Once again, the analyzer code...

6.3.3 : Handling numeric field-range queries

starts on page 205 under section 6.3 (Extending QueryParser) in chapter 6 (Extending search)

... Analyzer analyzer; private RAMDirectory directory; protected void setUp() throws Exception { super.setUp(); analyzer = new WhitespaceAnalyzer(); directory = new RAMDirectory(); IndexWriter writer = new IndexWriter(directory, analyzer, true); for (int i = 1; i <= 500; i++) { Document doc = new ... 346 phrased as id:[37 TO 346] won't work as expected with the 3 Lucene stores term information ... the getRangeQuery() method: protected Query getRangeQuery(String field, Analyzer analyzer, String part...

10.6.2 : Searching content

starts on page 367 under section 10.6 (Artful searching at Michaels.com) in chapter 10 (Case studies)

... that matching documents must contain the word.8 For example, given a search phrase of "Mother and child ... , had the 8 Authors' note: There are enough odd interactions between analyzers and QueryParser...

5.3 : Querying on multiple fields at once

starts on page 159 in chapter 5 (Advanced search techniques)

... and zero-slop default phrase queries. NOTE Generally speaking, querying on multiple fields isn't the best ... + " " + subjects)); We used a space (" ") between author and subjects to separate words for the analyzer ... , be sure your queries are fabricated appropriately using the QueryParser and Analyzer diagnostic...