Lucene in Action

Yeehaw!

Ulf Dittmer reviewed LIA over at JavaRanch. Thanks, Ulf!

...a book that must be considered indispensable by serious users.

Posted on Mon, 21 May 2007 13:28

Lucene in Action, Korean translation

Lucene in Action has recently been translated into Korean by Cheolgoo Kang, Seongjin Ju, and Moonho Lee. The translated version was published in October 2005. Thank you, Cheolgoo, Seongjin, and Moonho!

Posted on Mon, 7 Nov 2005 23:51

Ferret - Lucene for Ruby!

Dave Balmain has done a wonderful thing: he ported Lucene to Ruby, and the port is called Ferret. He released it in a very robust, usable, and well-documented state. With Rails increasingly becoming the web platform of choice, Ferret is surely a hit.

Posted on Mon, 24 Oct 2005 15:25

After all the good reviews and very positive feedback about Lucene in Action that we have received over the last 10 months, we finally came across a not-so-positive review. The review can be broken down into the following four main parts:

Lack of import statements
Authors didn't test the code
OOP is not suitable for Lucene code examples and there are no direct Lucene calls
Need for a command-line tool for HTML indexing

As Amazon's site doesn't let us provide feedback and respond to the review there, we thought we would address these issues here and hopefully help the reviewer get more out of our book. Let's address each of the four concerns:

Lack of import statements
Code examples in the book purposely don't contain import statements. The list of import statements would often be rather long. If we included all the imports, the code examples would be much longer and would often span multiple pages, making them harder for readers to follow. The imports would also be repetitive, as most examples import the same or a very similar set of Lucene classes. Including them would result in a thicker, heavier, and thus more expensive book.

So how should one deal with the lack of import statements?

Firstly, all code examples from Lucene in Action are free and available for download, even for those who don't own a copy of the book. The code is packaged with an Ant script that can compile all the code, create all needed indexes, and run the code examples from the book.

Secondly, one can import all the code in any modern Java IDE and easily see which classes come from which packages.

This is also described in the book itself, in the "About the Book" section on page xxvii, in the last sentence in the paragraph titled "Code examples".

Authors didn't test the code
One of the novel and interesting aspects of Lucene in Action is that most of its code examples are written as unit tests. All code examples are, therefore, automatically tested. We used the excellent JUnit unit test framework to build the examples, and we provided the reasoning behind this in the "About the Book" section on page xxvii, in the paragraph titled "Why JUnit?".

OOP is not suitable for Lucene code examples and there are no direct Lucene calls
All the calls to Lucene are direct calls; they are simply presented inside unit tests. It sounds like the reviewer is confusing OOP with unit testing.

Need for a command-line tool for HTML indexing
We present just such a tool in Chapter 7, in section 7.4.2. The chapter also includes a whole mini-framework for indexing other file types (e.g. XML, Word, PDF, etc.).

Posted on Wed, 21 Sep 2005 05:47

Dallas Lucene presentation - September 14, 2005

Erik will be presenting Lucene to the Dallas Java Users Group on Wednesday, September 14.

Posted on Tue, 13 Sep 2005 16:33

Lucene powering CNET.com Product Category Listings

Lucene expert Chris Hostetter announced, with detailed implementation notes, that CNET.com's Product Category Listings are powered by Lucene. The filtering and caching details are well worth the read.

Posted on Wed, 31 Aug 2005 09:06

Slashdot review: Lucene in Action - 9 of 10

Simon P. Chappell wrote a nice Lucene in Action review on Slashdot, giving it a 9 of 10 rating. Thank you, Simon!

Posted on Wed, 24 Aug 2005 23:01

"Needless to say, the customer is extremely happy"

We just received this unsolicited e-mail that made our day:

Excellent book. I purchased the book and used it to help deliver a knowledge management solution based upon the C# DotLucene. I found the material very well written and organized. Even though the code samples were in Java, I found it was easy to translate to C# due to the way you presented the concepts.

I was able to create a MS Content Management Server crawler to index 15,000 knowledge base articles in less than 15 minutes. The customer's previous indexer used Full-Text indexing for SQL Server and took well over 2 hours. Searches that used to take over 7 seconds now return in milliseconds, not to mention the fact that the search results are much more relevant than before. Needless to say, the customer is extremely happy.

Thanks to your book, I was able to get this project completed in 5 weeks. Prior to that, I had very little knowledge of search engines or content indexing. I hope to have the opportunity to use Lucene in future projects. If you decide to write more, I would love to see an "advanced" book that describes some more of the inner workings of Lucene. I am interested in learning more about things like term vector calculations and more advanced analysis concepts.

I have learned a great deal and look forward to seeing more in the future. Great job and thanks!

We will definitely keep these suggestions in mind for the second edition of LIA. If you'd like to send us suggestions, comments, compliments, or errata, please contact us.

Posted on Wed, 20 Jul 2005 08:56

Lucene at SourceForge: Behind the Scenes

Chris Conrad, a SourceForge Engineer, describes how SourceForge uses Lucene in Behind the Scenes of the SourceForge.net Search System.

Posted on Thu, 30 Jun 2005 00:44

SearchMorph - javadoc searching

David Spencer at SearchMorph has just updated his . From the :

I have been collecting URLs to javadoc-generated pages and have updated the index of javadoc trees. Now the Lucene index includes over 162,000 documents (individual pages or URLs) from 630 javadoc trees.

Posted on Mon, 20 Jun 2005 09:48

Lucene in the Piggy Bank

The SIMILE team at MIT recently released Piggy Bank 2.0. Piggy Bank is a very slick semantic web Firefox extension allowing the collection of rich content around the web. What does this have to do with Lucene? The Firefox extension embeds a full Java web server including Lucene for full-text search, as you can see from the Piggy Bank .

For some more information, read Stefano's blog entry titled

Congrats to the SIMILE team!

Posted on Wed, 25 May 2005 15:17

The number of search sites is also skyrocketing, as the technological barrier for search has been lowered. In December, Manning Publications published Lucene in Action, a guide to creating search engines with open-source software called Lucene, which is already used on singer Bob Dylan's Web site. Armed with this book, the average programmer can create a search engine in less than a day, says co-author Erik Hatcher.

Posted on Tue, 22 Mar 2005 10:02

"...grab this book..."

Valentin Crettaz is publishing a review of Lucene in Action at Java Ranch soon. He's already.

Posted on Tue, 15 Mar 2005 13:13