Memory leak in custom sort code

Brian Riddle e-mailed us quite a detailed errata item, and with his permission I'm posting the e-mail in its entirety in order to preserve the details:

First *huge* thanks for your book Lucene In Action between it and the lucene develepers and user mailing lists i have been able to give our site a much better search infrastructure.

In the last phase of rolling out our new search system we discovered a memory leak in listing 6.2 DistanceComparatorSource. I used that code as a base for a modified integer sort. That was in and of it self pretty straight forward. But the problem was there was no equals and hash code method. That means that equals and hashcode are inherited from object for DistanceScoreDocLookupComparator.

And there in lies the memory leak. Everytime a new DisctanceComparatorSource was retrieved it failed to find the cached value ScoreDocComparator. So it added it to the cache of ScoreDocCompatators kept by o.a.l.s.FieldCacheImpl. The fix was to add a hashcode and equals method to ou ScoreDocCompatator implementation.

The big clue came after using's profiler to see what was allocating so much memory and reading the last paragraph on page 199 a couple of times.

"The sorting infrastructure within Lucene caches (based on a key combining the hashcode of the indexReader, the field name, and the custom sort object) ..."

That sentence gave the clue as to what was happening but it is also a little misleading. Looking at o.a.l.s.FieldCacheImpl The index reader is used as the key for the internal WeakHashMap of the different Entry(fieldName, ScoreDocComparator) that are used in an application.

If implementations of ScoreDocComapartors do not implment hashcode and equals correctly every time they are used they will be added to the internal cache of field/comparators.

This was complete my fault as I usually add the to every class i write, not however in this case. I hope you can add this to the errata for the currrent addition (as well as fix the code) and expand on this in the Second addition so others won't be bitten by this bug.

Thanks again for the book you guys *rock*.

PS. We are using lucene-1.4.3.jar /jsdk 1.4.2 & jre 1.5 solaris and linux

Posted on Wed, 1 Mar 2006 04:03