Setting the BooleanQuery maxClauseCount in Elasticsearch

October 18, 2013

If you have come across the TooManyClauses exception while querying Elasticsearch, chances are you are using a terms query, prefix query, fuzzy query, wildcard query or range query that ends up expanding into more than 1024 boolean clauses. That number is the default maximum clause count in Lucene's [BooleanQuery](http://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/search/BooleanQuery.html#getMaxClauseCount(\)), which is what these queries are rewritten into underneath.
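To make the expansion concrete, here is a small simulation in plain Python. It is not Lucene's code: the `expand_prefix` function, the `TooManyClauses` stand-in class and the term dictionary are all made up for illustration. It only assumes that a prefix query becomes one clause per matching indexed term and that the limit is 1024.

```python
MAX_CLAUSE_COUNT = 1024  # Lucene's default BooleanQuery limit


class TooManyClauses(Exception):
    """Stand-in for Lucene's BooleanQuery.TooManyClauses (illustration only)."""


def expand_prefix(prefix, indexed_terms, max_clauses=MAX_CLAUSE_COUNT):
    """Return one clause per indexed term that starts with `prefix`.

    Raises TooManyClauses when the expansion exceeds `max_clauses`,
    imitating what happens when a multi-term query is rewritten.
    """
    clauses = [term for term in indexed_terms if term.startswith(prefix)]
    if len(clauses) > max_clauses:
        raise TooManyClauses(
            "%d clauses exceeds the limit of %d" % (len(clauses), max_clauses)
        )
    return clauses


# Hypothetical term dictionary: 2000 user names that all start with "user".
terms = ["user%04d" % i for i in range(2000)]

# A narrow prefix matches user0000..user0099, i.e. 100 clauses: fine.
narrow = expand_prefix("user00", terms)

# A broad prefix matches all 2000 terms and goes over the limit.
try:
    expand_prefix("user", terms)
    overflowed = False
except TooManyClauses:
    overflowed = True
```

The point is that the query text stays tiny while the number of clauses depends entirely on what is in the index, which is why the exception tends to show up only once the data has grown.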

That setting is there as a sensible default for memory conservation and performance reasons, so unless you know what you are doing you should probably avoid changing it. The Lucene FAQ mentions a few approaches to overcoming the TooManyClauses exception which apply to Elasticsearch as well. One of them is to replace the query that causes the exception with a filter, which is cacheable and much more efficient. In Elasticsearch, most of the above queries have filter equivalents.
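As a rough sketch of the filter approach, the snippet below builds a search body that wraps a terms filter in a `filtered` query, which is how filters were attached to a search in the Elasticsearch releases around the time of this post (later releases express the same idea with a `bool` query and a `filter` clause). The helper name `filtered_terms_search`, the `user_id` field and the values are hypothetical.

```python
import json


def filtered_terms_search(field, values):
    """Build a search body that matches `values` on `field` via a terms filter.

    The filter only decides which documents match; it does not contribute
    to the relevance score.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"terms": {field: list(values)}},
            }
        }
    }


# Hypothetical field and values: 5000 ids, well past the 1024 clause limit.
body = filtered_terms_search("user_id", range(5000))

# JSON string to send as the body of a search request.
payload = json.dumps(body)
```

Because the filter does not score documents, this only helps when you do not need relevance ranking from the terms themselves.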

Having said that, if you really need to use a query instead of a filter (e.g. you need your results sorted by relevance), then you can bump the maxClauseCount by setting the following in the Elasticsearch config file:

[...]
index.query.bool.max_clause_count: 4096
[...]

Note that since this is a static Lucene setting, it can only be set in the Elasticsearch config file and is picked up only at startup, so changing it requires restarting the node.

Update

@imotov (who pointed out index.query.bool.max_clause_count to me in the first place) also suggests looking into the rewrite parameter if you are using a multi term query. It gives you additional options for controlling how such queries are rewritten into boolean queries in Elasticsearch.
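For illustration, here is roughly what a prefix query using the rewrite parameter could look like, built as a Python dict. The `top_terms_N` rewrite method keeps only the N highest scoring terms, so the expansion cannot overflow the clause limit, at the cost of possibly ignoring some matching terms. The helper name `prefix_query_with_rewrite`, the `user` field and the choice of N = 100 are assumptions made for this sketch.

```python
import json


def prefix_query_with_rewrite(field, prefix, rewrite="top_terms_100"):
    """Build a prefix query body that sets the `rewrite` parameter.

    With `top_terms_N`, only the N best scoring terms are kept as
    boolean clauses, so N should stay below the max clause count.
    """
    return {
        "query": {
            "prefix": {
                field: {"value": prefix, "rewrite": rewrite},
            }
        }
    }


# Hypothetical field and prefix.
body = prefix_query_with_rewrite("user", "ki")

# JSON string to send as the body of a search request.
payload = json.dumps(body)
```

Unlike switching to a filter, this keeps relevance scoring, which makes it worth a look before raising max_clause_count.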
