Setting the BooleanQuery maxClauseCount in Elasticsearch
If you have come across the TooManyClauses exception while querying Elasticsearch, chances are you are using a terms query, prefix query, fuzzy query, wildcard query or range query that ends up expanding into more than 1024 boolean clauses. That limit is the default setting in Lucene's [BooleanQuery](http://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/search/BooleanQuery.html#getMaxClauseCount(\)), which is what lies underneath.
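To make the expansion concrete, here is a minimal Python sketch of the idea, not Lucene's actual code: a prefix query is modelled as one clause per indexed term that starts with the prefix, and the expansion fails once the clause limit is hit. The `expand_prefix` helper, the `TooManyClauses` class and the sample terms are all made up for illustration.

```python
MAX_CLAUSE_COUNT = 1024  # Lucene's default BooleanQuery limit


class TooManyClauses(Exception):
    """Stand-in for Lucene's exception; not the real Java class."""


def expand_prefix(prefix, indexed_terms, max_clauses=MAX_CLAUSE_COUNT):
    """Return one clause per indexed term that starts with `prefix`."""
    clauses = []
    for term in indexed_terms:
        if term.startswith(prefix):
            # Adding a clause beyond the limit is what triggers the error.
            if len(clauses) >= max_clauses:
                raise TooManyClauses(f"maxClauseCount is set to {max_clauses}")
            clauses.append(term)
    return clauses


# Hypothetical term dictionary: user0000 .. user1999
terms = [f"user{i:04d}" for i in range(2000)]

print(len(expand_prefix("user00", terms)))  # narrow prefix, stays under the limit

try:
    expand_prefix("user", terms)  # matches all 2000 terms
except TooManyClauses as exc:
    print("failed:", exc)
```

The narrow prefix expands to a handful of clauses, while the broad one matches every term and blows past the limit, which is the situation the exception reports.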
That setting is a sensible default that conserves memory and protects performance, so unless you know what you are doing you should probably avoid changing it. The Lucene FAQ mentions a few approaches to overcoming the TooManyClauses exception, which apply to Elasticsearch as well. One of them is to replace the query that causes the exception with a filter, which is cacheable and much more efficient. In Elasticsearch, most of the above queries have filter equivalents:
- terms query => terms filter
- prefix query => prefix filter
- fuzzy query => N/A
- wildcard query => N/A
- range query => range filter
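As a rough sketch of what the swap looks like, the Python snippet below builds a request body that wraps a terms filter in a `filtered` query. This assumes the older query DSL that still had separate filters (the `filtered` query was later deprecated and removed); the `terms_as_filter` helper and the `user_id` field are hypothetical names used only for illustration.

```python
import json


def terms_as_filter(field, values):
    """Build a search body that matches `values` via a terms filter.

    Assumes the older Elasticsearch DSL with a `filtered` query.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"terms": {field: values}},
            }
        }
    }


# 5000 values would overflow a terms *query* at the default limit of 1024.
body = terms_as_filter("user_id", [str(i) for i in range(5000)])
print(json.dumps(body)[:120])
```

The resulting dictionary can be serialized and sent as the body of a search request. Because a filter does not score documents, every hit comes back with the same score.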
Having said that, if you really need to use a query instead of a filter (e.g. you need your results sorted by relevance), then you can bump the maxClauseCount by setting the following in the Elasticsearch config file:
[...] index.query.bool.max_clause_count: 4096 [...]
Note that since this is a static Lucene setting, it can only be set in the Elasticsearch config file and is picked up at startup.
@imotov (who pointed out index.query.bool.max_clause_count to me in the first place) also suggests looking into the rewrite parameter if you are using a multi term query, as an additional source of options for controlling how boolean queries are rewritten in Elasticsearch.
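For illustration, here is a small Python sketch of a prefix query that passes the rewrite parameter. It assumes the `top_terms_N` rewrite method, which keeps only the N highest-scoring terms and so stays under the clause limit. The `prefix_with_rewrite` helper and the `user` field are hypothetical, and the exact set of accepted rewrite values depends on your Elasticsearch version, so check the documentation for yours.

```python
import json


def prefix_with_rewrite(field, prefix, rewrite="top_terms_100"):
    """Build a prefix query body with an explicit rewrite method."""
    return {
        "query": {
            "prefix": {
                field: {"value": prefix, "rewrite": rewrite}
            }
        }
    }


body = prefix_with_rewrite("user", "ki")
print(json.dumps(body, indent=2))
```

With a `top_terms_N` style rewrite the results are still scored, but only the top N expanded terms contribute, so this is a trade-off between completeness and staying within maxClauseCount.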