Setting the BooleanQuery maxClauseCount in Elasticsearch
If you have come across the TooManyClauses exception while querying Elasticsearch, chances are you are using a terms query, prefix query, fuzzy query, wildcard query or range query that ends up expanding into more than 1024 boolean clauses. That limit is the default in Lucene’s BooleanQuery, which is what lies underneath. It is there as a sensible default for memory conservation and performance reasons, so unless you know what you are doing you should probably avoid changing it.
The Lucene FAQ mentions a few approaches to overcoming the TooManyClauses exception, and they apply to Elasticsearch as well. One of them is to replace the query that causes the exception with a filter, which is cacheable and much more efficient. In Elasticsearch, most of the above queries have filter equivalents:
- terms query => terms filter
- prefix query => prefix filter
- fuzzy query => N/A
- wildcard query => N/A
- range query => range filter
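As a rough illustration of the first mapping, here is a minimal Python sketch that builds a request body using a terms filter wrapped in a filtered query with match_all. It assumes an Elasticsearch version that still has a separate filter DSL; the field name user_id, the helper name and the generated values are hypothetical.

```python
import json


def terms_as_filter(field, values):
    """Build a filtered query whose only restriction is a terms filter."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"terms": {field: list(values)}},
            }
        }
    }


# Hypothetical field name and IDs, for illustration only.
body = terms_as_filter("user_id", range(2000))

# This JSON string is what would be sent as the search request body.
payload = json.dumps(body)
```

The resulting payload would be sent to the _search endpoint with whatever HTTP client you already use. Because every hit passes through match_all, the results carry no meaningful relevance score.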
Having said that, if you really need to use a query instead of a filter (e.g. you need your results sorted by relevance), then you can raise the maxClauseCount by setting the following in the Elasticsearch config file:
[...]
index.query.bool.max_clause_count: 4096
[...]
Note that since this is a static Lucene setting, it can only be set in the Elasticsearch config file and is only picked up at startup.
Update
@imotov (who pointed out index.query.bool.max_clause_count to me in the first place) also suggests looking into the rewrite parameter if you are using a multi term query, as an additional source of options for controlling how boolean queries are rewritten in Elasticsearch.
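To make the rewrite suggestion concrete, here is a small Python sketch that builds a wildcard query with a rewrite value of top_terms_100. That is one of the documented top_terms_N rewrite methods, which keeps only the N best-scoring terms and so should stay under the clause limit. The field name title, the pattern and the choice of 100 are assumptions for illustration; check the rewrite values supported by your Elasticsearch version.

```python
import json


def wildcard_with_rewrite(field, pattern, rewrite):
    """Build a wildcard query that carries an explicit rewrite method."""
    return {
        "query": {
            "wildcard": {
                field: {"value": pattern, "rewrite": rewrite}
            }
        }
    }


# Hypothetical field and pattern; top_terms_100 keeps the 100 best terms.
body = wildcard_with_rewrite("title", "elast*", "top_terms_100")
payload = json.dumps(body)
```

Unlike raising max_clause_count, this is set per query, so it does not require a restart.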