We have recently created a fix for JSWSERVER-20133. The bug causes indexing to fail when plugins with custom indexing code attempt to create very large Lucene terms or DocValues fields. It stems from Lucene’s hard limit of 32766 bytes for a single term or DocValues field. In practice, most of the cases we’ve seen involved attempts to index very large JSON documents without tokenizing them.
Starting with Jira 8.4.1, fields that exceed this limit will be removed before they are committed to Jira’s Lucene index in order to prevent entire indexing operations from failing. Each such event will emit an ERROR level message to the logs, allowing plugin developers to pinpoint the offending fields. The log entry looks like this:
2019-09-04 20:06:24,091 IssueIndexer:thread-6 ERROR admin 1206x3196x1 ujgapq 0:0:0:0:0:0:0:1 /secure/admin/IndexReIndex!reindex.jspa [c.a.j.issue.index.DocumentScrubber] A document contained a potential immense term in field customfield_10220_timeline. The field has been removed from the document.
Ultimately, the underlying problem can only be solved by fixing the plugin’s indexing code:
- In the case of immense terms, using a tokenized field should be enough (i.e. indexing as a TextField, which is tokenized by the analyzer, as opposed to a StringField, which is indexed as a single untokenized term).
- In the case of DocValues, the recommended approach is to truncate the field’s value so it fits below Lucene’s 32766-byte limit.
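For the truncation case, note that the limit applies to the UTF-8 encoded byte length, not the Java string length, so a naive `substring` can still exceed the limit (or split a multi-byte character). The helper below is a minimal sketch of byte-safe truncation using only the JDK; the class and method names are illustrative, not part of any Jira or Lucene API:

```java
import java.nio.charset.StandardCharsets;

public class DocValuesTruncator {
    // Lucene rejects single terms/DocValues fields larger than 32766 bytes.
    static final int MAX_BYTES = 32766;

    /**
     * Truncates a string so its UTF-8 encoding fits within MAX_BYTES,
     * backing up to a character boundary so no multi-byte character
     * is cut in half.
     */
    static String truncateToByteLimit(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        if (bytes.length <= MAX_BYTES) {
            return value;
        }
        int end = MAX_BYTES;
        // UTF-8 continuation bytes match the bit pattern 10xxxxxx;
        // step back past them so the cut lands on a character start.
        while (end > 0 && (bytes[end] & 0xC0) == 0x80) {
            end--;
        }
        return new String(bytes, 0, end, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String huge = "x".repeat(40000);
        String fitted = truncateToByteLimit(huge);
        // The truncated value now fits under Lucene's limit.
        System.out.println(fitted.getBytes(StandardCharsets.UTF_8).length); // 32766
    }
}
```

Truncating this way before handing the value to the indexer keeps the rest of the document indexable instead of losing the field entirely to the scrubber.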