A high number of custom fields can negatively impact performance, index size and indexing time in Jira. A big chunk of this impact is caused by the time it takes to index certain custom fields. This problem is especially painful for our biggest Data Center customers.
To minimise this impact, we’re introducing:
- a view that surfaces the custom fields that take the longest to index,
- optimizations that reduce the number of field indexers called,
- optimizations that reduce the amount of data stored in the Lucene index.
A full reindex is required to benefit from the changes.
This post describes what these changes are and how they can impact your apps.
Change 1: View the custom fields that take longest to index
When a custom field takes a long time to index, it can cause sudden spikes in indexing time. Reindexing time is usually not evenly distributed, and a few fields may account for most of it. Now, instead of checking the logs for the stats, you can view them in Jira Data Center: click Actions > Custom field indexing for a specific node (see EAP release notes for more details). If you see a custom field that is introduced by your app, you can change its configuration to improve overall performance.
Note: you can use these stats to validate how your app will impact reindexing time.
Change 2: Reducing the number of field indexer calls when indexing an issue
Whenever an issue is being indexed, Jira retrieves all registered custom field indexers and calls their addIndex() method regardless of whether the custom field is applicable for the issue or not. This generates additional overhead and affects indexing time.
To reduce this overhead we’re introducing two improvements:
Optimization 1: We’re calling indexers only when their custom fields have values assigned
An experimental API lets you mark your custom field type and its indexer so that the indexer is called only for existing values. See more details on how to use it here.
Optimization 2: We’re calling indexers only when their custom fields are applicable for the issue (they are visible and have context assigned)
This optimization is enabled for every custom field and indexer, and cannot be disabled selectively for individual fields or indexers.
As a result, only the indexers for the custom fields that are applicable for the issue will be called.
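The combined effect of the two optimizations can be pictured as a filter applied before any indexer runs. The following is a minimal, self-contained model and not Jira's implementation: `FieldIndexerModel`, `IssueModel` and `selectIndexers` are hypothetical names invented for illustration, and the `valueDriven` flag stands in for whatever the experimental opt-in API actually provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified model of indexer selection. All types are hypothetical stand-ins,
// not Jira API classes.
class IndexerSelectionSketch {

    // A custom field indexer; valueDriven marks one that opted in to optimization 1.
    static final class FieldIndexerModel {
        final String fieldId;
        final boolean valueDriven;

        FieldIndexerModel(String fieldId, boolean valueDriven) {
            this.fieldId = fieldId;
            this.valueDriven = valueDriven;
        }
    }

    // An issue: the fields applicable to it (visible, in context) and its stored values.
    static final class IssueModel {
        final Set<String> applicableFieldIds;
        final Map<String, Object> values;

        IssueModel(Set<String> applicableFieldIds, Map<String, Object> values) {
            this.applicableFieldIds = applicableFieldIds;
            this.values = values;
        }
    }

    static List<FieldIndexerModel> selectIndexers(List<FieldIndexerModel> all, IssueModel issue) {
        List<FieldIndexerModel> selected = new ArrayList<>();
        for (FieldIndexerModel indexer : all) {
            // Optimization 2: skip fields that are hidden or out of context for this issue.
            if (!issue.applicableFieldIds.contains(indexer.fieldId)) {
                continue;
            }
            // Optimization 1: opted-in indexers are skipped when the issue has no value.
            if (indexer.valueDriven && issue.values.get(indexer.fieldId) == null) {
                continue;
            }
            selected.add(indexer);
        }
        return selected;
    }

    public static void main(String[] args) {
        List<FieldIndexerModel> all = List.of(
                new FieldIndexerModel("cf_1", true),   // applicable, has a value
                new FieldIndexerModel("cf_2", true),   // applicable, no value
                new FieldIndexerModel("cf_3", false),  // applicable, no value, not opted in
                new FieldIndexerModel("cf_4", false)); // not applicable to this issue
        IssueModel issue = new IssueModel(
                Set.of("cf_1", "cf_2", "cf_3"),
                Map.<String, Object>of("cf_1", "High"));
        for (FieldIndexerModel indexer : selectIndexers(all, issue)) {
            System.out.println(indexer.fieldId); // prints cf_1, then cf_3
        }
    }
}
```

In this model, `cf_3` is still called even though it has no value, because its indexer has not opted in to the value-driven behaviour.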
Benefits
After reducing the number of called indexers, our tests showed a significant reduction in reindexing time (up to 70% improvement for Jira custom fields).
The optimizations will also decrease the response time for the actions that involve changing the issue (create / edit issue) and that trigger reindexing (adding or updating comments).
How does it impact my apps?
For a custom field type to benefit from optimization 1, it needs to explicitly implement the new API. We’ve done this for the standard Jira custom fields; as an app vendor, you need to opt in to the new API to get the benefits for your own custom field types. Because the API is opt-in, your app should not be affected by optimization 1 until you adopt it.
We want to hear your feedback on how the new APIs fulfil your use cases before we mark them as stable. Feel free to share it under this blog post.
Optimized built-in Jira custom fields:
Field Name | FieldType | Searcher | Indexer |
---|---|---|---|
Checkboxes | MultiSelectCFType | MultiSelectSearcher | SelectCustomFieldIndexer |
Date Picker | DateCFType | DateRangeSearcher | LocalDateIndexer |
Date Time Picker | DateTimeCFType | DateTimeRangeSearcher | DateCustomFieldIndexer |
Number Field | NumberCFType | ExactNumberSearcher | NumberCustomFieldIndexer |
Project Picker (single project) | ProjectCFType | ProjectSearcher | ProjectCustomFieldIndexer |
Radio Buttons | SelectCFType | MultiSelectSearcher | SelectCustomFieldIndexer |
Select List (cascading) | CascadingSelectCFType | CascadingSelectSearcher | CascadingSelectCustomFieldIndexer |
Select List (multiple choices) | MultiSelectCFType | MultiSelectSearcher | SelectCustomFieldIndexer |
Text Field (multiple line) | TextAreaCFType | TextSearcher | SortableTextCustomFieldIndexer |
Select List (single choice) | SelectCFType | MultiSelectSearcher | SelectCustomFieldIndexer |
Text Field (read only) | ReadOnlyCFType | TextSearcher | SortableTextCustomFieldIndexer |
Text Field (single line) | RenderableTextCFType | TextSearcher | SortableTextCustomFieldIndexer |
URL Field | URLCFType | ExactTextSearcher | ExactTextCustomFieldIndexer |
Version Picker (multiple versions) | VersionCFType | VersionPickerSearcher | VersionCustomFieldIndexer |
Version Picker (single version) | VersionCFType | VersionPickerSearcher | VersionCustomFieldIndexer |
Optimization 2 may affect your app, because we stop calling indexers for ALL field types (including the custom types you can create) when a field is not visible or not in the context of a specific issue.
In other words, if your app relies on writing something to the index (or executing some other code) for fields that are not visible or out of context, then this functionality will stop working. The end goal is to write to an index only when needed.
Also, if your indexer inherits from AbstractCustomFieldIndexer, review any logic in addDocumentFieldsNotSearchable(), as this method will no longer be called. If your indexer implements FieldIndexer directly, review any logic that runs when isFieldVisibleAndInScope returns false.
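To make the affected code path concrete, here is a small self-contained model of an indexer that has both a visible and a not-visible branch. It does not use the Jira API: `IndexerModel`, `indexBefore` and `indexAfter` are hypothetical, only the method names mirror the ones mentioned above, and how Jira itself decides applicability before calling an indexer is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified model; the classes are hypothetical stand-ins, not Jira API classes.
class NotSearchableBranchSketch {

    static final class IndexerModel {
        final Set<String> visibleIssues;
        final List<String> calls = new ArrayList<>();

        IndexerModel(Set<String> visibleIssues) {
            this.visibleIssues = visibleIssues;
        }

        boolean isFieldVisibleAndInScope(String issueKey) {
            return visibleIssues.contains(issueKey);
        }

        void addDocumentFieldsSearchable(String issueKey) {
            calls.add("searchable:" + issueKey);
        }

        void addDocumentFieldsNotSearchable(String issueKey) {
            calls.add("notSearchable:" + issueKey);
        }

        // Picks a branch based on visibility, as an indexer with both branches would.
        void addIndex(String issueKey) {
            if (isFieldVisibleAndInScope(issueKey)) {
                addDocumentFieldsSearchable(issueKey);
            } else {
                addDocumentFieldsNotSearchable(issueKey);
            }
        }
    }

    // Previous behaviour: the indexer is called for every issue.
    static void indexBefore(IndexerModel indexer, String issueKey) {
        indexer.addIndex(issueKey);
    }

    // With optimization 2: the indexer is skipped when the field is not applicable.
    static void indexAfter(IndexerModel indexer, String issueKey) {
        if (indexer.isFieldVisibleAndInScope(issueKey)) {
            indexer.addIndex(issueKey);
        }
    }

    public static void main(String[] args) {
        IndexerModel before = new IndexerModel(Set.of("VIS-1"));
        indexBefore(before, "VIS-1");
        indexBefore(before, "HID-1");
        System.out.println(before.calls); // [searchable:VIS-1, notSearchable:HID-1]

        IndexerModel after = new IndexerModel(Set.of("VIS-1"));
        indexAfter(after, "VIS-1");
        indexAfter(after, "HID-1");
        System.out.println(after.calls); // [searchable:VIS-1]
    }
}
```

Any side effect your app performs in the not-visible branch (the `notSearchable` entry in this model) is what you should look for when reviewing your indexers.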
Please let us know if you notice this change breaking your app.
How do I disable the optimizations?
You can set the system property jira.cfv.driven.indexing.disabled=true to stop limiting indexer calls to fields that have values assigned (optimization 1), and jira.local.context.indexing.disabled=true to call indexers regardless of the field’s visibility and context (optimization 2).
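Java system properties are commonly supplied as `-D` arguments on the JVM command line, for example `-Djira.cfv.driven.indexing.disabled=true`. The sketch below only illustrates how such boolean flags behave when read with the standard `Boolean.getBoolean` call; `IndexingFlagsSketch` and its methods are hypothetical helpers, and whether Jira reads the properties in exactly this way is an assumption.

```java
// Hypothetical helper showing how the two "disabled" flags could be interpreted.
class IndexingFlagsSketch {

    static final String VALUE_DRIVEN_DISABLED = "jira.cfv.driven.indexing.disabled";
    static final String LOCAL_CONTEXT_DISABLED = "jira.local.context.indexing.disabled";

    // Boolean.getBoolean returns true only if the property exists and equals "true"
    // (ignoring case), so an unset property leaves the optimization enabled.
    static boolean valueDrivenIndexingEnabled() {
        return !Boolean.getBoolean(VALUE_DRIVEN_DISABLED);
    }

    static boolean localContextIndexingEnabled() {
        return !Boolean.getBoolean(LOCAL_CONTEXT_DISABLED);
    }

    public static void main(String[] args) {
        // Same effect as starting the JVM with -Djira.cfv.driven.indexing.disabled=true
        System.setProperty(VALUE_DRIVEN_DISABLED, "true");
        System.out.println("optimization 1 enabled: " + valueDrivenIndexingEnabled());
        System.out.println("optimization 2 enabled: " + localContextIndexingEnabled());
    }
}
```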
Change 3: Removing redundant data from Lucene index
JQL supports sorting by the custom field values with the ‘ORDER BY’ clause:
project = MyProject ORDER BY MyCustomField
For this to work, Jira built-in custom fields store sorted_cf_name in the Lucene document. Whenever there is a custom field value assigned to an issue, it is stored in the index. However, when no value exists for a custom field, the sorting marker is stored instead (Double.MAX_VALUE, Long.MAX_VALUE or \ufffd depending on the custom field’s type).
However, the comparators used by Jira built-in custom fields are capable of sorting null values correctly. For this reason, we decided to no longer store markers for null values in the Lucene index and to rely on the comparators to sort null values instead.
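The difference between the two approaches can be illustrated with plain Java comparators. This sketch is not Jira's comparator code: it only shows that a null-aware comparator (here the standard `Comparator.nullsLast`) can give empty values the same position that a maximum-value marker such as `Double.MAX_VALUE` would, without anything being stored for them. Placing empty values last in ascending order is an assumption that follows from using a maximum value as the marker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustration only; class and method names are hypothetical.
class NullSortingSketch {

    // Marker approach: a missing value is replaced by a marker before sorting.
    static List<Double> sortWithMarker(List<Double> values) {
        List<Double> copy = new ArrayList<>();
        for (Double value : values) {
            copy.add(value == null ? Double.MAX_VALUE : value);
        }
        copy.sort(Comparator.<Double>naturalOrder());
        return copy;
    }

    // Comparator approach: nulls are kept and the comparator places them last.
    static List<Double> sortNullAware(List<Double> values) {
        List<Double> copy = new ArrayList<>(values);
        copy.sort(Comparator.nullsLast(Comparator.<Double>naturalOrder()));
        return copy;
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(3.0, null, 1.5);
        System.out.println(sortWithMarker(values)); // the marker ends up last
        System.out.println(sortNullAware(values));  // [1.5, 3.0, null]
    }
}
```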
Benefits
We expect a reduction in index size and reindexing time, depending on the number of custom fields and non-assigned values. On our internal instance (600k issues, 400 CFs, 2M CF values) we observed a 15% reduction in the index size.
How does it impact my apps?
You should not expect any functional changes if you use Jira built-in custom fields or define custom field types inheriting from the Jira ones. However, check that your app does not rely on any side effects of storing values in sort_ Lucene fields and does not access Lucene fields outside of the Jira API.
Custom fields that may be affected by not storing null values:
Field Name | FieldType | Searcher | Indexer |
---|---|---|---|
Date Picker | DateCFType | DateRangeSearcher | LocalDateIndexer |
Date Time Picker | DateTimeCFType | DateTimeRangeSearcher | DateCustomFieldIndexer |
Import Id | ImportIdLinkCFType | ExactNumberSearcher | NumberCustomFieldIndexer |
Hidden Job Switch | HiddenJobSwitchCFType | JobSearcher | SortableTextCustomFieldIndexer |
Job CheckBox | JobCheckboxCFType | JobSearcher | SortableTextCustomFieldIndexer |
Number Field | NumberCFType | ExactNumberSearcher | NumberCustomFieldIndexer |
Text Field (multiple line) | TextAreaCFType | TextSearcher | SortableTextCustomFieldIndexer |
Text Field (read only) | ReadOnlyCFType | TextSearcher | SortableTextCustomFieldIndexer |
Text Field (single line) | RenderableTextCFType | TextSearcher | SortableTextCustomFieldIndexer |
URL Field | URLCFType | ExactTextSearcher | ExactTextCustomFieldIndexer |
How can my app benefit?
If your app uses custom indexers, we encourage you to review how your app handles non-existing values. Feel free to contact us to discuss your use case if needed.
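As a starting point for that review, the sketch below contrasts an indexer that always writes a sort field (using a marker for missing values) with one that writes nothing when there is no value. It models the Lucene document as a plain map; `SkipNullIndexingSketch`, both methods, and the example field ID are hypothetical, and skipping is only safe if the comparator your field uses for sorting can handle missing values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only; the document is modelled as a map, not a Lucene Document.
class SkipNullIndexingSketch {

    static final String NULL_MARKER = "\ufffd";

    // Marker style: the sort field is always written, with a marker for no value.
    static Map<String, String> indexWithMarker(String fieldId, String value) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("sort_" + fieldId, value == null ? NULL_MARKER : value);
        return doc;
    }

    // Skipping style: nothing is written when the field has no value.
    static Map<String, String> indexSkippingNull(String fieldId, String value) {
        Map<String, String> doc = new LinkedHashMap<>();
        if (value != null) {
            doc.put("sort_" + fieldId, value);
        }
        return doc;
    }

    public static void main(String[] args) {
        String fieldId = "customfield_10000"; // example ID, chosen for illustration
        System.out.println(indexWithMarker(fieldId, null).size());   // 1
        System.out.println(indexSkippingNull(fieldId, null).size()); // 0
    }
}
```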
How do I disable this feature?
You can use the system property jira.skip.indexing.null.disabled=true to disable it in case any problems arise or to compare with previous results.
Availability and download
These changes will be introduced in Jira Data Center 8.10. However, you can already test them in the EAP version. To start benefiting from the above optimizations, a full reindex is required. You can download an EAP version here. Read more about the 8.10 EAP.