Index Safeguards adjustment: New values for default limits
We’re adjusting the default limits of the topN issue-related entities.
What is changing?
Following performance tests, we decided to adjust the default Index Safeguards limits. The new limits are as follows:
comments: 500
worklogs: 100
changehistory: 100
Why is it changing?
We performed extensive performance tests. We set up an 8-node cluster and we used 30 threads to constantly send requests to each node (240 threads in total). The requests consist of both reads and writes, simulating heavy user traffic. We then used Jira Indexing Queue stats to observe how the indexing performance changes with different limit values.
Choosing the right limit values is a balancing act between functionality and usability. At one extreme, a user can search all 50000 comments of an issue, but it takes ages to index anything. At the other extreme, the system is very fast, but searching for comments doesn’t work. With this in mind, we chose the new default limits.
Jira will index the latest 500 comments. Comment search is more popular than change history/worklog search, so we allowed a higher number here to lean towards the functionality. Also, comments are indexed incrementally (adding a new comment causes only this comment to be indexed), so we are safe with a more relaxed limit here.
Jira will index the latest 100 change history items and worklogs. These are searched for less often, so we can lean towards indexing performance here. Moreover, all change items of an issue are always indexed at once, which makes indexing them much more expensive and justifies a stricter limit.
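To make the "latest N" idea concrete, here is a minimal sketch of how such a cap could be applied before indexing. It is not Jira's actual implementation: the `Entity` class, the `latest_n` helper, and the `DEFAULT_LIMITS` dictionary are hypothetical names used only for illustration, and treating -1 as "no limit" follows the convention used in the test runs later in this article.

```python
from dataclasses import dataclass

# Hypothetical table mirroring the new defaults; not a real Jira config object.
DEFAULT_LIMITS = {"comments": 500, "worklogs": 100, "changehistory": 100}


@dataclass(frozen=True)
class Entity:
    entity_id: int
    created: int  # creation timestamp; a larger value means a newer entity


def latest_n(entities, limit):
    """Return the newest `limit` entities, newest first.

    A negative limit (for example -1) is treated as "no limit".
    """
    newest_first = sorted(entities, key=lambda e: e.created, reverse=True)
    if limit < 0:
        return newest_first
    return newest_first[:limit]


# An issue with 600 comments: only the newest 500 would be indexed.
comments = [Entity(entity_id=i, created=i) for i in range(1, 601)]
indexed = latest_n(comments, DEFAULT_LIMITS["comments"])
print(len(indexed), indexed[0].entity_id, indexed[-1].entity_id)
```

In this sketch the 100 oldest comments are left out of the index, which is the functionality cost the limits trade for indexing speed.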
When is it changing?
The new default limits are introduced in Jira 8.22.4.
The impact
We took a look at Jira stats coming from a production instance of one of our large clients to assess the impact the new limits would have on them. We used a sample of 1.3M indexing operations. In only 0.3% of cases did the number of comments exceed the limit, meaning not all comments were indexed. For worklogs this was 0% and for changehistory 0.1%. For exact numbers, head down to the bottom of this article.
We believe this trade-off is fair, given the increased stability of Jira indexing.
Jira stats deep dive
This section is only for those who love numbers. 🤓
We ran our performance tests to observe how the Jira indexer copes with different limits.
No limit (-1)
Without any limits, the cluster quickly became unstable. Change history items waited in the queue for 743ms on average (many waited for over 10s!), and the index update itself often took over 1s, meaning those updates ran at less than one per second.
[JIRA-STATS] [INDEXING-QUEUE] index:CHANGE_HISTORY, total primary queue stats: {
"timeInQueueMillis":
{
"avg": 743,
"distributionCounter":
{
"0": 2773,
"1": 90,
"10": 38,
"100": 214,
"1000": 573,
"10000": 727,
"20000": 44,
"30000": 0,
"9223372036854775807": 1
}
},
"timeToUpdateIndexMillis":
{
"avg": 158,
"distributionCounter":
{
"0": 3749,
"1": 27,
"10": 161,
"50": 59,
"100": 36,
"500": 113,
"1000": 147,
"9223372036854775807": 167
}
},
"totalTimeMillis":
{
"avg": 796,
"distributionCounter":
{
"0": 1215,
"1": 1023,
"10": 393,
"100": 313,
"1000": 641,
"10000": 820,
"20000": 53,
"30000": 0
}
},
"totalTimeTimedOutMillis":
{
"count": 76,
},
}
Limit 1000
With the previous default limit of 1000 the situation was much better, but when we reached the maximum load the indexer still couldn’t withstand the amount of work. Time to update the index still exceeded 1s.
[JIRA-STATS] [INDEXING-QUEUE] index:CHANGE_HISTORY, total primary queue stats: {
"timeInQueueMillis":
{
"avg": 515,
"distributionCounter":
{
"0": 3190,
"1": 249,
"10": 239,
"100": 622,
"1000": 2683,
"10000": 1041,
"20000": 10,
"30000": 7
}
},
"timeToUpdateIndexMillis":
{
"avg": 93,
"distributionCounter":
{
"0": 6665,
"1": 196,
"10": 194,
"50": 77,
"100": 27,
"500": 424,
"1000": 303,
"9223372036854775807": 155
}
},
"totalTimeMillis":
{
"avg": 609,
"distributionCounter":
{
"0": 1329,
"1": 294,
"10": 1460,
"100": 863,
"1000": 2812,
"10000": 1264,
"20000": 9,
"30000": 9
}
},
"totalTimeTimedOutMillis":
{
"count": 1,
},
}
Limit 500
The limit of 500 brought another improvement. The average time to update the index fell to 27ms, allowing for ~37 issue updates per second. Unfortunately, there were still 21 updates that spent over a second on the critical path. There were no more timeouts, though.
[JIRA-STATS] [INDEXING-QUEUE] index:CHANGE_HISTORY, total primary queue stats: {
"timeInQueueMillis":
{
"avg": 79,
"distributionCounter":
{
"0": 5616,
"1": 170,
"10": 219,
"100": 2019,
"1000": 2774,
"10000": 80,
"20000": 0,
"30000": 0
}
},
"timeToUpdateIndexMillis":
{
"avg": 27,
"distributionCounter":
{
"0": 9300,
"1": 44,
"10": 101,
"50": 100,
"100": 81,
"500": 1211,
"1000": 20,
"9223372036854775807": 21
}
},
"totalTimeMillis":
{
"avg": 108,
"distributionCounter":
{
"0": 2611,
"1": 482,
"10": 2054,
"100": 2028,
"1000": 3595,
"10000": 108,
"20000": 0,
"30000": 0
}
},
"totalTimeTimedOutMillis":
{
"count": 0,
},
}
Limit 100
The limit of 100 brought what we were aiming for - a cluster that is stable under heavy load. No timeouts, a short time spent in the indexer queue and on the writing thread (average 3ms for both) and only a single update that took over a second.
[JIRA-STATS] [INDEXING-QUEUE] index:CHANGE_HISTORY, total primary queue stats: {
"timeInQueueMillis":
{
"avg": 3,
"distributionCounter":
{
"0": 9440,
"1": 201,
"10": 530,
"100": 1120,
"1000": 18,
"10000": 0,
"20000": 0,
"30000": 0
}
},
"timeToUpdateIndexMillis":
{
"avg": 3,
"distributionCounter":
{
"0": 9875,
"1": 23,
"10": 125,
"50": 1203,
"100": 75,
"500": 6,
"1000": 1,
"9223372036854775807": 1
}
},
"totalTimeMillis":
{
"avg": 8,
"distributionCounter":
{
"0": 3795,
"1": 2879,
"10": 2036,
"100": 2548,
"1000": 50,
"10000": 1,
"20000": 0,
"30000": 0
}
},
"totalTimeTimedOutMillis":
{
"count": 0,
},
}
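The throughput figures quoted above follow from simple arithmetic on the average `timeToUpdateIndexMillis` of each run. The sketch below reproduces it under one stated assumption: a single writing thread applies index updates one after another, so the sustainable rate is roughly 1000ms divided by the average update time. The `RUNS` table and the function name are illustrative only, and the averages are the rounded values from the logs, so the results are rough estimates.

```python
# Average timeToUpdateIndexMillis per run, copied from the logs above.
RUNS = {"no limit": 158, "1000": 93, "500": 27, "100": 3}


def max_updates_per_second(avg_update_millis: float) -> float:
    """Rough upper bound on updates per second for one sequential writer."""
    return 1000.0 / avg_update_millis


for limit, avg_ms in RUNS.items():
    print(f"limit {limit}: ~{max_updates_per_second(avg_ms):.0f} updates/s")
```

For the limit of 500 this gives about 37 updates per second, matching the figure quoted for that run. Averages hide the slow tail, which is why the bucket counting updates slower than 1s is worth reading alongside them.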
Real production instance indexing-limits stats
The data from one of our large clients indicates that the new limits will have minimal usability impact. Only a very small share of issues would have their comments or change history items not fully indexed.
[JIRA-STATS] [INDEXING-LIMITS] total stats: ... data={
...
"numberOfComments":{
"count":1334984,
"min":0,
"max":2632,
"sum":6588589,
"avg":4,
"distributionCounter":{
"0":296176,
"1":525645,
"10":434660,
"100":69058,
"1000":9064,
"10000":381,
"20000":0,
"50000":0
}
},
"numberOfWorklogs":{
"count":1334964,
"min":0,
"max":9,
"sum":163,
"avg":0,
"distributionCounter":{
"0":1334861,
"1":78,
"10":25,
"100":0,
"1000":0,
"10000":0,
"20000":0,
"50000":0
}
},
"numberOfChangeHistory":{
"count":1334964,
"min":1,
"max":2455,
"sum":10138251,
"avg":7,
"distributionCounter":{
"0":0,
"1":90395,
"10":1111568,
"100":131177,
"1000":1469,
"10000":355,
"20000":0,
"50000":0
}
},
...
}
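The impact percentages can be cross-checked against the `distributionCounter` values above. The sketch below assumes that each key is the inclusive upper bound of its bucket (so "1000" counts values from 101 to 1000), which is consistent with the min, max and sum fields in the logs but is still an assumption. The `share_above` helper is a hypothetical name. Because 500 is not a bucket boundary, the comment share can only be bracketed, not computed exactly, from this output.

```python
# distributionCounter values copied from the production stats above.
COMMENTS = {"0": 296176, "1": 525645, "10": 434660, "100": 69058,
            "1000": 9064, "10000": 381, "20000": 0, "50000": 0}
WORKLOGS = {"0": 1334861, "1": 78, "10": 25, "100": 0,
            "1000": 0, "10000": 0, "20000": 0, "50000": 0}
CHANGE_HISTORY = {"0": 0, "1": 90395, "10": 1111568, "100": 131177,
                  "1000": 1469, "10000": 355, "20000": 0, "50000": 0}


def share_above(distribution, threshold):
    """Share of samples in buckets whose upper bound is above `threshold`.

    Exact only when `threshold` is itself a bucket boundary.
    """
    total = sum(distribution.values())
    above = sum(count for bound, count in distribution.items()
                if int(bound) > threshold)
    return above / total


# Limit 100 is a bucket boundary, so these shares are exact.
print(f"changehistory over 100: {share_above(CHANGE_HISTORY, 100):.2%}")
print(f"worklogs over 100: {share_above(WORKLOGS, 100):.2%}")

# Limit 500 falls inside the 101..1000 bucket, so only bounds are available.
lower = share_above(COMMENTS, 1000)
upper = share_above(COMMENTS, 100)
print(f"comments over 500: between {lower:.2%} and {upper:.2%}")
```

The change history share rounds to the 0.1% quoted earlier, and the 0.3% quoted for comments lies inside the computed bracket.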
Jira Stats
Jira Stats proved to be very useful for our performance assessment. You can use them too to evaluate the performance of your test and production environments! Find out more in this knowledge base article.