We are currently blocked by an issue with scaling the Jira Data Center cluster using the steps in the Data Center App Performance Toolkit User Guide. The re-indexing of cluster nodes is broken, resulting in an infinite “Maintenance” mode of the cluster. We’ve been working with the DC approval team on the DC approval Slack workspace to get it working, but the manual fixes / workarounds are tedious. I’ve already spent more than 3 days on this and I still don’t have a working cluster.
If I understand correctly, this is a known issue; however, it is not being resolved because the DC approval team is in the process of replacing the CloudFormation templates with a new Kubernetes / Terraform / Helm deployment.
As there is currently no communicated timeframe, this means that the Atlassian Marketplace Partner community is left with the burden of working around a broken framework for a mandatory step of an Atlassian program. Fuck this. You can keep your Data Center approval BS. I will tell people to move back to Server. I’m done with this crap.
@tpettersen you can explain it to our “mutual” customers.
CC: @jmort and @sopel as you were also in the call with the DC team telling us that the migration to Kubernetes would not interfere with the existing programs.
Hi @remie.
Could you please provide steps to reproduce the issue?
“Maintenance” mode is not a “known issue”. We’ve seen this issue several times from app partners, but we do not have steps to reproduce it.
I do not believe that the framework is broken. CI for Jira is green. We’ll try to reproduce this issue manually. Exact steps from you would be useful.
The framework is not abandoned, and you can get support in a timely manner in the community Slack.
Migration to k8s is not related to this issue at all. We want to make a better deployment solution that has a dataset and index inside to make environment setup easier and faster for app partners.
Hi @remie. We were able to reproduce the issue you faced.
The workaround that Oleksandr Popov mentioned in Slack works:
1. Go to System settings.
2. Open the Indexing page.
3. Set the recovery index schedule to 5 minutes ahead of the current time.
4. Wait 10 minutes until the index snapshot is created (snapshot location: /media/atl/jira/shared/caches/indexesV2/snapshots).

After scaling, new nodes will get an index recovered from the index snapshot.
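
If you would rather not watch the clock, the waiting step in the workaround above can be scripted. This is only a rough sketch, not part of the toolkit: it assumes you run it on a node where the shared home is mounted at the path mentioned above, and that any file appearing in the snapshots directory counts as the snapshot. The function names, timeout, and polling interval are made up for illustration.

```python
import time
from pathlib import Path

# Snapshot location quoted in the workaround; adjust if your shared home differs.
SNAPSHOT_DIR = Path("/media/atl/jira/shared/caches/indexesV2/snapshots")


def newest_snapshot(directory: Path):
    """Return the most recently modified file in `directory`, or None if there is none."""
    if not directory.is_dir():
        return None
    files = [p for p in directory.iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def wait_for_snapshot(directory: Path = SNAPSHOT_DIR, since=None,
                      timeout_s: float = 900, poll_s: float = 30):
    """Poll until a file modified at or after `since` exists.

    `since` is a Unix timestamp and defaults to the moment of the call.
    Returns the snapshot path, or None if the timeout expires first.
    """
    since = time.time() if since is None else since
    deadline = time.monotonic() + timeout_s
    while True:
        snap = newest_snapshot(directory)
        if snap is not None and snap.stat().st_mtime >= since:
            return snap
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)


# Hypothetical usage: call this right after setting the recovery schedule,
# and only scale the cluster once it returns a path.
#   snap = wait_for_snapshot()
#   print(snap or "No snapshot appeared within the timeout")
```

Note that a file showing up does not guarantee it has been fully written, so giving it an extra minute before scaling is probably wise.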
Sorry for the inconvenience. We are working on improving scripts and documentation and will include fixes in the next release.
It turns out we missed this bug because CI was configured to run on a weekend, and the scaling test ran the day after the default scheduled event had already happened.