Reindexing problem on Jira Server 8.13 on AKS

PierrePuiseux · May 26, 2021, 9:46am

Hello,

I successfully installed Jira Server on Azure Kubernetes Service (AKS) and migrated it from an on-premises installation.
The data migration is done and the plugins are imported and working, but re-indexing fails.

I am sure I have enough resources (CPU and RAM), but it fails every time at the same point, between 15% and 17%.

I am running a Postgres 11 database, and I have already checked for the errors described in the Atlassian documentation article How to fix a JIRA application that is unable to perform a background re-index "at this time error".

The log output at the time of the crash looks like this (extracted from the atlassian-jira.log file):

2021-05-26 11:04:43,076+0200 IssueIndexer:thread-6 DEBUG admin 660x862x1 197b92w 127.0.0.1 /secure/admin/IndexReIndex!reindex.jspa [c.a.j.issue.index.LuceneIssueIndexProvider] Opening index provider
2021-05-26 11:04:43,079+0200 IssueIndexer:thread-4 DEBUG admin 660x862x1 197b92w 127.0.0.1 /secure/admin/IndexReIndex!reindex.jspa [c.a.j.issue.index.LuceneIssueIndexProvider] Opening index provider
2021-05-26 11:04:43,897+0200 localhost-startStop-2 INFO [c.a.jira.index.MonitoringIndexWriter] [lucene-stats] flush stats: isForegroundIndexing=true, snapshotCount=48, totalCount=48, periodSec=274, flushIntervalMillis=5724, snapshotNoDocs=232468, totalNoDocs=232468, indexDirectory=/var/atlassian/application-data/jira/caches/indexesV1/comments, indexWriterId=com.atlassian.jira.index.MonitoringIndexWriter@3ce727a3, indexDirectoryId=MMapDirectory@/var/atlassian/application-data/jira/caches/indexesV1/comments lockFactory=org.apache.lucene.store.NativeFSLockFactory@7b1f04c8
2021-05-26 11:04:43,897+0200 localhost-startStop-2 INFO [c.a.jira.index.WriterWithStats] [index-writer-stats] COMMENT : Stopping writer stats.
2021-05-26 11:04:44,740+0200 localhost-startStop-2 INFO [c.a.jira.index.MonitoringIndexWriter] [lucene-stats] flush stats: isForegroundIndexing=true, snapshotCount=54, totalCount=54, periodSec=275, flushIntervalMillis=5106, snapshotNoDocs=57147, totalNoDocs=57147, indexDirectory=/var/atlassian/application-data/jira/caches/indexesV1/issues, indexWriterId=com.atlassian.jira.index.MonitoringIndexWriter@1187ed8f, indexDirectoryId=MMapDirectory@/var/atlassian/application-data/jira/caches/indexesV1/issues lockFactory=org.apache.lucene.store.NativeFSLockFactory@7b1f04c8
2021-05-26 11:04:44,740+0200 localhost-startStop-2 INFO [c.a.jira.index.WriterWithStats] [index-writer-stats] ISSUE : Stopping writer stats.
2021-05-26 11:04:50,784+0200 localhost-startStop-1 INFO [c.a.jira.startup.JiraHomeStartupCheck] The jira.home directory ‘/var/atlassian/application-data/jira’ is validated and locked for exclusive use by this instance.

I can provide the Atlassian Support Zip if needed.

Note: I successfully re-indexed this data in another environment (not on AKS), so nothing needs to be changed in the database.

Thank you for your help.

Pierre

PierrePuiseux · June 4, 2021, 12:49pm

Hello,

I solved my problem by changing the livenessProbe and readinessProbe in Kubernetes.

During re-indexing, the Jira status page returns a 503 HTTP code, so Kubernetes considers the application failed and restarts the pod before the re-indexing finishes.
I now probe the TCP port instead of the HTTP code on the /status page, and it works.
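
To check what each kind of probe would see while a re-index is running, here is a minimal Python sketch, not the actual kubelet code. It relies on two documented Kubernetes behaviours: an httpGet probe succeeds only for status codes from 200 to 399, and a tcpSocket probe succeeds as soon as the port accepts a connection. The function names `tcp_probe` and `http_probe` are made up for illustration.

```python
import socket
import urllib.error
import urllib.request


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Mimic a tcpSocket probe: success means the port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, ...
        return False


def http_probe(url: str, timeout: float = 1.0) -> bool:
    """Mimic an httpGet probe: success means a status code from 200 to 399."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except urllib.error.HTTPError:
        # urllib raises for 4xx/5xx, e.g. the 503 sent while re-indexing
        return False
    except OSError:
        # No HTTP answer at all (connection refused, timeout, ...)
        return False
```

Run from inside the pod or through a port-forward, `tcp_probe("localhost", 8080)` and `http_probe("http://localhost:8080/status")` should disagree during a re-index: the TCP check passes while the HTTP check fails. The host and port 8080 are assumptions here, so adjust them to your deployment.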

I think that Jira shouldn’t return a 503 HTTP code while re-indexing, because it is a “normal” and “wanted” operation launched by an administrator of the application.
For the moment, I have edited the Helm chart that I use (https://github.com/stevehipwell/helm-charts/pull/232).

Am I the only person who has faced this problem?