ACSB /install webhooks hang because there is only 1 thread to execute them

With Atlassian Connect Spring Boot (ACSB), has anyone seen Atlassian Fortified report an “Installation callback success rate” below 100% while CPU usage is low, RAM usage is low, and the instance otherwise seems healthy?

We’ve found what we believe is the source of the problem: /install webhooks seem to block each other because they are executed on a single thread. Worse, the same queue also handles AddonInstalledEvents, which are supposed to be published asynchronously.

Our workaround: Setting spring.task.scheduling.pool.size to 20 instead of the default 1.

But it would be good if someone could send a PR to Atlassian, since it seems I don’t have the permissions. Look at the blue lines in the code:

  • Line 90: It creates a future so that the wait for the install work can be limited to less than 3 seconds. However, a Future in Java is just a job handed to a thread. So which threadpool is it using? That’s right, the transactionExecutor, whose size is 1 by default.
  • Line 131: It creates an async event so that plugins can deal with their own data after installation. However, which threadpool is it using? That’s right, the same transactionExecutor.
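
To illustrate the blocking described in the two points above, here is a minimal sketch in plain java.util.concurrent. It is not the ACSB code: the class name, the method name, and the 200 ms timeout are made up for the demo, and a fixed-size pool stands in for the transactionExecutor. It only shows that a fast job waited on with Future.get(timeout) can time out when the pool's single thread is busy with another job.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SingleThreadQueueDemo {

    /**
     * Submits one slow job and one fast job to a pool of the given size,
     * then waits briefly for the fast one. Returns true if the fast job
     * timed out because it was still queued behind the slow one.
     * (Hypothetical helper, for illustration only.)
     */
    public static boolean fastTaskTimesOut(int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Slow "install": occupies a worker thread until released.
            pool.submit(() -> { release.await(); return "slow"; });
            // Fast "install": would finish immediately if it got a thread.
            Future<String> fast = pool.submit(() -> "fast");
            try {
                fast.get(200, TimeUnit.MILLISECONDS);
                return false; // completed in time
            } catch (TimeoutException e) {
                return true;  // never started: the only thread was busy
            }
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool size 1 times out: " + fastTaskTimesOut(1));
        System.out.println("pool size 2 times out: " + fastTaskTimesOut(2));
    }
}
```

With one thread the fast job should time out even though it does almost no work, which would match a low success rate on an otherwise idle instance. With two threads it should complete in time.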

Are we the only ones with this problem, or does everyone configure their threadpool size before going live?

Using Atlassian Connect Spring Boot 2.3.6 and Spring Boot 2.7.12.

Hello

I’ve checked this on Connect Spring Boot 3.0.4 and Spring Boot 2.7.13 in our app. It seems fine for us.

The installedImpl is scheduled on a Spring Boot thread pool with a core size of one. However, in our case, with the default Spring configuration, it has a max thread limit of Integer.MAX_VALUE. So it spawns extra threads if its core thread is busy.
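
For anyone who wants to check how core size and max size interact on their own setup, here is a small sketch using the plain java.util.concurrent.ThreadPoolExecutor. It is not our Spring configuration, and the class and method names are made up. In the plain JDK executor, whether a pool with core size 1 grows past one thread depends on the work queue it is given, so the queue capacity of the configured executor is worth checking as well.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolGrowthDemo {

    /**
     * Submits three blocking jobs to a pool with core size 1 and
     * max size Integer.MAX_VALUE, and returns how many threads the
     * pool created. (Hypothetical helper, for illustration only.)
     */
    public static int threadsUsed(BlockingQueue<Runnable> queue) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, Integer.MAX_VALUE, 60, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < 3; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A queue with no capacity forces a new thread per busy job.
        System.out.println("SynchronousQueue: "
                + threadsUsed(new SynchronousQueue<>()));
        // An unbounded queue accepts every job, so the pool stays at core size.
        System.out.println("LinkedBlockingQueue: "
                + threadsUsed(new LinkedBlockingQueue<>()));
    }
}
```

In this sketch the pool grows only when the queue refuses a job. With an unbounded queue the extra jobs wait behind the single core thread, regardless of the max size.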

In my testing, multiple installs could run concurrently, as far as I can tell. Snippets from my test thread dumps:

"task-1@15228" prio=5 tid=0x3a nid=NA sleeping
  java.lang.Thread.State: TIMED_WAITING
	  at java.lang.Thread.sleep(Thread.java:-1)
	  at ch.mibex.HackHack.sleepSomeTime(HackHack.java:6)
	  at com.atlassian.connect.spring.internal.lifecycle.LifecycleController.installedImpl(LifecycleController.java:126)
	  at com.atlassian.connect.spring.internal.lifecycle.LifecycleController.lambda$installed$0(LifecycleController.java:84)

"task-3@15328" prio=5 tid=0x3c nid=NA runnable
  java.lang.Thread.State: RUNNABLE
	  at java.lang.Thread.sleep(Thread.java:-1)
	  at ch.mibex.HackHack.sleepSomeTime(HackHack.java:6)
	  at com.atlassian.connect.spring.internal.lifecycle.LifecycleController.installedImpl(LifecycleController.java:126)

"task-2@15326" prio=5 tid=0x3b nid=NA runnable
  java.lang.Thread.State: RUNNABLE
	  at java.lang.Thread.sleep(Thread.java:-1)
	  at ch.mibex.HackHack.sleepSomeTime(HackHack.java:6)
	  at com.atlassian.connect.spring.internal.lifecycle.LifecycleController.installedImpl(LifecycleController.java:126)

"http-nio-8090-exec-9@13329" daemon prio=5 tid=0x33 nid=NA waiting
  java.lang.Thread.State: WAITING
	  at jdk.internal.misc.Unsafe.park(Unsafe.java:-1)
	  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:252)
	  at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:444)
	  at java.util.concurrent.FutureTask.get(FutureTask.java:203)
	  at com.atlassian.connect.spring.internal.lifecycle.LifecycleController.installed(LifecycleController.java:91)
	  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
	  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)

Also, the spring.task.scheduling.pool.size property seems to change the size of another thread pool, which doesn’t seem to be used in our app at all. Note again, these are different versions of Spring Boot and Atlassian Connect.