An update on our investigation to solve intermittent iframe loading problems in Connect apps

I have just modified our Sentry reporting to include all AP keys when this happens. In these cases, AP is returned with the following keys:

["_xdm","parentTargets","_data","_hostOrigin","_top","_host","_topHost","_initTimeout","_initReceived","_initCheck","_isKeyDownBound","_eventHandlers","_pendingCallbacks","_keyListeners","_version","_apiTampered","_isSubIframe","_onConfirmedFns","_promise","_messageHandlers","resize","container","size","registerAny","register","_hostModules","defineGlobal","defineModule","subCreate","Dialog","define","require","Meta","meta","localUrl","_util"]

while normally it should return

["_xdm","parentTargets","_data","_hostOrigin","_top","_host","_topHost","_initTimeout","_initReceived","_initCheck","_isKeyDownBound","_eventHandlers","_pendingCallbacks","_keyListeners","_version","_apiTampered","_isSubIframe","_onConfirmedFns","_promise","request","messages","flag","dialog","inlineDialog","env","events","_analytics","scrollPosition","dropdown","host","cookie","history","navigator","user","context","jira","dropdownList","_messageHandlers","resize","container","size","registerAny","register","_hostModules","defineGlobal","defineModule","subCreate","Dialog","define","require","Meta","meta","localUrl","_util","getUser","getCurrentUser","getTimeZone","getLocale","getLocation","sizeToParent"]
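
Comparing the two lists, the failing case lacks everything from request through dropdownList, plus the trailing getUser, getCurrentUser, getTimeZone, getLocale, getLocation and sizeToParent. For anyone who wants to report the same diff from their own app, here is a minimal sketch. The helper name `missingKeys` and the shortened key lists are made up for illustration; in a real page the second argument would presumably come from something like `Object.keys(AP)`.

```typescript
// Hypothetical helper: list the expected keys that are absent from the actual keys.
function missingKeys(expected: readonly string[], actual: readonly string[]): string[] {
  const present = new Set(actual);
  return expected.filter((key) => !present.has(key));
}

// Shortened versions of the two key lists quoted above, for illustration only.
const healthyKeys = ["_xdm", "request", "messages", "context", "resize", "getUser"];
const brokenKeys = ["_xdm", "resize"];

console.log(missingKeys(healthyKeys, brokenKeys));
// prints [ 'request', 'messages', 'context', 'getUser' ]
```

Attaching that short list to an error report is easier to scan than the two full key dumps.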

Is the URL to this fallback URL public?

@BobBergman - it would have been public at the time of the test, but I don’t think we kept it live long-term.

I don’t know if this is related or not, but for some time now I have been getting AP errors where AP is defined (so all.js has loaded, as I check for AP before accessing it) but is missing some functions/properties.

Hi @RaimisJ - I haven’t heard anyone else reporting these symptoms, so I also don’t know if this is related. It would be interesting to hear if anyone else is observing the same behaviour.

@RaimisJ Thanks for the info. If an app or page includes all.js, and is loaded in the browser outside of a Jira / Confluence iframe, then AP will be missing some functions / properties as you describe.

We’ve seen this in the past for scenarios such as:

  • Automated testing of an app (without Jira / Confluence)
  • Pages served by the app that are not intended to be iframes (e.g. external links)

If you, or anyone else, could confirm that you’re seeing this error on a customer instance, inside an Atlassian iframe, that would be a significant clue.
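
To help tell these cases apart in error reports, an app could record whether the page was framed at the time AP looked incomplete. The following is only a sketch under assumptions: the names `isFramed`, `missingApMembers`, `classifyAp` and `REQUIRED_AP_MEMBERS` are hypothetical, and the required members (request, context) are simply two of the keys from the healthy list earlier in the thread.

```typescript
// Minimal shape of the window properties the frame check needs.
interface FrameInfo {
  self: unknown;
  top: unknown;
}

// True when the page is rendered inside a frame rather than as the top-level document.
function isFramed(win: FrameInfo): boolean {
  return win.self !== win.top;
}

// Hypothetical list of AP members this app relies on; adjust to your own usage.
const REQUIRED_AP_MEMBERS: readonly string[] = ["request", "context"];

// Returns the required member names that are not present on the given AP object.
function missingApMembers(
  ap: Record<string, unknown> | undefined,
  required: readonly string[] = REQUIRED_AP_MEMBERS
): string[] {
  if (!ap) return required.slice();
  return required.filter((name) => !(name in ap));
}

type ApState = "ready" | "not-framed" | "framed-but-incomplete";

// Classifies the situation so an error report can say which case was hit.
function classifyAp(win: FrameInfo, ap: Record<string, unknown> | undefined): ApState {
  if (missingApMembers(ap).length === 0) return "ready";
  return isFramed(win) ? "framed-but-incomplete" : "not-framed";
}
```

In a browser this would be called with the real `window` and the global AP object. A "framed-but-incomplete" result is the case worth reporting, since "not-framed" matches the automated-testing and external-link scenarios listed above.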

@dboyd, I can confirm that these errors are for pages loaded in an iframe, since the requests otherwise have all the query string parameters that a usual customer instance has.

This is not related to the main report and the issues we have been experiencing.

The main problem is that loading the add-on iframe times out and the add-on code is not executed at all.
When it happens, all apps/iframes are affected.

More information: Apps fail to load due to timeouts

We’re seeing the same behavior as @RaimisJ in some cases (though not all).

Late to the party I know (holidays etc).

We’ve had an influx of these in the past couple of months. So far for us the issue has been:

  1. Corporate firewalls
  2. Content filtering extensions
  3. Objects/methods on AP missing (i.e. AP.context).
  4. Other (we suspect #1 but they’re able to access the page directly which then causes other issues).

What’s bothersome about #1 and #2 is that they started recently. It’s not until we start looking at HAR files that we’re able to see anything (which btw is really difficult for us to get from customers who just uninstall us; it sure would be nice if the error messaging was updated).

#1 is really difficult as well, since for large companies any change to the firewall has to go through approvals.

We had one case where the end user had to allowlist the atlassian.net domain (not ours) to get things working for #2. (I suspect that #1 is related to this.)

User here. We have iframe errors filling our logs, and the one thing we changed recently is that we added JSD, so our server now runs Jira Software and Jira Service Desk. It happens when we enable the qTest add-on.

base-url/browse/JSD-1 throws an error
base-url/browse/JSW-1 mostly does not

Best we can tell, in our case all it does is fill up the logs, which is not welcome.

Not sure whether it’s another red herring or whether putting JSD in the mix makes it more reproducible for somebody.

Hello @MartinaRiedel,

Thanks for sharing this information. From the level of detail you have provided, it’s hard to tell whether or not the symptoms you are experiencing are part of this problem or something different.

If you are able to capture additional information about the error, such as detailed log messages or a HAR file capture of the problem, please submit them via the survey link at the top of this post - http://go.atlassian.com/connect-app-load-failure-survey

Thanks,
Joe.

Hi Joe, it turned out to be something different.
Thanks a bunch for your reply.
Martina

Morning @HeyJoe,

Just wondering if you’ve been able to reliably reproduce this? Anecdotally, we have not seen this come through in support requests since Christmas. Are you aware of any changes that may have improved the situation?

Thanks Joe, have a great day,
Nick Muldoon, Easy Agile

I have run into it myself a few times in recent weeks.

However, the number of support cases has dropped significantly. I wonder if something was fixed or customers just got used to living with it.

Cheers,
Jack

Hi everyone,

Thank you to those who uploaded further information to help us troubleshoot this problem.

@nick @jack beyond additional diagnostics/monitoring, we have not deployed any changes that would solve the problem. Given that we haven’t ruled out a browser issue at this stage, it’s possible that a recent change to Chrome/Firefox/etc. has improved the situation.

Unfortunately, we have not yet identified the root cause of the problem. Despite this, I want to provide an update on our on-going investigation.

As we analysed data from affected apps, we were able to identify cases where the problem was due to an issue in the app itself. This is a pertinent reminder that the generic nature of the error makes it hard to identify the source of the error. Cases where we identified bugs in the app were solved in collaboration with the app developer.

We re-investigated potential reliability issues with the CDN. We identified ~0.05% of requests failing after an HTTP 200 response with a ClientConnectionError. We do not believe this is the source of the problem, but are still investigating.

We also investigated the HAR files uploaded to us via the survey form. One file showed an interesting scenario of duplicate GET requests to the iframe. To dig into this, we’re building additional analytics to see if this is a regular occurrence. Investigation of this clue is ongoing.
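
For app developers who want to look for the same pattern in their own captures, here is a rough sketch of counting repeated GET requests in a HAR file. It relies only on the standard HAR layout (`log.entries[].request.method` and `.url`). The function name `duplicateGets` and the example.com URLs are made up for illustration, and this is not the analytics described above.

```typescript
// Only the parts of the HAR structure that this check reads.
interface HarEntry {
  request: { method: string; url: string };
}
interface Har {
  log: { entries: HarEntry[] };
}

// Hypothetical helper: maps each URL that was fetched with GET more than once
// to the number of times it was requested.
function duplicateGets(har: Har): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of har.log.entries) {
    if (entry.request.method.toUpperCase() !== "GET") continue;
    const url = entry.request.url;
    counts.set(url, (counts.get(url) || 0) + 1);
  }
  const duplicates = new Map<string, number>();
  counts.forEach((count, url) => {
    if (count > 1) duplicates.set(url, count);
  });
  return duplicates;
}

// Made-up capture: the same iframe URL is requested twice with GET.
const sampleHar: Har = {
  log: {
    entries: [
      { request: { method: "GET", url: "https://app.example.com/panel" } },
      { request: { method: "GET", url: "https://app.example.com/panel" } },
      { request: { method: "POST", url: "https://app.example.com/panel" } },
      { request: { method: "GET", url: "https://app.example.com/all.js" } },
    ],
  },
};

duplicateGets(sampleHar).forEach((count, url) => console.log(`${url} requested ${count} times`));
// prints "https://app.example.com/panel requested 2 times"
```

Exact URL matching is a deliberate simplification. Iframe URLs that differ only in a signed query parameter would need to be normalised before counting.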

If your app is still experiencing this problem, we encourage you to upload diagnostics for us to analyse at http://go.atlassian.com/connect-app-load-failure-survey. More information will increase the likelihood that we can identify a pattern of behaviour that leads to the root cause.

Thanks for your continued help in working to solve this problem for our shared customers.

Regards,
Joe Clark [Atlassian]

Thanks for the update Joe, greatly appreciated.

Late last year I reported an issue where links created by an app in Confluence break when the page hasn’t fully loaded (the user falls through to a JIRA page instead).
Could there be any relationship between that and this problem of intermittent iframe loading?
Support ticket reference is DEVHELP-5553

Hi @james.dellow,

Thanks for the extra info. I’ll get the team to take a look at DEVHELP-5553 and see if it’s related.

Thanks,
Joe.

Just to close the loop on my question, Atlassian support told me: “it seems like it is unrelated to the problem you have shared. We believe this might be a problem with the JavaScript framework.”

They are going to raise a public Atlassian Connect JS API ticket for it. I’ll share that when I have it, just in case anyone comes across this here.

Is one of the symptoms of this the JWT token being expired?
I have one install where I saw this problem in the app’s log. When I proactively asked the customer if they were experiencing problems, they told me, “We quite often receive what looks like a timeout” and provided a screenshot.
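
For anyone wanting to check the expiry side of this in their own logs, here is a minimal sketch of reading the `exp` claim from a JWT. It assumes the token carries the standard `exp` claim (seconds since the epoch) and an ASCII JSON payload. The function names are hypothetical, and the code does not verify the signature, so it is only suitable for diagnostics.

```typescript
// Claims this check cares about; other claims are allowed but ignored.
interface JwtClaims {
  exp?: number;
  iat?: number;
  [claim: string]: unknown;
}

// Decodes the payload (middle segment) of a JWT WITHOUT verifying its signature.
// Assumes the payload is ASCII JSON.
function decodeJwtClaims(token: string): JwtClaims {
  const parts = token.split(".");
  if (parts.length !== 3) {
    throw new Error("Expected three dot-separated JWT segments");
  }
  // Convert base64url to standard base64 and restore any stripped padding.
  let base64 = parts[1].replace(/-/g, "+").replace(/_/g, "/");
  while (base64.length % 4 !== 0) {
    base64 += "=";
  }
  return JSON.parse(atob(base64)) as JwtClaims;
}

// True when the token's exp claim is earlier than nowSeconds, allowing some clock leeway.
// A token without an exp claim is treated as not expired.
function isJwtExpired(token: string, nowSeconds: number, leewaySeconds = 0): boolean {
  const exp = decodeJwtClaims(token).exp;
  if (typeof exp !== "number") return false;
  return nowSeconds > exp + leewaySeconds;
}
```

Logging `exp`, `iat` and the server time at the moment of rejection would show whether the token had really expired (for example, because the iframe request arrived late) or whether clock skew is involved.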

Hey James - I’ll follow up on your question with the engineering team. Thanks!

EDIT: Also, here is the issue that James mentioned: https://ecosystem.atlassian.net/browse/ACJS-1162

Further to my comment about the JWT token being expired, the customer followed up today in response to some troubleshooting questions and told me:

It may be coincidence but we haven’t observed this since uninstalling the following:

I had already tested my app and SubSpace Navigation being installed together and didn’t notice any issues. So just a coincidence?

I don’t know if they have other apps running on the page.