New Jira Cloud Webhook Retry Policy

All Jira webhooks have a timestamp in the body, so you should be able to identify them by it. Timestamps should be practically unique: I don’t expect two different webhooks to be sent at exactly the same time, let alone two webhooks with otherwise identical content.

To make it extra clear: the timestamp is the time the event was generated, not when the webhook was sent, so it will remain unchanged across retries.
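For illustration, here is a minimal sketch of that kind of timestamp-based identification, combining the timestamp with a hash of the payload. It assumes the event time sits in a top-level "timestamp" field of the body; the function name `webhook_fingerprint` is made up for this example and is not part of any Jira API.

```python
import hashlib
import json


def webhook_fingerprint(body: dict) -> str:
    """Build a dedup key from the event timestamp plus a hash of the payload."""
    # Assumption: the event time is in a top-level "timestamp" field.
    timestamp = body.get("timestamp")
    if timestamp is None:
        # Some webhooks may carry no timestamp (or no body at all).
        raise ValueError("payload has no timestamp; cannot fingerprint")
    # Serialize with sorted keys so key order does not change the hash.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{timestamp}:{digest}"
```

Because the timestamp is the event time and stays unchanged across retries, a retried delivery with the same body would produce the same fingerprint.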

If that’s still not enough, could you create an ACJIRA ticket for your header proposal?

Last time I looked, post-function webhooks did not contain a timestamp. We’d love a correlation ID too, so we don’t have to compare event bodies, hashes, etc.

Alright, standard Jira webhooks have timestamps, but post-functions and maybe some other webhooks might not. Also, it’s possible to subscribe to webhooks without a body, so you wouldn’t get the timestamp in this case either. I’ll put adding the header into our backlog.

Thanks a lot! We use both webhooks and post-function triggers and would need the correlation header for the latter.

I’m happy to say that we’ve just added a new header sent with every webhook: X-Atlassian-Webhook-Identifier that contains a unique webhook ID, the same across retries. Note that each tenant has their own pool of IDs, so to uniquely identify a webhook you actually need a pair of <tenant, webhookId>.
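As a rough sketch of how an app might use the <tenant, webhookId> pair to spot deliveries it has already seen: the class below keeps the pairs in memory. The `SeenWebhooks` name and the `tenant` string are assumptions for this example; how you identify the tenant depends on your app, and a real deployment would likely need a shared, persistent store rather than a Python set.

```python
class SeenWebhooks:
    """In-memory record of (tenant, webhook ID) pairs already received."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, str]] = set()

    def first_time(self, tenant: str, webhook_id: str) -> bool:
        """Return True on the first delivery, False on a repeat."""
        # IDs are only unique per tenant, so the key must include both parts.
        key = (tenant, webhook_id)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

Here `webhook_id` would be the value of the X-Atlassian-Webhook-Identifier header; the same ID from two different tenants counts as two different webhooks.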

Awesome thanks! :smiley:

Thanks a lot for those continued improvements! :slight_smile:

In order to bridge the gap towards 100% reliability for webhooks, there’s now a proposal to provide a REST API for the webhook request history. Please vote/watch ACJIRA-1981 if you’d benefit from something like this. Thanks!

Does it also apply to post-function /triggered calls (which are technically webhooks)?

If so, I have a question: how can we use this to make post-function executions more reliable? I understand that if the /triggered call gets “lost” in transit, Jira will automatically retry the call at a later time (but that’s unrelated to the new header). But if the call does make it to the app, how do we take advantage of the header to handle cases where an error occurs (such as a 403 or 429 error returned by Jira) during the execution of the post-function? I understand that we could return an HTTP error, but unfortunately we can’t, because the actual post-function execution is handled by worker processes; all our web-facing processes do is queue the post-function execution, and that rarely fails… This architecture was the only way to handle the variable flow of post-function calls from Jira and the highly variable processing time of each post-function.

Anything we can do with the X-Atlassian-Webhook-Identifier? Could it be used to identify the “root cause” of the REST calls the app will make during the post-function execution? And eventually be used to create some equivalent of a transaction system (i.e. “eventual consistency”)?

Yes, the header is included in post-functions as well, and retries work in exactly the same way for post-function webhooks. But I’m afraid there is not much else you can do with it except for identifying retried webhooks that you have already seen. The problems you are describing have more to do with the infamous problem with DB connections, not webhooks. The way we send webhooks is hardly related to that.

Actually it’s not just about DB connections, which are a hot topic right now, but about data consistency in general, whatever the error is that prevents the post-function from completing its execution. But you’re right, it’s not related to the retries per se. I was just hoping that the id in the header could be used as the foundation for eventual consistency.

We’ve just released a new API for retrieving webhooks that failed delivery: Get failed webhooks.

Is this feature on?

I’m seeing the x-atlassian-webhook-identifier but not the X-Atlassian-Webhook-Retry header. I’m also forcing a 500 error but the webhook is not resent.

Do I need to do anything to enable the feature?

/webhook/failed is returning an empty result:

{
  "maxResults" : 100
}

Yes, this feature is enabled. You can’t enable or disable the feature manually.

The header X-Atlassian-Webhook-Retry is included only in webhooks that have been retried.
The webhook can be retried up to 15 minutes later.

The Failed webhooks API lists a webhook only after all of its retry attempts have failed. Currently, all attempts for a webhook can take up to 75 minutes.
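A small sketch of how a receiver might read the two headers mentioned in this thread. It only checks whether X-Atlassian-Webhook-Retry is present, since that is all that is stated here; the header's value is not interpreted. The function name `classify_delivery` is made up for this example. Header names are compared case-insensitively, which is why the lowercase x-atlassian-webhook-identifier seen above still matches.

```python
from typing import Mapping, Optional, Tuple

ID_HEADER = "x-atlassian-webhook-identifier"
RETRY_HEADER = "x-atlassian-webhook-retry"


def classify_delivery(headers: Mapping[str, str]) -> Tuple[Optional[str], bool]:
    """Return (webhook ID or None, whether this delivery is a retry)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    webhook_id = lowered.get(ID_HEADER)
    # The retry header is only present on retried deliveries.
    is_retry = RETRY_HEADER in lowered
    return webhook_id, is_retry
```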

Hi @kkercz,

What is the request timeout for webhooks?

We’re interested because when we record the webhook internally, we retry upon failure; but it doesn’t make sense to retry past the time when Jira decides to resend the webhook. So we’d like to place a timeout on the webhook recording process based on Jira’s webhook request timeout.

Thanks
Igor
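The deadline-bounded retry described in the question above could be sketched roughly as follows. Jira's actual request timeout is not stated in this thread, so `deadline_seconds` is a placeholder the caller must supply; `record_with_deadline`, `pause_seconds`, and the injectable `clock` and `sleep` parameters are all hypothetical names for this example.

```python
import time
from typing import Callable


def record_with_deadline(
    record: Callable[[], None],
    deadline_seconds: float,
    pause_seconds: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry record() until it succeeds or the deadline would be exceeded."""
    give_up_at = clock() + deadline_seconds
    while True:
        try:
            record()
            return True
        except Exception:
            # Stop if another pause would take us past the deadline;
            # after that point the sender may already be resending.
            if clock() + pause_seconds >= give_up_at:
                return False
            sleep(pause_seconds)
```

Returning False lets the web-facing process answer with an error status, so that the sender's own retry takes over instead of the internal one.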

@jtrzebiatowski is this only a Jira feature? I cannot find an x-atlassian-webhook-identifier header in Confluence webhooks. Is it on the roadmap to align the behavior of webhooks across different products?