RFC 130: Upcoming changes to invocation rate limits

Project Summary

  • Publish: April 14, 2026

  • Discuss: April 28, 2026

  • Resolve: May 25, 2026

Problem

Context

  • Forge apps are subject to both Forge platform and Atlassian app limits, which are in place to ensure fair usage and reliability

    • REST API calls from Forge apps to Atlassian apps (such as Jira and Confluence) are impacted by Atlassian app limits (e.g. points-based rate limiting). This RFC does not relate to these limits.

    • The Forge platform applies limits on the rates of invocations, static resources, and storage consumption. The limits are documented here.

  • Different rate-limiting models are applied to different types of invocations in order to cater to their different use cases. This document is seeking feedback for user-led invocations only. The table below outlines the different types of invocations and their rate limits:

Type of invocation: Interactive invocations (i.e. resulting from user interaction with a specific Forge module and/or app), including:
- Usage of app via UI
- Web triggers
- Jira workflow modules
- Confluence adfExport
- JQL functions
How is rate limiting handled? Limited on a fixed, per-minute window and not automatically retried. The current limits are documented here and are as follows:
- Per environment: 30k RPM
- Per installation: 5k RPM
- Per user: 1.2k RPM
In scope? Yes

Type of invocation: Scheduled triggers
How is rate limiting handled? There are separate limits applied to the number of scheduled triggers per app, so they will not be restricted by invocation rate limits.
In scope? No

Type of invocation: Atlassian app events and async events (via the async events API)
How is rate limiting handled? Rate-limiting events are handled gracefully by the Forge platform and invocations are eventually consistent.
In scope? No

Areas of concern

As the Forge platform continues to grow, rate limits and how they are handled will need to adapt to maintain reliability and support larger-scale apps. There are two key issues that this RFC addresses:

  • The existing model of global, app-level rate limits doesn’t scale for apps with large numbers of users and installations.

  • Retry logic for user-led invocations is difficult to implement, as waiting out the minute-long rate-limiting window can be quite disruptive.

Proposed Solution

  1. Flatten the rate-limiting model by removing the app-level and per-user limits. Apps will be rate-limited on a per-installation basis only.

    • This ensures that limits scale fairly for apps with large numbers of users and installations

    • Per-user invocation limits will no longer be enforced at the platform level and can instead be implemented within each Forge app as needed. Point (2) will support this change.

  2. Rate-limited invocations will return information about the rate-limiting event to support in-app retry logic.

    • Rate-limiting errors will contain the field rateLimitProperties, which includes rateLimitValue, rateLimitRemaining, and rateLimitReset. These can be used to implement retry behaviour.

  3. Migrate rate limits from a fixed per-minute window to a per-second sliding window

    • This will support platform reliability with the removal of global limits

    • This will also provide an opportunity for graceful retry methods
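To illustrate the model, a sliding window admits a request only if fewer than the limit have been accepted in the trailing window, rather than resetting a counter at fixed minute boundaries. A minimal in-memory sketch (illustrative only, not the Forge platform's actual implementation):

```javascript
// Simplified sliding-window rate limiter: allows at most `limit` requests
// within any trailing window of `windowMs` milliseconds.
// Illustrative model only, not the platform's implementation.
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = []; // timestamps of accepted requests
  }

  // Returns true if a request at time `now` (in ms) is allowed.
  tryAcquire(now) {
    // Drop timestamps that have aged out of the trailing window
    while (this.timestamps.length && now - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(now);
      return true;
    }
    return false;
  }
}
```

With a limit of 3 per second, a fourth request inside the same trailing second is rejected, but becomes admissible as soon as the oldest accepted request ages out of the window — there is no hard minute boundary to wait for.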

The following table describes the proposed changes to user-initiated invocation limits, via the UI or web triggers. Our analysis of current invocation traffic indicates that a very small number of installations would experience rate limiting under these proposed limits. We will contact you directly if your app will be impacted to discuss next steps, but you are also welcome to reach out to us if you have questions or concerns about your app.

Level: Per user
Current: 1.2k RPM
Proposed: Remove

Level: Per app installation
Current: 5k RPM
Proposed: 300 RPS or 7k RPM, whichever is hit first (the per-second limit protects against spiky traffic)
Change: 1.4x increase over a one-minute timeline

Level: Per environment
Current: 30k RPM
Proposed: Remove
Change: Removal of the global limit provides more flexibility, especially for apps with large numbers of installs

Handling invocation rate limits

In preparation for these rate limits, we recommend implementing appropriate rate-limiting retry mechanisms in your Forge apps. Some examples of this are demonstrated below:

Retrying user-initiated UI invocations (i.e. invoke via resolver)

A back-off with retry method is suitable for user-initiated invocations as the invocation response is returned directly to the invoking user.

import { invoke } from '@forge/bridge';

async function fetchTextWithRetry() {
  const maxRetries = 3;
  let retries = 0;

  while (retries < maxRetries) {
    try {
      const data = await invoke('getText', { example: 'my-invoke-variable' });
      setData(data); // e.g. a state setter defined in the surrounding component
      return;
    } catch (error) {
      // Do not retry if the error is not related to rate limiting
      if (error.status !== 429) {
        throw error;
      }

      // Throw the error once the invocation has been retried the maximum number of times
      retries++;
      if (retries === maxRetries) {
        throw error;
      }

      // Wait until the rate limit window resets before retrying
      // This value will always be less than 1 second
      const delayInMilliseconds = Math.floor(error.rateLimitProperties['rateLimitReset'] - Date.now());
      if (delayInMilliseconds > 0) {
        await new Promise(resolve => setTimeout(resolve, delayInMilliseconds));
      }
    }
  }
}
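Where the reset timestamp is missing or unreliable, a common fallback (not prescribed by this RFC) is exponential backoff with full jitter; the helper below is an illustrative sketch, and its name and default parameters are invented for the example:

```javascript
// Backoff with "full jitter": a random delay between 0 and
// min(capMs, baseMs * 2^attempt). Randomising the delay avoids
// synchronized retry storms when many clients are rate limited at once.
function backoffDelayMs(attempt, baseMs = 100, capMs = 2000, random = Math.random) {
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * exponential);
}
```

The `random` parameter is injectable only so the behaviour can be tested deterministically; in app code the default `Math.random` suffices.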

Retrying web trigger invocations

async function callWebtriggerWithRetry() {
  const maxRetries = 3;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch('http://webtrigger.example.com', {
      method: 'POST',
    });

    // Retry if rate limit is exceeded
    if (response.status === 429) {
      if (attempt === maxRetries - 1) {
        throw new Error(`Failed to call webtrigger after ${maxRetries} attempts`);
      }

      // Wait until the rate limit resets before retrying
      // This value will always be less than 1 second
      const delayInMilliseconds = Math.floor(Number(response.headers.get('rateLimitReset')) - Date.now());
      if (delayInMilliseconds > 0) {
        await new Promise(resolve => setTimeout(resolve, delayInMilliseconds));
      }
      continue;
    }

    // Throw non-rate-limit errors
    if (response.status !== 200) {
      throw new Error(`Webtrigger returned status: ${response.status}`);
    }

    // Return successful response
    return response.json();
  }
}

Invocations not initiated by users (i.e. product events and async events)

These are retried under the hood and will eventually succeed. If a rate-limiting event does happen, developers should consider making requests in batches to distribute the load.
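As a sketch of the batching idea, work items can be split into fixed-size groups before being pushed, so the load is spread over several smaller requests rather than one burst (the appropriate batch size depends on the async events API's current caps; check the Forge documentation):

```javascript
// Split an array of work items into fixed-size batches so that load can be
// distributed over multiple pushes instead of a single burst.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```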

Asks

While we would appreciate any reactions you have to this RFC (even if it’s simply giving it a supportive “Agree, no serious flaws”), we’re especially interested in learning more about:

  • Whether installation-level, per-second limits provide enough flexibility for your apps, especially those with large numbers of users

    • If not, what processes and/or changes can we put in place to best support your use-case?
  • How else we can be providing support in the period leading up to this change

  • Other limits that may be impacting your apps which may also require review

1 Like

This proposal won’t work well for our apps, and I hope it’s materially changed before being put into effect. The current per-user scaling points never quite made sense to me, but the number of users in an instance is a good proxy for the amount of content in the installation. Our app – and, I’d guess, many others – makes a number of requests that scale approximately linearly with the amount of content (pages, blog posts, work items) in a Confluence or Jira instance. Thus, for operations that need to read many pieces of content, the largest customers will, by definition, pay the most (since they pay per user) and get the worst relative performance.

1 Like

Hi @kashev ,

I’m not quite sure I understand your feedback.

You say that per-user scaling points don’t make sense for you, but in this proposal we are removing the per-user scaling limits that exist today. The proposed rate limits will be greatly simplified and be only keyed by installation.

Could you provide some more information about why this won’t work for you, and what you’d like to see us do differently?

Rate limit scoped per tenant = structural isolation to prevent the noisy neighbour problem.
Rate limit scaled per user = quota calculation that matches expected usage demand (and acts as a ~decent proxy for tenant content scale).

Which means the formula should look something like:

per-tenant quota (RPM) = base + (total users * per-user allowance)

Enforcement would only occur at the per-tenant level not the per-user level. No need for RPS complexity. That gives you a simple predictable formula whether the install has 10 users or 10,000.
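For illustration, the suggested formula with invented numbers (the base and per-user allowance below are made up for the example, not proposed values):

```javascript
// Hypothetical per-tenant quota: a base allowance plus a per-user allowance.
// The default constants are illustrative only.
function perTenantQuotaRpm(totalUsers, baseRpm = 5000, perUserRpm = 20) {
  return baseRpm + totalUsers * perUserRpm;
}
```

Under these example constants, a 10-user install would get 5,200 RPM and a 10,000-user install 205,000 RPM, scaling smoothly with tenant size.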

And this should be applied consistently across all rate limits. eg if you’re going to say that global pool limits are illogical (of course they are!) then don’t apply them to the REST API rate limits.

10 Likes

Like @kashev and @nathanwaters said (I think), the number of licensed users should definitely affect the RPM limit.

It doesn’t seem to make a lot of sense to apply the same limits for example to 5 and 50K user instances.

Essentially repeating @nathanwaters formula:

per-tenant quota (RPM) = base + (total users * per-user allowance)

Whether you additionally apply a limit to individual users, or whether it’s RPM or RPS shouldn’t matter too much then.

The noisy neighbour problem must be solved, but the “noisy family member” problem (not sure if that exists :smiley: - I refer to other users on same instance) is probably not the biggest concern for vendors.

4 Likes

Thanks @nathanwaters and @jens for the feedback. At the time of invocation, we don’t have data on the total number of users on a tenant, so we aren’t able to apply that formula directly. Additionally, maintaining a uniform rate limit helps us to ensure the stability of our platform.

However, we can implement a process to assign a higher limit to tenants with a large number of users on a per-needs basis. Does this process sound like a suitable solution to this problem? Happy to hear your thoughts and ideas otherwise!

That data gap needs to be fixed first then.

A single tenant/install can have up to 100,000 users on Jira and 150,000 users on Confluence.

How would a rate limiting model work that is not tied to that user count variable?

REST API rate limit tier 1 global pool opens the entire marketplace to DoS attacks but they do use a variable user count. Instead of fixing the obvious flaw they too went with a “per-needs process” which has unnecessarily spun up a huge new bureaucratic waste of time.

13 Likes

I want to raise a security concern with removing per-user limits.

The current per-user cap (1.2k RPM) acts as a blast radius limiter – a single compromised account or malicious user cannot exhaust the entire installation quota and deny service to everyone else in the tenant. Removing it creates a straightforward DoS vector, and web triggers make it worse since they are often publicly accessible.

Pushing per-user enforcement down to app code is a regression, and most developers won’t think about the adversarial case.

5 Likes

@LilyYang – Thank you for this RFC.

Like the other responders, I am confused by the idea of implementing installation-level limits that are insensitive to the number of users in an installation…especially given that the RFC is scoped to user-initiated events.

A snippet from the problem statement:

“The existing model of global, app-level rate limits doesn’t scale for apps with large numbers of users and installations.”

It is clear that the proposed model would improve scalability for apps with a large number of installations. (Win!) But it is unclear how it improves scalability for apps with large numbers of users (per installation).

If I understand the proposal correctly, then:

# of users in an installation | Average limit per user
1 | 300 RPS or 7K RPM
10 | 30 RPS or 700 RPM
100 | 3 RPS or 70 RPM
1,000 | 0.3 RPS or 7 RPM
10,000 | 0.03 RPS or 0.7 RPM
100,000 | 0.003 RPS or 0.07 RPM

Is that an accurate interpretation? If so, it doesn’t seem scalable at all.

Of course, it is unreasonable to expect 100% of all users on a site to be using an app at the same time, but the concern remains. On the surface, it seems that a single-user site could use an app aggressively and never once be rate-limited, while an enterprise installation might experience rate-limiting routinely without aggressive usage.

I think it would help the conversation if you were to explain how the new model will be more scalable for “apps with large numbers of users,” which is explicitly in scope of the problem statement (as I read it).

14 Likes

process to assign a higher limit to tenants with a large number of users on a per-needs basis

As noted in prior comments, having to consider “per-needs” issues and request manual allocations would add complexity to a process that should ideally handle all of this automatically.

At the time of invocation, we don’t have data on the total number of users on a tenant, so we aren’t able to apply that formula directly

I agree with Nathan’s comment: it seems like Atlassian should first figure out the problem of how to get data on the total number of users for a tenant, then once that is solved, use that data to come back and design a per-tenant formula that scales.

I imagine that eventual consistency would be fine here for the majority of vendors (meaning no need to fetch seat count data in real time, so long as it converges within, say, half a day; a generous allocation could also be given to apps whose seat count data has not yet propagated).

Like other posters, I do like the prior per-user-per-minute limits that Atlassian is proposing to remove: 1200 invocations/minute is 20 invocations per second, which already seems like a lot. As mentioned in previous comments, having this does help to contain the blast radius from a single user.

Other problems that the per-user limit helps prevent: automations running amok, a single bad actor intentionally generating huge costs for an app, and Atlassian bug/design issues that overwhelm the app’s quota (such as the CQL macro rendering issue where one search can invoke thousands of macros).

It also seems like an app that requires 1200 invocations per user per minute could perhaps be optimized to better batch data. Regardless, if a potential one-minute outage still represents a problem for certain vendors, how about using a token bucket that is replenished at a higher frequency (every 15 seconds)?
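The token-bucket suggestion above could look roughly like this (capacity, refill amount, and interval are illustrative, not proposed values):

```javascript
// Simplified token bucket: holds up to `capacity` tokens and gains
// `refillPerInterval` tokens every `intervalMs` (e.g. every 15 seconds,
// as suggested above). Illustrative model only.
class TokenBucket {
  constructor(capacity, refillPerInterval, intervalMs) {
    this.capacity = capacity;
    this.refillPerInterval = refillPerInterval;
    this.intervalMs = intervalMs;
    this.tokens = capacity;
    this.lastRefill = 0;
  }

  // Returns true if a request at time `now` (in ms) is allowed.
  tryAcquire(now) {
    // Credit tokens for each full interval that has elapsed
    const intervals = Math.floor((now - this.lastRefill) / this.intervalMs);
    if (intervals > 0) {
      this.tokens = Math.min(this.capacity, this.tokens + intervals * this.refillPerInterval);
      this.lastRefill += intervals * this.intervalMs;
    }
    if (this.tokens > 0) {
      this.tokens--;
      return true;
    }
    return false;
  }
}
```

Unlike a fixed one-minute window, a bucket replenished every 15 seconds bounds any outage for a throttled user to at most one refill interval.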

9 Likes

Two things are enormously important for helping us deal with rate limits, both before changes and in general:

  1. ALWAYS return the rate limit properties in a header, not just when a request is rate limited. We need that info both to see how our app consumes the various limits in practice and to measure and adjust our traffic patterns. If you only give us the info when we are at or close to the rate limit, you force us to choose between deploying untested code and abusing your systems so we can test it, and it also makes it impossible to plan for our change and growth.
  2. Before any change is effective, ALSO send a separate set of rate limit properties that tells us exactly what the future algorithm will do. And do that for a long enough period of time that we can actually adapt our code to the new (future) behavior and test it properly. This includes deploying code to production that monitors and sends back telemetry about how much of those future limits would be used in practice.
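If such headers were always present, every response could feed app telemetry; a sketch of the consuming side, where the header names mirror the rateLimitProperties fields described in the RFC (rateLimitValue, rateLimitRemaining) but should be treated as assumptions:

```javascript
// Extract rate limit telemetry from a response's headers.
// The header names mirror the rateLimitProperties fields described in the
// RFC and are assumptions, not a documented contract.
function rateLimitTelemetry(headers) {
  const limit = Number(headers.get('rateLimitValue'));
  const remaining = Number(headers.get('rateLimitRemaining'));
  if (!Number.isFinite(limit) || !Number.isFinite(remaining) || limit <= 0) {
    return null; // headers absent or malformed
  }
  return {
    limit,
    remaining,
    utilisation: (limit - remaining) / limit, // fraction of the window consumed
  };
}
```

An app could log the `utilisation` field on every response to see how close it runs to the limit in practice, well before any 429 occurs.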
2 Likes

Haven’t read the RFC yet, but shouldn’t this be RFC-130? You already have RFC-129: Surfacing Partner Migration Plans to Site Admins for Apps Running on Connect

4 Likes

Thanks for the proposal; at least this is a step in the right direction. I would like to know how the per-app-installation limit would work relative to the number of users within the tenant; rather than have us guess, it’s better if you let us know. Likewise, echoing what @BogdanButnaru mentioned, it would be a great help if the headers always returned the rate limit data. I think that part is not too complicated to achieve; companies like Datadog always return these rate-limit headers in their APIs, so Atlassian can do the same.

Thank you again for coming up with the proposal and for working on a more stable Forge platform for everyone. I’m sure with everyone’s comments you can get an overall picture of what the community is asking for.

Thanks for the RFC.

I agree with the concerns already raised by others. In particular, I think installation-level limits should take tenant size into account, since a flat limit does not scale equally for small tenants and large enterprise sites.

I also strongly support always exposing rate limit headers before the limit is reached, so apps can observe real usage, tune traffic, and adapt safely to the new model.

These two points would make the proposal much easier to work with in practice.

1 Like