RFC-128: Local development mocks for Forge Storage

Thank you again for the work on this @EthanFernandez !
Answering your question:

Given SQL is your primary concern, would you want direct access to the local database or would a convenience api be preferred / just as valuable?

The key requirements for us are high performance when bulk loading large datasets and full support for all database types without loss of fidelity. If the convenience API can guarantee this, it would be sufficient for our needs.

Thanks for your response.

Ideally, I’d prefer rate limits and latency behaviour to be as close as possible to production, as that would help surface potential issues early in the development cycle.

If those limits and latency settings could also be configurable, that would definitely be a plus, as it would give teams more flexibility depending on their testing strategy.

Hi team,

Thanks for sharing this RFC. At Alchemist Dev, we usually work with PostgreSQL.

Here is our feedback regarding your questions:

  1. The most important Forge storage offering we work with is SQL.

  2 & 3. Regarding the capabilities, we need tools that allow us to interact with the database efficiently. Specifically, we would like to see:
  • Options to verify data quality.

  • The ability to execute fast queries locally.

  • A control system or interface to view the data quickly and easily. This is very important for debugging, so we can inspect the exact state of the data when an error occurs.

Thanks for your work on this.

Summary Post

My team and I first want to give a big thank you to everyone who’s replied so far. The feedback and comments you’ve shared have been very useful to our decision-making and have given us a better understanding of your workflows.

The feedback has shaped the conversations around the design, and we have a few follow-up questions.

Additional questions

How you run your end-to-end tests at the moment

We understand that some of you are developing apps with container compute, and may not be using the JavaScript SDKs for storage capabilities. If you’re one of those, we’d like to better understand your existing setup (e.g. programming language, etc.).

Could we please get some user stories from you? For example:

  • We use container compute and our app is built in Golang, so we have created our own package on top of the REST APIs for KVS.

  • We deploy a new Forge environment to test changes, install it on our site, and use a tunnel for our e2e tests with Cypress.


Once again, thank you for all the positive and thoughtful feedback. We’ll keep monitoring, discussing, and replying as we iterate on the design.

Thank you for your reply and detailed feedback @AnttiPeltola !

  • Mock database for Forge SQL without ratelimits

We likely won’t support rate limits in the initial release.

  • Access to that database from outside of the app.

  • Related to “Bulk seeding data”: Preferably, a way to use different seed data for different test scenarios.

Developers frequently request this feature, and it remains a priority. We aim to include it in the initial release.

  • To run e2e tests in parallel, you generally need to be careful about what data gets wiped during a test case, so the seeding should also offer fine-grained control (it should not simply wipe the entire database).

We wouldn’t want to impose anything on how you run your e2e tests, but from experience, running tests in parallel can cause flakiness, so we would advise against it. The ability to seed data may help you here, though.
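To make the "different seeding for different test scenarios" idea concrete, here is a rough sketch of scenario-scoped seeding with fine-grained resets. Everything here is hypothetical: the scenario names, table names, and row data are invented for illustration, and the function only builds SQL statements; how they get executed against the mock database is left open.

```javascript
// Hypothetical scenario catalogue: each scenario declares which tables it
// owns and which rows to insert. Names and data are illustrative only.
const scenarios = {
  empty: { tables: ['projects', 'tasks'], rows: {} },
  smallTeam: {
    tables: ['projects', 'tasks'],
    rows: {
      projects: [{ id: 1, name: 'Alpha' }],
      tasks: [{ id: 1, project_id: 1, title: 'First task' }],
    },
  },
};

// Build the statements for one scenario: clear only the tables the scenario
// owns (not the whole database), then insert its rows. This is the
// "fine-grained control" point from the comment above.
function buildSeedStatements(scenarioName) {
  const scenario = scenarios[scenarioName];
  if (!scenario) throw new Error(`Unknown scenario: ${scenarioName}`);
  const statements = scenario.tables.map((t) => `DELETE FROM ${t}`);
  for (const [table, rows] of Object.entries(scenario.rows)) {
    for (const row of rows) {
      const cols = Object.keys(row).join(', ');
      const vals = Object.values(row)
        .map((v) => (typeof v === 'number' ? v : `'${v}'`))
        .join(', ');
      statements.push(`INSERT INTO ${table} (${cols}) VALUES (${vals})`);
    }
  }
  return statements;
}
```

A test would pick a scenario per suite (e.g. `buildSeedStatements('smallTeam')`) and run the resulting statements during setup, leaving unrelated tables untouched for tests running in parallel.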


@EthanFernandez

Here’s more detail:

Regarding the compute, we use containers with Node.js. We have created a custom client for Knex, which handles the SQL HTTP connection.

Our E2E tests run in CI using Playwright against a real Atlassian cloud instance.
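For readers wondering what "handles the SQL HTTP connection" might look like, here is a minimal sketch of the request-building half of such a client. To be clear about assumptions: the endpoint path, payload shape, and auth scheme below are all invented; the post does not describe the real Forge SQL wire format, and this is not the actual Knex integration.

```javascript
// Hypothetical sketch: build an HTTP request carrying a SQL query plus
// bindings. The `/sql/execute` path, JSON body shape, and Bearer auth are
// assumptions for illustration, not the real Forge SQL protocol.
function buildSqlRequest(baseUrl, token, query, bindings = []) {
  return {
    url: `${baseUrl}/sql/execute`, // assumed endpoint
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`, // assumed auth scheme
      },
      body: JSON.stringify({ query, bindings }),
    },
  };
}
```

A custom Knex client would then hand the compiled query and bindings to something like this builder and `fetch` the result, instead of opening a TCP database connection.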

More details about the e2e flow:

  1. Branch-based environment creation — The CI pipeline derives a Forge environment name from the branch, giving each feature branch an isolated deployment.

  2. Build & deploy — The pipeline builds the frontend, builds a Docker image for the container, pushes it to Forge’s container registry, and then deploys it to the previously created environment. Our e2e tests don’t use Forge tunnels.

  3. Install on test site — The pipeline resolves the environment ID via the Forge GraphQL API and installs the app on a dedicated test instance.

  4. Create web triggers for database access — We create Forge web triggers for executing SQL. These give the test runner HTTP endpoints to interact with Forge SQL from outside the app — necessary because FSQL is otherwise only accessible from within the Forge runtime.

  5. Smoke tests — A smoke test suite runs first to verify the environment is healthy. If it fails, the pipeline exits early.

  6. Global setup & authentication — Playwright’s global setup authenticates test users against Atlassian login (including 2FA/OTP), persists session cookies, and seeds the FSQL database with test data via the web trigger.

  7. E2E test execution — Playwright tests run serially against the Confluence instance with the app installed. Tests use the web trigger to set up and tear down FSQL state per test. We capture traces, video, and screenshots on failure with retries on CI.

  8. Cleanup — A separate script deletes the branch-specific Forge environments nightly.

Note on flakiness:

  • Our database structure does allow us to run the majority of test cases in parallel without their being affected by other tests, but a limited set needs to run in sequence to avoid flakiness. Also, we don’t edit common Atlassian data accessed by the tests, as that would be flaky.

Hi @EthanFernandez ,

Would latency modelling be the most important feature to you, and would you want it configurable or simply as accurate as possible?

That is a good question. On the one hand, I would prefer latency modelling to simply be as accurate as possible; on the other hand, sometimes you also want to test worst cases that differ from the average case. If you let us configure it and provide a preset for “as accurate as possible”, that would be ideal.
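The preset-plus-overrides idea above could be sketched roughly like this. The preset names and all the millisecond figures are invented for illustration; a real mock would calibrate the “accurate” preset against measured production latencies.

```javascript
// Hypothetical latency configuration: named presets with per-field
// overrides. Numbers are illustrative only, not production measurements.
const latencyPresets = {
  accurate: { meanMs: 8, jitterMs: 4 },      // assumed production-like figures
  worstCase: { meanMs: 250, jitterMs: 100 }, // assumed pathological case
  none: { meanMs: 0, jitterMs: 0 },
};

// Resolve a user config into concrete latency parameters. Defaults to the
// "accurate" preset; explicit overrides win over the preset's values.
function resolveLatency(config = {}) {
  const preset = latencyPresets[config.preset ?? 'accurate'];
  if (!preset) throw new Error(`Unknown preset: ${config.preset}`);
  return { ...preset, ...config.overrides };
}

// Sample one simulated delay: mean plus uniform jitter in [-jitter, +jitter].
function sampleDelayMs(latency, random = Math.random) {
  return latency.meanMs + (random() * 2 - 1) * latency.jitterMs;
}
```

So `resolveLatency()` gives the accurate default, while `resolveLatency({ preset: 'worstCase', overrides: { meanMs: 500 } })` lets a team push beyond the preset for stress scenarios.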

Would rate limiting and / or error injection be valuable for you?

Yes, absolutely. But also configurable. We noticed that some endpoints have “hidden” rate limits that differ from the documented ones, so being able to adjust these would be nice.
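One common way a local mock could expose configurable rate limits is a token bucket, sketched below. This is a generic technique, not anything the RFC specifies; the capacity and refill numbers are placeholders, and per-endpoint (including “hidden”) limits would simply mean one bucket per endpoint.

```javascript
// Hypothetical configurable rate limiter for a local mock, implemented as a
// token bucket. All parameters are illustrative.
class TokenBucket {
  constructor({ capacity, refillPerSecond }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  // Returns true if the request is allowed, false if the mock should answer
  // with a 429, mimicking production rate limiting.
  tryConsume(now = Date.now()) {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Making `capacity` and `refillPerSecond` part of the mock's configuration would cover both the documented limits and any adjusted values a team discovers empirically.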

Thanks again to everyone who has contributed feedback on RFC‑128 so far. The comments have been extremely helpful in shaping our initial design and clarifying which capabilities matter most for real-world workflows.

For the first release, we’ll focus on:

  • Local mocks for Forge KVS and SQL, usable in tunnel mode

  • Strong API parity (pagination, query behaviour, SQL syntax/dialect)

  • Data seeding and reset to support repeatable tests

  • Side access for SQL to inspect and manage data locally

We’re explicitly deferring, but tracking:

  • Rate limit simulation, fault/error injection, and latency simulation

  • OS support

We’re closing this RFC and moving ahead with implementation based on this scope.

Thank you.


CC: @ErkkiLepre @BenRomberg @clouless @FlixAndre @chrschommer @jonlopezdeguerena @ac-tom @billjamison @UlrichKuhnhardtIzym1 @AnttiPeltola

CC: @FranciscoGomez

Thank you :slight_smile: