Is there a way to serialize access to a Forge storage entry

Hello,

For various reasons I would like to store multiple pieces of information as an object in a Forge storage entry. However, I need a way to serialize updates to this information - it is possible that several users will try to update it at the same time, and a last-write-wins strategy will not work in this case. I would need to delay all other access to this storage entry while the current connection reads and updates it.
Could you please tell me if there is any way to achieve this with the Forge storage API and/or framework?

Thank you,
Bogdan


Hello @mabo

Have you checked whether our transactions feature for custom entities would help solve your use case?

i.e. if you mark the data being written as transient and always check for this condition before persisting the data, you can serialize data access. TTLs can be used here as well, if that works for your use case.
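As a minimal sketch of that idea, shown against the key-value store for brevity (the field names, the expiry window, and the `updateSerially` helper are illustrative, not part of any built-in feature):

```typescript
import { storage } from '@forge/api';

const TRANSIENT_WINDOW_MS = 30_000; // assumed "TTL"; tune for your workload

// Mark the entry as transient before working on it, and refuse to start
// if another invocation has already marked it and the mark hasn't expired.
async function updateSerially(key: string, update: (current: unknown) => unknown): Promise<boolean> {
  const entry = await storage.get(key);
  const now = Date.now();

  if (entry?.transient && now - entry.transientSince < TRANSIENT_WINDOW_MS) {
    // Someone else is mid-update; back off and let the caller retry later.
    return false;
  }

  // Mark as transient. Note this check-then-set is not atomic; the rest of
  // the thread discusses the remaining race window.
  await storage.set(key, { ...entry, transient: true, transientSince: now });

  const updated = update(entry?.value);
  await storage.set(key, { value: updated, transient: false, transientSince: null });
  return true;
}
```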

Hey @mabo, I actually made an attempt to solve this same problem in my own Forge application recently (Atlassian Marketplace). Unfortunately the code isn't open source, but I'll try to publish a reusable package in the near future.

The approach I took was as follows:

  1. Ensure that the data object being saved includes an attribute with the last-saved timestamp
  2. When you load the object, this timestamp will be included. When you go to make an update, ensure that the timestamp of the loaded data is included
  3. In your save function, compare the timestamp received with the latest timestamp on the object and reject the request if the object has been updated since it was loaded
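A rough sketch of those three steps against the key-value storage API (the `lastSavedAt` field name and the conflict error are just illustrations):

```typescript
import { storage } from '@forge/api';

interface VersionedDoc {
  data: unknown;
  lastSavedAt: number; // epoch millis written on every save
}

// Steps 1 and 2: the load returns the payload together with its timestamp.
async function load(key: string): Promise<VersionedDoc | undefined> {
  return await storage.get(key);
}

// Step 3: reject the save if the stored timestamp has moved on since the load.
async function save(key: string, data: unknown, loadedAt: number): Promise<void> {
  const current: VersionedDoc | undefined = await storage.get(key);
  if (current && current.lastSavedAt !== loadedAt) {
    throw new Error('Conflict: the object was updated after it was loaded');
  }
  await storage.set(key, { data, lastSavedAt: Date.now() });
}
```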

Here’s where it gets tricky though. If there is high concurrency for the application then you need to prevent simultaneous access. So you can create another object in the store that you use as a “lock” that must be “acquired” before updates can be made (and the lock is released after the update is made).

If you’re using the key-value storage approach then you still cannot guarantee this will work (there is a small window in which two requests could update the lock simultaneously), but if you’re using the SQL solution then it can guarantee that two requests won’t both acquire the lock at the same time.
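For reference, a naive lock over the key-value store might look like this (the key name and staleness window are illustrative); the race window mentioned above sits between the get and the set:

```typescript
import { storage } from '@forge/api';

const LOCK_KEY = 'my-data:lock'; // assumed key naming
const LOCK_TTL_MS = 15_000;      // treat older locks as released

// Best-effort lock on top of the key-value store. The get/set pair is not
// atomic, so two invocations can still slip through in a narrow window.
async function acquireLock(owner: string): Promise<boolean> {
  const lock = await storage.get(LOCK_KEY);
  if (lock && Date.now() - lock.acquiredAt < LOCK_TTL_MS) {
    return false; // someone else holds a fresh lock
  }
  await storage.set(LOCK_KEY, { owner, acquiredAt: Date.now() });
  return true;
}

async function releaseLock(owner: string): Promise<void> {
  const lock = await storage.get(LOCK_KEY);
  if (lock?.owner === owner) {
    await storage.delete(LOCK_KEY);
  }
}
```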

Unfortunately there is nothing out-of-the-box in Forge that does this for you, which is why you’ll have to write some code to achieve it, but it is possible.

I hope this helps. It’s relatively complex, though, so it will be better for me to put this together as an example in an open-source repository.

Regards,
Dave


Hi @varun

Thank you for your response. Could you point me to any documentation that outlines how conflict resolution is handled during a Custom Entity transaction? Additionally, is there a guarantee that the entries referenced in the transaction’s update and condition clauses are properly locked throughout its execution?

Thank you

Hi @ddraper ,

Thank you for your response. I was thinking along similar lines - though I hadn’t considered using the database to create the “lock” objects. It’s certainly a robust approach but feels like overkill for this use case.

I’ve been exploring alternatives like Custom Entity transactions, although I haven’t found any guarantees around serialization. The Forge Cache API seems more promising, especially with its setIfNotExists and TTL features, but it’s still in EAP.
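For example, if the Cache API’s setIfNotExists behaves as described, a lock on top of it might look roughly like this (the import path, method signatures, and option names are my guesses at the EAP client, not a confirmed API):

```typescript
// NOTE: the import path, client shape, and option names below are assumptions
// based on the EAP description; check the Cache API docs before using this.
import cache from '@forge/cache';

const LOCK_KEY = 'my-data:lock';

async function withCacheLock<T>(work: () => Promise<T>): Promise<T | null> {
  // setIfNotExists would be atomic, so only one invocation wins the lock,
  // and the TTL releases it automatically if the function crashes.
  const acquired = await cache.setIfNotExists(LOCK_KEY, 'locked', { ttlSeconds: 30 });
  if (!acquired) {
    return null; // caller decides whether to retry or give up
  }
  try {
    return await work();
  } finally {
    await cache.delete(LOCK_KEY);
  }
}
```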

What strikes me as odd is how little discussion there seems to be around this subject. Forge storage feels like a solid foundation for caching, yet without proper support for serialization or atomic operations, it cannot safely be used as a cache mechanism.

Thank you


@mabo I think you’re 100% right, and the more conversations there are like this, the more the need will be highlighted. Ideally there would be a solution available out of the box, although I can see that it is not a requirement for all Forge applications.

In my case the need was there because there was a high likelihood that two people might land on the app for the same board at the same time, and with the auto-save approach I’d taken there was a high probability of data loss.

I’m very aware that the solution I’ve implemented doesn’t guarantee safe atomic transactions but it’s a pragmatic approach that will greatly reduce the chances of a problem occurring.

I’m not actually in the Ecosystem team (I’m just an enthusiastic fan of Forge), but my understanding is that there is constant iteration on storage capabilities (as demonstrated by the new APIs that are available), so hopefully this will become easier in the future.

I will try and get my code shared ASAP though so you can at least take a look at it!

We have encountered similar problems and agree that Forge should provide a locking mechanism that applications can use to perform concurrency control between lambdas. Ideally we could do this with atomic Forge cache operations, using Redlock or a similar algorithm. Atlassian, it seems, frowns on this as a non-scalable solution due to the lock-acquisition load and would prefer that you use something like transactions. However, as you’ve noted, the transaction boundary and lock-control capabilities there are quite limited, and so are the throttle limits.

One alternative approach you might consider is to serialize your work by creating task events and sending them through an Async Event Queue that has been configured with a concurrency limit of 1. It’s not great, but it does allow serialized access to a resource without competition from other lambdas.
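As a rough sketch of that pattern (the queue key, payload shape, and resolver key are illustrative; the concurrency limit itself is configured for the consumer as described in the Async Events documentation):

```typescript
import { Queue } from '@forge/events';
import Resolver from '@forge/resolver';
import { storage } from '@forge/api';

// Producer: instead of touching storage directly, enqueue the work.
const updateQueue = new Queue({ key: 'storage-updates' });

export async function requestUpdate(entryKey: string, patch: Record<string, unknown>) {
  await updateQueue.push({ entryKey, patch });
}

// Consumer: with the queue's concurrency limited to 1, these handlers run
// one at a time, so the read-modify-write below is effectively serialized.
const resolver = new Resolver();
resolver.define('storage-update-listener', async ({ payload }) => {
  const { entryKey, patch } = payload as { entryKey: string; patch: Record<string, unknown> };
  const current = (await storage.get(entryKey)) ?? {};
  await storage.set(entryKey, { ...current, ...patch });
});
export const handler = resolver.getDefinitions();
```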

I’ve done something similar. Here’s a rough outline of the approach:

We built a chunked storage system on top of Forge’s storage API to handle JSON datasets that exceed the 240KiB limit. The main challenge was making it safe for multiple users/processes to read and write simultaneously without corrupting data.

High-Level Approach:

  • Split large objects into chunks with a specific metadata block that is used for tracking
  • Use atomic write patterns and validation to ensure data consistency
  • Implement cleanup mechanisms for failed operations

Key Concurrency Mechanisms:

UUID Isolation: Each multi-part write gets a unique UUID appended to chunk keys (data_chunk_0_abc123) as well as a full set of metadata. This prevents chunks from different concurrent writes from mixing - even if two processes write to the same logical key simultaneously, their chunks remain separate. NOTE: the metadata includes the format version, app version, timestamp, whether the write is one chunk or many, and lastly the actual data.

Write-Then-Metadata Pattern: Always write all chunks first, then the main metadata last. The main metadata is the key that defines the ownership of the storage. This ensures readers never see incomplete data - they either get the complete dataset or nothing at all.
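A simplified sketch of the write path (key naming, chunk size, and metadata fields are illustrative, not our exact implementation):

```typescript
import { storage } from '@forge/api';
import { randomUUID } from 'crypto';

const CHUNK_SIZE = 200 * 1024; // stay under the per-value limit; adjust as needed

// Write all chunks first, then the main metadata last, so a reader either
// sees a complete, self-consistent dataset or no new dataset at all.
async function writeChunked(logicalKey: string, json: string): Promise<void> {
  const writeId = randomUUID();
  const chunks: string[] = [];
  for (let i = 0; i < json.length; i += CHUNK_SIZE) {
    chunks.push(json.slice(i, i + CHUNK_SIZE));
  }

  // 1. Chunks, each keyed with the write's UUID so concurrent writes can't mix.
  await Promise.all(
    chunks.map((chunk, i) =>
      storage.set(`${logicalKey}_chunk_${i}_${writeId}`, {
        writeId,
        index: i,
        data: chunk,
      })
    )
  );

  // 2. Main metadata last; this is what "publishes" the new dataset.
  await storage.set(logicalKey, {
    writeId,
    chunkCount: chunks.length,
    totalSize: json.length,
    savedAt: Date.now(),
  });
}
```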

Chunk Validation: When reading, we validate that every chunk’s metadata (UUID, timestamp, total size, etc.) matches the main metadata. If any chunk is from a different write operation, we delete the entire corrupted dataset and return null.
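The matching read path, in the same simplified form:

```typescript
import { storage } from '@forge/api';

// Read the main metadata, fetch the chunks it points at, and reject the
// dataset if any chunk belongs to a different write operation.
async function readChunked(logicalKey: string): Promise<string | null> {
  const meta = await storage.get(logicalKey);
  if (!meta) return null;

  const parts: string[] = [];
  for (let i = 0; i < meta.chunkCount; i++) {
    const chunk = await storage.get(`${logicalKey}_chunk_${i}_${meta.writeId}`);
    if (!chunk || chunk.writeId !== meta.writeId || chunk.index !== i) {
      // Corrupted/mixed dataset: treat it as missing (deleting the stale
      // entries is left to the cleanup job).
      return null;
    }
    parts.push(chunk.data);
  }

  const json = parts.join('');
  return json.length === meta.totalSize ? json : null;
}
```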

Jittered Rate Limiting: Multiple processes hitting Forge’s rate limits would normally retry simultaneously, making the problem worse. We add random jitter to retry delays, spreading the load over time.
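The retry wrapper is small; something along these lines (the backoff constants are illustrative):

```typescript
// Retry a storage call with exponential backoff plus random jitter so that
// concurrent invocations hitting a rate limit don't all retry in lockstep.
async function withJitteredRetry<T>(fn: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let attempt = 0;
  for (;;) {
    try {
      return await fn();
    } catch (err) {
      attempt += 1;
      if (attempt >= maxAttempts) throw err;
      const base = 250 * 2 ** attempt;     // exponential backoff
      const jitter = Math.random() * base; // spread retries over time
      await new Promise((resolve) => setTimeout(resolve, base + jitter));
    }
  }
}
```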

Cleanup Process: A scheduled weekly background task removes orphaned chunks from failed writes based on age, preventing storage bloat.

Task Isolation: Each operation gets a unique taskId, creating separate namespaces so concurrent operations don’t interfere.

Downsides:

  1. We are dealing with JSON, and to split it into chunks we had to escape the string and write each chunk as a string, which causes undesirable bloat in the size of the value.
  2. It’s irritating to have a process to clean up orphaned chunks.
  3. It feels like a lot of code for something that seems like it should be simple.

Would appreciate any feedback on the approach including potential issues or suggestions to make it more efficient.

@mabo - Apologies for the delayed response (I missed the notification for your reply).

…that outlines how conflict resolution is handled during a Custom Entity transaction

  • If there’s an ongoing transaction on the entities specified in the transaction, only one transaction will be allowed to proceed, preventing concurrent writes.

  • Transactions are required to complete atomically, resulting in all-or-none semantics.

… is there a guarantee that the entries referenced in the transaction’s update and condition clauses are properly locked throughout its execution?

  • The underlying data store uses optimistic locking to prevent concurrent writes. Should the underlying version change for the entry that is being updated, you will receive an error from the storage API.
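In practice that means callers should be prepared to retry when that error is returned. A rough sketch of such a retry wrapper, where `runMyTransaction` is a placeholder for your own transaction call and the conflict check is an assumption about how the error surfaces:

```typescript
// `runMyTransaction` stands in for a function that re-reads the entry and
// performs the custom-entity transaction; the conflict detection below is an
// assumption - adjust it to match the actual error returned by the storage API.
async function saveWithConflictRetry(
  runMyTransaction: () => Promise<void>,
  maxAttempts = 3
): Promise<void> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await runMyTransaction();
      return;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      const isConflict = /conflict|version/i.test(message); // assumed error shape
      if (!isConflict || attempt === maxAttempts) throw err;
      // Another writer won the optimistic lock; loop to re-read and retry.
    }
  }
}
```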

We will update the transactions API documentation to capture these points around locking and atomic commits.

Thanks,

Varun

Forge should provide a locking mechanism that applications can use to perform concurrency control between lambdas

There’s some work happening on this that may help your use case. Let me get my colleagues who are capturing requirements/building a solution to chime in here.

Hi @jeffryan ,

Thank you for sharing your perspective on this issue. I had considered using async events at one point, but was advised against it by someone at Atlassian. Their reasoning was: “As far as I understand, async events may not work in this circumstance, as there is no ordered consistency.”
It’s possible this guidance predated the introduction of concurrency limits, which I believe weren’t supported initially. Alternatively, I may not have articulated the specifics of my use case clearly enough at the time.

Thank you @varun for providing these additional details and it would be great if they would be included in the documentation.

Hi @lead_crimsalytics ,

Thank you for sharing your approach to this issue. I may be missing something, but from what I can tell, it looks like you’re working with a single metadata block that can be updated by multiple connections - which seems to reintroduce the original problem.

If you’re using multiple metadata entries instead, how do you determine whether they are separate or overlapping requests?

Hi @mabo - you’re 100% correct, as I’m using a last-writer-wins approach, which works for me because I’m dealing with a cache. Apologies, as I missed that detail in your original request. It seems like some sort of semaphore or mutex is required, which is incredibly irritating in this day and age; we ought to be able to write sizable blocks of data in Forge without massive amounts of code. The net result of this gymnastics exercise will be fewer apps, less stability, and more frustration.

To achieve that locking, you could add it to the metadata block on the first chunk. The downside is that acquiring the lock is a read, followed by a write, followed by another read to make sure you got the lock. Theoretically two processes could still break this, and some platform- or language-level lock would be necessary; others noted this earlier. The window is probably pretty small, but these kinds of things make engineers crazy and users frustrated if/when they do happen.
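A rough sketch of that read/write/read-back check (the key name and staleness window are illustrative):

```typescript
import { storage } from '@forge/api';
import { randomUUID } from 'crypto';

// Read the metadata, write our token into it, then read it back to confirm
// we are the ones holding the lock. Two processes can still interleave these
// steps, so this narrows the window rather than closing it.
async function tryLockMetadata(metaKey: string): Promise<string | null> {
  const token = randomUUID();
  const meta = (await storage.get(metaKey)) ?? {};

  if (meta.lockToken && Date.now() - meta.lockedAt < 15_000) {
    return null; // a fresh lock is already held
  }

  await storage.set(metaKey, { ...meta, lockToken: token, lockedAt: Date.now() });

  // Read back: if another process overwrote the lock between our write and
  // this read, our token won't match and we back off.
  const verify = await storage.get(metaKey);
  return verify?.lockToken === token ? token : null;
}
```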

Apologies for not being able to help!
