For our Confluence Cloud addon, we have to implement ADF parsers. We are worried about loading whole ADF documents into memory upfront when we want to transform them. Is there a maximum size for an ADF document?
We are having difficulty writing a parser that uses Java streams (I'm talking about java.util.stream, not I/O streams), because ADF does not specify the position of the "type" property within a node. For example, consider an excerpt along these lines, where "type" comes first:
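```json
{
  "type": "table",
  "content": [
    {
      "type": "tableRow",
      "content": [ { "type": "tableCell", "content": [] } ]
    }
  ]
}
```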
In the above excerpt, I know the type is "table" as soon as I read the first property, so I can reinitialize all my variables and parse each row of the table without having to wait for the closing bracket of the table node.
But in an excerpt like the following, where "type" comes last, I have to keep everything in memory and wait until the end of the node to learn that it was, in fact, a table:
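```json
{
  "content": [
    {
      "content": [ { "content": [], "type": "tableCell" } ],
      "type": "tableRow"
    }
  ],
  "type": "table"
}
```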
Since the order is not specified, and the same situation can occur for every node type, I need to load the whole DOM into memory before I can do any work. And loading the full DOM is only safe if the maximum document size is bounded.
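To make the problem concrete, here is a minimal sketch of the dilemma, assuming Jackson's streaming JsonParser (the library choice and the readNode helper are illustrative, not our actual code):

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class AdfNodeSketch {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Hypothetical helper: reads a single ADF node object and tries to
    // dispatch on "type" as early as possible.
    public static void readNode(String json) throws Exception {
        try (JsonParser p = new JsonFactory().createParser(json)) {
            p.nextToken(); // consume the node's START_OBJECT
            while (p.nextToken() == JsonToken.FIELD_NAME) {
                String field = p.getCurrentName();
                p.nextToken(); // advance to the field's value
                if ("type".equals(field)) {
                    // Best case: "type" arrives before "content", so a true
                    // streaming handler could take over right here.
                    System.out.println("early dispatch on type=" + p.getText());
                    return;
                }
                // Worst case: "content" (or any other field) precedes "type";
                // we must buffer the whole subtree before knowing the node type.
                JsonNode buffered = MAPPER.readTree(p);
                System.out.println("buffered '" + field + "' ("
                        + buffered.size() + " children) before knowing the type");
            }
        }
    }
}
```

In the worst case, every "content" subtree gets buffered, which is exactly the whole-document memory cost we are trying to avoid.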
Thank you very much for your research. Do you know whether Confluence uses a streaming API to parse ADF, or whether it builds a complete in-memory model of the entire document?
If we stay close to their implementation, at least we can respond to format changes more easily.
Best regards,
Adrien