Exporting Attachments (advanced)

I have a very large attachments folder (several terabytes) that I would like to export. Since space is at a premium, I have written a custom export plugin that only counts the bytes written to the various output streams (it actually does more than that, but that is irrelevant to this question).

I am currently running the count against our server and it is going to take a few days. I would like to shorten this window if possible. A find command with +1G shows that a significant chunk of the space is taken up by files larger than 1 GB.
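
For anyone who wants to reproduce that measurement without find, here is a minimal Python sketch of the same scan. It assumes the check was something like find's -size +1G (files strictly larger than 1 GiB). The function names and the threshold parameter are my own, not part of Confluence or the plugin.

```python
import os
from pathlib import Path

ONE_GIB = 1024 ** 3  # assumed threshold, matching find's "+1G" (units of 1 GiB)


def find_large_files(root, threshold=ONE_GIB):
    """Return (path, size) pairs for regular files larger than `threshold` bytes,
    largest first."""
    large = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # Skip symlinks and anything that is not a regular file.
                if path.is_symlink() or not path.is_file():
                    continue
                size = path.stat().st_size
            except OSError:
                # The file vanished or is unreadable; ignore it in this sketch.
                continue
            if size > threshold:
                large.append((path, size))
    return sorted(large, key=lambda item: item[1], reverse=True)


def summarize(large):
    """Return (file count, total bytes) for the result of find_large_files."""
    return len(large), sum(size for _path, size in large)
```

Calling summarize(find_large_files(attachments_dir)) on the attachments directory gives the number of oversized files and the bytes they occupy, which is the figure needed to judge how much of the export they account for.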

The question I have is, can I safely omit these files from the export, while writing their location to a script file that will tar them manually, allowing them to be copied after the import? Or will I need to create a custom import to handle the other direction? By “safely” I mean “if I put an ‘if’ test in my plugin”, because I can fairly easily omit them.
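
To make the idea concrete, here is a rough Python sketch of the split described above: one size test decides whether a file is skipped, and the skipped paths go into a manifest for tar. The real plugin would be Java and would apply the test per attachment; should_skip, write_manifest, and the manifest layout are hypothetical names for illustration only. It does not settle whether the import tolerates the missing files.

```python
import os
from pathlib import Path

ONE_GIB = 1024 ** 3  # assumed cut-off; adjust to taste


def should_skip(size_bytes, threshold=ONE_GIB):
    """The 'if' test: True when a file is too large to go through the export."""
    return size_bytes > threshold


def write_manifest(root, manifest_path, threshold=ONE_GIB):
    """Write the paths of skipped files, relative to `root`, one per line.

    Returns the sorted list of relative paths that were written.
    """
    root = Path(root)
    skipped = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink() or not path.is_file():
                continue
            if should_skip(path.stat().st_size, threshold):
                # Relative paths keep the archive layout independent of
                # where the attachments directory lives on each server.
                skipped.append(path.relative_to(root).as_posix())
    skipped.sort()
    with open(manifest_path, "w", encoding="utf-8") as fh:
        for rel in skipped:
            fh.write(rel + "\n")
    return skipped
```

With GNU tar, a manifest like this can be fed to something along the lines of tar -cf large-attachments.tar -C /path/to/attachments -T manifest.txt, and the archive unpacked into the same relative location on the target after the import.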

Hm… your description is very vague…
First question: Why do you have terabytes of attachments in Confluence anyway? Maybe Confluence isn’t the right tool for you? Maybe you should clean up content first and get rid of old data?

I think the only files you can safely omit are the extracted text files. See: Hierarchical File System Attachment Storage | Confluence Data Center and Server 8.0 | Atlassian Documentation

Otherwise, you may have some old attachment revisions that you could skip. Note: you should remove revisions via the API (to keep the DB clean as well) rather than simply not storing the files. Of course, you will lose those revisions this way.

First answer. Some of the attachments are large binaries. 106 of them account for 1.3 TB.

Second answer. Yes, it is not the right tool. But I am only the hired help, not Senior Management.

Third answer. I have already purged the old data, reducing the overall size by about 30%. I have a recurring scheduled job that informs the space owners how much space their content is taking, but I cannot force them to remove it.

I have additional data since the original post. The export time is around 60 hours, and the export file size is around 1.7 TB. I estimate, based on analysis of the export, that removing the 106 files mentioned above will reduce the export time to approximately 20 hours. And, of course, the import would be faster as well.