Confluence Server 7.14 Hibernate Errors

jcarter · September 7, 2021, 10:22pm

With the release of the Confluence 7.14 EAP we’ve done some testing of ScriptRunner for Confluence. While most features seem to work about as expected, a lot of our automated tests are failing with hibernate errors whenever they try to cleanup after themselves by deleting a test page or test space that they created.

I assume that this is due to the Hibernate upgrade, but it’s not immediately obvious what to do about it from the upgrade notes.

We’re not using any of the now forbidden APIs (HibernateContentQueryFactory, RegionAccessStrategy).

The first such error in our test suite comes when trying to delete a page via the com.atlassian.confluence.core.ContentEntityManager$removeContentEntity

2021-09-05 04:21:54,203 ERROR [http-nio-8080-exec-7] [core.persistence.hibernate.HibernateObjectDao] Unable to index object: page: third page v.1 (854290) -- deleted object would be re-saved by cascade (remove deleted object from associations): [com.atlassian.confluence.pages.Page#854290]
javax.persistence.EntityNotFoundException: deleted object would be re-saved by cascade (remove deleted object from associations): [com.atlassian.confluence.pages.Page#854290]
	/// LONG STACKTRACE GOES HERE //

A similar, but distinct error occurs when trying to delete a space using com.atlassian.confluence.spaces.DefaultSpaceManager#removeSpace(com.atlassian.confluence.spaces.Space).

org.springframework.dao.DataIntegrityViolationException: Hibernate operation: could not execute statement; sql=n/a; Referential integrity constraint violation: "FKLMWEU06NFT59G7MW1I1MYORYS: PUBLIC.CONTENT FOREIGN KEY(SPACEID) REFERENCES PUBLIC.SPACES(SPACEID) (884981)"; SQL statement:
delete from SPACES where SPACEID=? [23503-200]; nested exception is org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Referential integrity constraint violation: "FKLMWEU06NFT59G7MW1I1MYORYS: PUBLIC.CONTENT FOREIGN KEY(SPACEID) REFERENCES PUBLIC.SPACES(SPACEID) (884981)"; SQL statement:
delete from SPACES where SPACEID=? [23503-200]
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:459)
	/// LONG STACKTRACE GOES HERE //
Caused by: org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Referential integrity constraint violation: "FKLMWEU06NFT59G7MW1I1MYORYS: PUBLIC.CONTENT FOREIGN KEY(SPACEID) REFERENCES PUBLIC.SPACES(SPACEID) (884981)"; SQL statement:
delete from SPACES where SPACEID=? [23503-200]

Any tips welcome. cc the Wise @Javan.

I’ve attached files with a fuller stack trace.

full_page_delete_stacktrace.txt (47.5 KB)
full_space_delete_stacktrace.txt (42.6 KB)

Javan · September 8, 2021, 5:14am

Hey Jonny!

We haven’t actually released that Hibernate change yet We’re working on getting it just right™ so the upgrade won’t be making the 7.14 release.

In terms of the issue you’re having, I’ve reached out to my team to give me a hand digging into the nitty gritty with this one. I’ll post back here once I have some updates!

patricia · September 8, 2021, 9:56am

Hi @Javan ,

Does this mean that the information in Preparing for Confluence 7.14 | Confluence Data Center and Server 8.6 | Atlassian Documentation is not correct?

We are targeting Confluence 7.14 for this upgrade.

Thanks

jcarter · September 8, 2021, 3:17pm

Just taking a shot in the dark here, but could it be that the confluence-plugin-test-resources jar needs to be updated? We rely on that to create a template home directory in our build.

I noticed that part of 7.14 is a change from C3P0 to Hikari for connection pool management. Yet the confluence.cfg.xml file in the [confluence-plugin-test-resource|https://packages.atlassian.com/maven-closedsource-local/com/atlassian/confluence/plugins/confluence-plugin-test-resources/7.14.0-beta1/] zip & jar files still has a bunch of c3p0 configuration properties.

I imagine those get ignored if Hikari is in action, but perhaps we need analagous properties set?

The properties in question are:

    <property name="hibernate.c3p0.acquire_increment">1</property>
    <property name="hibernate.c3p0.idle_test_period">100</property>
    <property name="hibernate.c3p0.max_size">60</property>
    <property name="hibernate.c3p0.max_statements">0</property>
    <property name="hibernate.c3p0.min_size">20</property>
    <property name="hibernate.c3p0.timeout">30</property>

Like I said, purely a shot in the dark, but I was looking through the completed changes listed on the preparing for… document and that seemed like the most likely bit of the upgrade notes to have an effect.

UPDATE: I tried adding the recommended hikari XML properties to confluence.cfg.xml and restarting Confluence afresh. Unfortunately, I still get the above errors. So I don’t think it’s the Hikari properties in confluence.cfg.xml causing the problem.

rrobins · September 8, 2021, 11:00pm

I’ve confirmed that the hibernate upgrade won’t make it for 7.14 and have updated the information on the preparing for 7.14 page. Sorry for the delay.

patricia · September 9, 2021, 8:19am

Hi @rrobins , thanks for confirming. However, I still see this change targeted for 7.14 in Preparing for Confluence 7.14 | Confluence Data Center and Server 8.0 | Atlassian Documentation

So if is not going to be done in 7.14, could we assume it will be targeted for 7.15?

ttranminh · September 9, 2021, 11:37am

Hi @jcarter,
We did make some improvements on how a space is removed in 7.14. The improvement is that we now delete the pages in the space in batches (each batch is executed in a separate transaction). And the removeSpace method is also executed in a new transaction as well. However, it seems that the 1st error occurred while the test was deleting a page. If possible, would you please post a sample test that clean up itself? Did the 2 errors occur when executing the same test or different tests?

Tam

jcarter · September 10, 2021, 7:44pm

@ttranminh - Thanks! I think I know what’s happening.

And the removeSpace method is also executed in a new transaction as well.
If possible, would you please post a sample test that clean up itself?

Sometimes all it really takes is the right question. This was exactly the insight I needed to suss out what’s going on! I tried to create a minimal test that would reproduce the error. In doing that, I discovered that our test fixtures were cleaning up the space first, then the pages. Because space cleanup is happening in a different transaction, we were getting the above hibernate transaction errors. I’m not certain this will resolve all the problems, but it gives me some direction to poke at.

jcarter · September 10, 2021, 8:34pm

Okay, update for @ttranminh. Individual tests are passing now, but when I run multiple tests in succession, each of which creates & deletes a space as part of its operation, I get intermittent errors like the second one in my original post.

org.springframework.dao.DataIntegrityViolationException: Hibernate operation: could not execute statement; sql=n/a; Referential integrity constraint violation: "FKLMWEU06NFT59G7MW1I1MYORYS: PUBLIC.CONTENT FOREIGN KEY(SPACEID) REFERENCES PUBLIC.SPACES(SPACEID) (884981)"; SQL statement:
delete from SPACES where SPACEID=? [23503-200]; nested exception is org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Referential integrity constraint violation: "FKLMWEU06NFT59G7MW1I1MYORYS: PUBLIC.CONTENT FOREIGN KEY(SPACEID) REFERENCES PUBLIC.SPACES(SPACEID) (884981)"; SQL statement:
delete from SPACES where SPACEID=? [23503-200]
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:459)
	/// LONG STACKTRACE GOES HERE //
Caused by: org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Referential integrity constraint violation: "FKLMWEU06NFT59G7MW1I1MYORYS: PUBLIC.CONTENT FOREIGN KEY(SPACEID) REFERENCES PUBLIC.SPACES(SPACEID) (884981)"; SQL statement:
delete from SPACES where SPACEID=? [23503-200]

I haven’t quite come up with reliable steps to reproduce the problem; it doesn’t appear to be one test in particular that is causing the problem. Sometimes the tests can create & delete spaces, but at some point they start to error out. It seems like once it happens, the Confluence environment is a bit borked and I can’t delete the spaces via simple scripts like this one:

import com.atlassian.confluence.spaces.SpaceManager
import com.atlassian.sal.api.component.ComponentLocator

def spaceManager = ComponentLocator.getComponent(SpaceManager)
spaceManager.getAllSpaces().each {space ->
    if (space.key.startsWith('SPCFXTR')) { // SPCFXTR is the default key prefix our fixture uses when creating spaces for tests
        log.debug "Removing space ${space.displayTitle} (${space.key})"
        spaceManager.removeSpace(space)
    }
}

After a certain point, I managed to get to a state where I couldn’t even look at the Space Directory without an error. I’ve attached the stacktrace from this.
space_directory_error.txt (32.5 KB)

CDAC won’t let me upload the space export for one of the un-deletable spaces, but I’ve got it if you want it.

rrobins · September 12, 2021, 11:27pm

My apologies @patricia, I missed updating the reference to 7.14 in the Preparing for 7.14 page. So sorry for causing confusion. I updated the page to indicate it’s now targeting 7.15.

ttranminh · September 13, 2021, 12:29am

Hi @jcarter,
Based on the error in your post, it seems that the content in that space was referenced in another transaction (maybe your test created some contents and those contents were still being processed (usually in an index queue) while the space was deleted. There could be 2 workarounds (just my speculation though) to make sure the space gets deleted successfully:

Retry the space deletion after a short period.
Ensure no other threads are processing the space content (i.e. flush the index queue before removing the space).

Hope these suggestions would help!
Cheers,
Tam

jcarter · September 15, 2021, 10:44pm

So, I’ve been looking at this in detail over the past couple of days. I have narrowed it down to one test of our REST API that causes the problem. Once that test fails to delete its test space, a bunch of other tests fail.

What, you might wonder, does this problematic test do?

It calls a REST Endpoint. One of ours. Is that endpoint holding any state? No, it doesn’t appear to be. What about the service the endpoint calls? Nope, and plenty of comparable integration tests talk to that service without equivalent failures.

I’m more than a bit puzzled. I can confirm that the endpoint works normally, so I’m not exactly worried about the release. I can ignore the test in CI for confluence 7.14 until I can find a more fundamental solution. At the same time, I can’t help but think that the problem may be with Confluence and not with ScriptRunner.

To help me debug this, I setup a couple of simple event listeners on the SpaceWillRemoveEvent and SpaceRemoveEvent:

log.debug "Space will be removed ${event.space.key}"
log.debug "Progress meter: ${event.progressMeter.status} ${event.progressMeter.percentageComplete} ${event.progressMeter.currentCount} ${event.progressMeter.total}"

On the failing test, only the first one fires during test cleanup. Here’s the output in my atlassian-confluence.log file, along with the error that follows:

2021-09-15 16:54:15,575 DEBUG [http-nio-8080-exec-3] [onresolve.scriptrunner.runner.ScriptBindingsManager] Space will be removed SPCFXTR1751493783
2021-09-15 16:54:15,576 DEBUG [http-nio-8080-exec-3] [onresolve.scriptrunner.runner.ScriptBindingsManager] Progress meter: Deleting custom content from space 2 0 0
2021-09-15 16:54:15,604 WARN [http-nio-8080-exec-3] [confluence.impl.hibernate.ConfluenceHibernateTransactionManager] Commit failed. Rolling back. Error: Hibernate operation: could not execute statement; sql=n/a; Referential integrity constraint violation: "FKLMWEU06NFT59G7MW1I1MYORYS: PUBLIC.CONTENT FOREIGN KEY(SPACEID) REFERENCES PUBLIC.SPACES(SPACEID) (917530)"; SQL statement:
delete from SPACES where SPACEID=? [23503-200]; nested exception is org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Referential integrity constraint violation: "FKLMWEU06NFT59G7MW1I1MYORYS: PUBLIC.CONTENT FOREIGN KEY(SPACEID) REFERENCES PUBLIC.SPACES(SPACEID) (917530)"; SQL statement:
delete from SPACES where SPACEID=? [23503-200]
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:459)
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
	at org.h2.message.DbException.get(DbException.java:205)
	at org.h2.message.DbException.get(DbException.java:181)
	at org.h2.constraint.ConstraintReferential.checkRow(ConstraintReferential.java:373)
	at org.h2.constraint.ConstraintReferential.checkRowRefTable(ConstraintReferential.java:390)
	at org.h2.constraint.ConstraintReferential.checkRow(ConstraintReferential.java:265)
	at org.h2.table.Table.fireConstraints(Table.java:1057)
	at org.h2.table.Table.fireAfterRow(Table.java:1075)
	at org.h2.command.dml.Delete.update(Delete.java:153)
	at org.h2.command.CommandContainer.update(CommandContainer.java:198)
	at org.h2.command.Command.executeUpdate(Command.java:251)
	at org.h2.server.TcpServerThread.process(TcpServerThread.java:406)
	at org.h2.server.TcpServerThread.run(TcpServerThread.java:183)
	at java.lang.Thread.run(Thread.java:748)

 -- url: /confluence/plugins/servlet/scriptrunner/remote-control | traceId: 47e933eccafe744e | userName: admin
2021-09-15 16:54:15,604 WARN [http-nio-8080-exec-3] [confluence.impl.hibernate.ConfluenceHibernateTransactionManager] Performing rollback. Transactions:
    [com.atlassian.confluence.spaces.DefaultSpaceManager.removeSpace]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #1458418868)
        [null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #220329135)
            [pluginRequires_New_Read_WriteTx]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #1204409187)
            [null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #1871825828)
            [null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #409590822)
            [null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #287629328)
      ->[null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #2018796876)
 -- url: /confluence/plugins/servlet/scriptrunner/remote-control | traceId: 47e933eccafe744e | userName: admin
2021-09-15 16:54:15,605 WARN [http-nio-8080-exec-3] [confluence.impl.hibernate.ConfluenceHibernateTransactionManager] Performing rollback. Transactions:
  ->[com.atlassian.confluence.spaces.DefaultSpaceManager.removeSpace]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #1458418868)
        [null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #220329135)
            [pluginRequires_New_Read_WriteTx]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #1204409187)
            [null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #1871825828)
            [null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #409590822)
            [null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #287629328)
        [null]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT (Session #2018796876)
2021-09-15 16:54:15,583 ERROR [http-nio-8080-exec-3] [atlassian.confluence.pages.DefaultPageManager] Exception occurred while deleting page with ID 884820: No row with the given identifier exists: [com.atlassian.confluence.pages.Page#884821]
 -- url: /confluence/plugins/servlet/scriptrunner/remote-control | traceId: 47e933eccafe744e | userName: admin
2021-09-15 16:54:15,600 ERROR [http-nio-8080-exec-3] [engine.jdbc.spi.SqlExceptionHelper] Referential integrity constraint violation: "FKLMWEU06NFT59G7MW1I1MYORYS: PUBLIC.CONTENT FOREIGN KEY(SPACEID) REFERENCES PUBLIC.SPACES(SPACEID) (917530)"; SQL statement:
delete from SPACES where SPACEID=? [23503-200]
 -- url: /confluence/plugins/servlet/scriptrunner/remote-control | traceId: 47e933eccafe744e | userName: admin
2021-09-15 16:54:15,601 ERROR [http-nio-8080-exec-3] [org.hibernate.internal.ExceptionMapperStandardImpl] HHH000346: Error during managed flush [org.hibernate.exception.ConstraintViolationException: could not execute statement]
 -- url: /confluence/plugins/servlet/scriptrunner/remote-control | traceId: 47e933eccafe744e | userName: admin

I don’t have the source for the 7.14.0-beta1 version, but I can guess from the 7.13 source and the class files from the jars in the webapp that this log means the com.atlassian.confluence.content.CustomContentSpaceRemovalListener has fired okay. I guess the question is what’s happening at lines 269 and 214 in the DefaultSpaceManager nowadays, and why might it fail as described above. I don’t doubt that separating the deletion of a space into separate transactions is the underlying cause.

BTW, @ttranminh , I tried your suggestion of flushing the index queue before removing the space. No dice. Same error.

I’m open to any ideas, of course. I’d also be happy to hop on a call and talk code in detail. You know where to find me.

EDIT: As a further aside, here’s the database cleanup I’ve had to do after the test fails leaving behind some detritus.

delete from public.NOTIFICATIONS where CONTENTID in (select CONTENTID from public.content where SPACEID = :deadSpaceId)
delete from public.CONTENTPROPERTIES where CONTENTID in (select CONTENTID from public.content where SPACEID = :deadSpaceId)
delete from public.BODYCONTENT where CONTENTID in (select CONTENTID from public.content where SPACEID = :deadSpaceId)
delete from public.content where SPACEID = :deadSpaceId

This is true even if I start a new thread after the first attempt to delete the dead space fails. So simply retrying the space deletion after a short period isn’t a solution. The space deletion partly happens, then fails midway through, leaving behind some junk that causes all future space deletion operations to fail.

jcarter · September 16, 2021, 9:00pm

Okay, final update: I have it!

The final problems I was having, above, were because my fixture for creating pages wasn’t clearing things out properly before the SpaceManager removed the space. That’s what was leading to that final hibernate error: a page that was still in the session somewhere would have been re-saved during space deletion. I believe that the reason this one test triggered it is because it creates some parent/child relationships between pages as part of its execution.

I’ve changed our PageFixture from doing this at cleanup:

createdPages.reverse().each {
    def page = pageManager.getPage(it.id)
    if (page) {
        pageManager.getVersionHistorySummaries(page).each {
            pageManager.refreshPage(pageManager.getPage(it.id))
        }
        contentPermissionManager.setContentPermissions([:], page)
        pageManager.removeContentEntity(page)
    }
}
createdPages.clear()

To doing this:

createdPages.reverse().each {
    def page = pageManager.getPage(it.id)
    if (page) {
        pageManager.getVersionHistorySummaries(page).each {
            pageManager.refreshPage(pageManager.getPage(it.id))
        }
        pageManager.trashPage(page, DefaultDeleteContext.SUPPRESS_NOTIFICATIONS)
        trashManager.purge(page.spaceKey, page.id)
    }
}
createdPages.clear()

Thanks for all the help @ttranminh & @Javan, and for reading my brain dumps.

ttranminh · September 20, 2021, 12:10pm

Hi @jcarter ,
I am glad that you have found the root cause!