Request for Improvement: Too little semantic markup makes automated tests unnecessarily hard

Dear Confluence DC team,

Over the last few months I’ve noticed a trend in new UIs for Confluence DC that annoys me: more and more of the new UIs you release have almost no semantic markup. They look nice, but when you look at the DOM, there’s no way to tell what is what.

In the “good old times”, we had CSS classnames and ids and semantic elements like BUTTONs with type=submit.

Why is that important? Test automation and automated scripting. We use Playwright for automated E2E testing of our app and we also use Playwright scripts to configure fresh Confluence instances (the Playwright script basically clicks through the setup assistant, disables unwanted UPM apps, imports spaces and the like).

An example to show what I mean

The new space import assistant (we first encountered it in version 9.0.0-rc2, but it might have been around longer) is a great example of the problem. Let me show you some examples:

This is the success screen. There’s a green “Finished” status label in it. It would be nice to check that it exists in our automated import script to make sure everything worked. Shouldn’t be too hard, I mean it looks like it has something like class="success", doesn’t it?

Wrong! The highlighted DOM element below is the status item. There’s nothing about it except the localized label to find out what’s going on. There’s also nothing else meaningful on the whole page (anything with a class like “success” or the like) that would allow a non-human to interpret the result:

One more example from the same new UI module: After uploading the ZIP file, a modal dialog appears. We obviously want to click the blue button:

But in the DOM, the two buttons are EXACTLY the same, except for the localized labels (which we should never use because… localization) and the color, which is defined deep down in anonymous auto-generated CSS classes. Our only way out: selecting via the order of the buttons, hoping nobody moves them or changes the markup: SECTION[role=dialog] BUTTON:nth-child(2).
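
To make the fragility concrete, here is a minimal sketch using a toy model of the dialog footer. It is not Confluence’s real markup: the `testId` values and the “Abbrechen” label are made up for illustration. It only shows why a position-based lookup silently picks the wrong button once the order changes, while a lookup by a stable attribute would not.

```typescript
// Toy model of the two dialog buttons; NOT Confluence's real markup.
interface ButtonModel {
  label: string;   // localized, changes with the UI language
  testId?: string; // hypothetical stable hook, not present in the real dialog
}

// Mirrors BUTTON:nth-child(2): CSS positions are 1-based.
function pickByPosition(buttons: ButtonModel[], position: number): ButtonModel | undefined {
  return buttons[position - 1];
}

// Mirrors a lookup by a stable, non-localized attribute.
function pickByTestId(buttons: ButtonModel[], testId: string): ButtonModel | undefined {
  return buttons.find((button) => button.testId === testId);
}

// Illustrative labels and ids (assumptions).
const today: ButtonModel[] = [
  { label: "Abbrechen", testId: "import-cancel" },
  { label: "Bestätigen", testId: "import-confirm" },
];

// The same dialog after a redesign that swaps the two buttons.
const afterRedesign: ButtonModel[] = [today[1], today[0]];

console.log(pickByPosition(today, 2)?.label);                      // Bestätigen
console.log(pickByPosition(afterRedesign, 2)?.label);              // Abbrechen (wrong button)
console.log(pickByTestId(afterRedesign, "import-confirm")?.label); // Bestätigen
```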

What’s the point?

As these examples show, this sloppiness in creating the UI seriously weakens automated test cases for everybody. Many selectors for elements are weak selectors, because we need to rely on order, hierarchy or (shudder) even the localized labels. This makes automated tests break more often than necessary, reducing the assured quality of Marketplace Apps and therefore the whole Atlassian ecosystem.

Please return to a practice of sprinkling meaningful IDs, roles and classnames in your UIs. Maybe let your engineers spend a day writing automated tests for their UIs so they feel the pain.

I understand that there’s a lot in your backlog, we’ve all been there. All sympathies, but this would really help us out. Thanks for considering!

Dear community, do you have the same problem? Or are we the only ones using stuff like Playwright? Or are we missing something?

Hi @SteffenMueller ,

Thanks for the feedback.

I understand the desire for a readable markup and stable selectors that could be used as a test API.
As we try to improve Confluence DC accessibility, I can offer an alternative way to write tests. Frameworks like Playwright go to great lengths to encourage good accessibility practices and to create more user-focused tests.

I would argue that using localised text in tests is useful, as that is what the user reads or hears. On the other hand, using arbitrary markup selectors like id or class makes the test use different information than the user. We have used them in many of our E2E test examples in the past and still do in most of our page objects, however we strive to use role and text based selectors instead.

In both cases that you described, Confluence still has much to improve, but an ideal test would use selectors like getByRole('dialog').getByRole('button', { name: 'Bestätigen' }) or page.getByLabel('Status') (this is where Confluence needs to get better and use a semantic HTML tag like <output> for the value). I agree that relying on the position or order of elements is prone to breaking during upgrades. While translations may change too, to me that seems like a legitimate issue for the test to pick up, because we may need to update documentation or other resources based on it; it impacts users equally.
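
A minimal sketch of how the two selectors mentioned above might be wrapped in helpers. The `PageLike` and `LocatorLike` interfaces are a hand-written subset of Playwright’s `Page` and `Locator`, declared locally so the snippet compiles on its own; in a real test you would pass Playwright’s `page` instead. The helper names are made up, and `getByLabel('Status')` only works once Confluence actually exposes such a label, which, as noted, it does not yet do reliably.

```typescript
// Hand-written subset of Playwright's Locator API used by this sketch.
interface LocatorLike {
  getByRole(role: string, options?: { name?: string }): LocatorLike;
  getByLabel(text: string): LocatorLike;
  click(): Promise<void>;
  textContent(): Promise<string | null>;
}

// Hand-written subset of Playwright's Page API used by this sketch.
interface PageLike {
  getByRole(role: string, options?: { name?: string }): LocatorLike;
  getByLabel(text: string): LocatorLike;
}

// Clicks the confirm button inside the modal dialog, located by role and
// accessible name. The label is passed in because it is localized.
async function confirmImport(page: PageLike, confirmLabel: string): Promise<void> {
  await page.getByRole("dialog").getByRole("button", { name: confirmLabel }).click();
}

// Reads the import status through its label. Assumes (hypothetically) that
// the status value is associated with a "Status" label.
async function readImportStatus(page: PageLike): Promise<string | null> {
  return page.getByLabel("Status").textContent();
}
```

With Playwright this would be called as `await confirmImport(page, 'Bestätigen')`; the locale-dependent label is exactly the part discussed further down in this thread.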

Hope that helps!

Dear @jhronik,

thank you for responding and understanding. There’s just one detail I disagree with that I’d like to share:

My case for tests interacting with Confluence is NOT validating the Confluence UI. I want to get through the interactions with Confluence as the host application as quickly and reliably as possible and reach the UI of our own app for the real testing. There, I’m more open to “testing what the user really sees”. But for the host application, I don’t want to maintain a list of potential button labels (we want to test our app in multiple locales) and update it whenever the Confluence localization team changes labels.

So, I think that Confluence as a host application has an additional role in E2E tests where “stable and pain-free” is more valuable than “accurate user perspective tests”, at least in my opinion.

But for the other points, I’m looking forward to seeing more semantic tags, thank you!

Somewhat in the same category: we do E2E tests for our Cloud products. What breaks over and over are “random new tutorial steps” that suddenly appear in new places and break the tools.

I’m not sure there is a good solution for that. We would really love to have a “technical” login, where we get the least amount of “interactivity”. However, that is a whole new feature, so it is unlikely to happen =).

That’s something I’ve also observed for a while when applying JS / CSS for little hacks here and there. I think it was Crowd first, when they switched some UI to React, and now more and more Confluence interfaces are switching to React.

getByRole() is something that comes from UI testing libraries, correct? So it can’t be used to, e.g., apply some custom CSS to Confluence? How do we deal with those cases?

Right, I understand what you mean.
We use React and Atlaskit in most new UIs, and the tooling is set up to minify the generated markup, so it doesn’t leave a lot of readable tags or classes.
We’ve got several methods available but I admit our developer documentation is far from comprehensive (unlike the user documentation that I find really detailed thanks to our amazing content designers). Your feedback is really valuable to point out where we need to focus, so please keep posting it!
Let me just provide some hints, hopefully they’ll help solve at least some of the pain points.

There are Java Selenium page objects available, e.g. ConfluenceLoginPage (Atlassian Confluence 8.9.0 API).
Unfortunately we don’t publish matching Playwright page objects yet.

The deprecated login method using unencrypted URL parameters can be re-activated in a test environment by setting the atlassian.allow.insecure.url.parameter.login system property in Confluence at startup, e.g. the way AMPS does it:
https://bitbucket.org/atlassian/amps/src/e9c79fc20ba376fa0e4ea1367b53b2e0e01ef44a/amps-maven-plugin/src/main/java/com/atlassian/maven/plugins/amps/AbstractProductHandlerMojo.java?at=master#AbstractProductHandlerMojo.java-551
However, I stress that this is strictly for test purposes, as using it in production would compromise instance security.

Last, the Benefits modal dialog can be hidden in tests by adding the benefits.modal.disable dark feature, e.g. by starting Confluence with the -Datlassian.darkfeature.benefits.modal.disable=true system property.
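
For anyone scripting their test instance startup, the two properties above could be collected in one place, roughly like this. This is only a sketch: the helper name is made up, and the value `true` for the login property is an assumption (the post above only names the property). How the resulting arguments reach the JVM depends on how your setup starts Confluence.

```typescript
// System properties named in the post above. TEST ENVIRONMENTS ONLY:
// the insecure login property must never be set in production.
const TEST_ONLY_SYSTEM_PROPERTIES: Record<string, string> = {
  // Value "true" is an assumption; the post only names the property.
  "atlassian.allow.insecure.url.parameter.login": "true",
  // Hides the Benefits modal dialog (dark feature benefits.modal.disable).
  "atlassian.darkfeature.benefits.modal.disable": "true",
};

// Hypothetical helper: turns a property map into -Dkey=value JVM arguments.
function toJvmArgs(properties: Record<string, string>): string[] {
  return Object.entries(properties).map(([key, value]) => `-D${key}=${value}`);
}

console.log(toJvmArgs(TEST_ONLY_SYSTEM_PROPERTIES).join(" "));
```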

As this would be useful both in DC and Cloud, I would add my experience as well.

We use xpaths to stay driver-agnostic, so we can easily switch driver providers. Xpaths also make it easy to iterate over elements belonging to the same “group”, to join selectors as strings, and to do direct string substitution.
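
A minimal sketch of what string-based xpath composition can look like. All helper names are made up for illustration, and the `aria-label` attribute is an assumption about the markup, not something Atlassian guarantees.

```typescript
// Quotes a value as an XPath 1.0 string literal. XPath 1.0 has no escape
// character, so values containing both quote kinds need concat().
function xpathLiteral(value: string): string {
  if (!value.includes('"')) return `"${value}"`;
  if (!value.includes("'")) return `'${value}'`;
  const parts = value.split('"').map((part) => `"${part}"`);
  return `concat(${parts.join(`, '"', `)})`;
}

// Direct string substitution: one template, many fields.
// The aria-label attribute is a hypothetical hook.
function fieldButton(fieldName: string): string {
  return `//button[@aria-label=${xpathLiteral(fieldName)}]`;
}

// Iteration over a "group": the n-th match of any xpath (1-based).
function nthInGroup(groupXpath: string, index: number): string {
  return `(${groupXpath})[${index}]`;
}

// Joining: matches if any of the given xpaths matches.
function anyOf(...xpaths: string[]): string {
  return xpaths.join(" | ");
}

console.log(fieldButton("Assignee"));                    // //button[@aria-label="Assignee"]
console.log(nthInGroup("//ul[@role='listbox']/li", 3));  // (//ul[@role='listbox']/li)[3]
console.log(anyOf(fieldButton("Assignee"), "//div[@id='assignee']//button"));
```

Because the result is a plain string, it can be handed to any driver that accepts xpath selectors.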

Role and text based selectors are not flexible enough for advanced scenarios.

Testing cloud apps is complicated, because you deliver one universal product and we continuously test in production. That is acceptable, but it would be nice to have more flexibility in writing tests, more confidence in selectors and less flakiness.

In the end, a consistent Atlassian effort here does not seem like a big cost, but it would benefit developers.

It would be nice if Atlassian had some kind of internal standard of at least providing something like data-testid="foo". In reality, though, I don’t expect this to happen.
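
If such attributes existed, using them would be short. The sketch below assumes a made-up test id, `space-import-confirm`; Atlassian does not provide it. The `TestIdPage` interface is a hand-written subset of Playwright’s `Page`, which offers `getByTestId()` and looks at the `data-testid` attribute by default. The plain CSS form would also cover the custom CSS / JS case asked about earlier in the thread.

```typescript
// Hand-written subset of Playwright's Page API used by this sketch.
interface TestIdPage {
  getByTestId(testId: string): { click(): Promise<void> };
}

// Hypothetical id; Atlassian does not provide this attribute today.
const IMPORT_CONFIRM_TEST_ID = "space-import-confirm";

// In a test: locate by test id, independent of locale, order and styling.
async function clickImportConfirm(page: TestIdPage): Promise<void> {
  await page.getByTestId(IMPORT_CONFIRM_TEST_ID).click();
}

// Outside of a test framework: the same hook as a CSS attribute selector.
// Assumes simple ids without quotes or backslashes.
function testIdCss(testId: string): string {
  return `[data-testid="${testId}"]`;
}

console.log(testIdCss(IMPORT_CONFIRM_TEST_ID)); // [data-testid="space-import-confirm"]
```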

In general, for testing our app, we use role/name selectors, in line with how the user would interact with the app. I think this is far less brittle, and if Atlassian keeps up with its accessibility standards, there should always be an accessible option available.

We only run our tests in English. I can’t really see any decent ROI in running end-to-end tests for different languages, but that’s personal preference.

A global system property to disable all popups/onboarding consistently would also be nice, but I doubt this is feasible with distributed teams.

YES, I 100% agree with this. This is an ongoing issue on both cloud and data center.

Completely agree with this too.
Data center is less of an issue here, because at least some nag screens can be turned off by disabling system addons.
The sheer multitude of nag popups on cloud however is a serious problem that makes e2e testing a nightmare. (I’d argue that it’s also extremely annoying even as an end-user, but that’s a different story…)

I also wanted to add my voice in favor of having an option to disable all onboarding/help popups, so that there are fewer distractions in end-to-end tests. Would be much appreciated.

Unfortunately I need to add more to this ticket, because I see the situation getting worse, not better. My dream is that Atlassian would acknowledge this issue and introduce some process improvements, which would benefit the audience.

As of today, the “Assignee” button has a different xpath for a Task than for a Bug in Jira Cloud. That means UI changes were rolled out partially, affecting only one issue type. If the same element can have different xpaths, that has a very strong effect on the maintainability of our tests. I also doubt that rolling out changes this way is desirable for anyone.

As explained in my previous post, xpaths are crucial for advanced testing scenarios.
