Get content Confluence REST API is not working properly with non-ASCII characters in query parameters

Hello! I have a question regarding the Confluence REST endpoints. I want to use a Confluence REST endpoint to retrieve the ID of a piece of content from its space key, title, and type. The request I am making looks like this:

URI uri = UriComponentsBuilder.fromPath("/rest/api/content")
                    .queryParam("type", contentType)
                    .queryParam("spaceKey", spaceKey)
                    .queryParam("title", contentTitle)
                    .build().toUri();
atlassianHostRestClients.authenticatedAsAddon().getForObject(uri, Object.class);

Whenever ‘contentTitle’ contains non-ASCII characters such as German umlauts or the euro sign, the request fails with 400 Bad Request (without any additional information).
The same request works fine when I try it in the browser.

I am using the Spring Boot Atlassian Connect framework.

The same problem occurs with the “rest/api/content/search?cql” endpoint. Has anyone experienced a similar issue? Is there a workaround?

Thank you in advance!

I had a look at the docs of the method you are using. It says:

Note: encoding, if applied, will only encode characters that are illegal in a query parameter name or value such as "=" or "&" . All others that are legal as per syntax rules in RFC 3986 are not encoded. This includes "+" which sometimes needs to be encoded to avoid its interpretation as an encoded space. Stricter encoding may be applied by using a URI template variable along with stricter encoding on variable values. For more details please read the “URI Encoding” section of the Spring Framework reference.

You probably need to do something like this:

                    .queryParam("title", URLEncoder.encode(contentTitle, "UTF-8"))

But to be honest, the way that Spring does URL encoding is pretty horrible (and in some cases even dangerous), so I would recommend using another library or constructing the URLs by hand (encoding every query parameter, including the space key and content type).
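A minimal sketch of the "by hand" approach, using only the JDK (Java 10 or later for the Charset overload of URLEncoder.encode). The class and method names are hypothetical. Note that URLEncoder implements form encoding and writes a space as "+", so the sketch rewrites that to "%20". The resulting string is already encoded, so it must not be passed through another encoding step afterwards.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Spring or the Atlassian Connect framework.
final class ManualQueryBuilder {

    // Percent-encodes one query component as UTF-8.
    // URLEncoder emits "+" for a space (a literal "+" becomes "%2B"),
    // so every "+" in its output can safely be rewritten to "%20".
    static String encodeComponent(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Joins the encoded name=value pairs and appends them to the path.
    static String buildUrl(String path, Map<String, String> params) {
        String query = params.entrySet().stream()
                .map(e -> encodeComponent(e.getKey()) + "=" + encodeComponent(e.getValue()))
                .collect(Collectors.joining("&"));
        return query.isEmpty() ? path : path + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("type", "page");
        params.put("spaceKey", "MS");
        // "T\u00e4st it\u20acm" is the title with a-umlaut and the euro sign,
        // written with escapes so the source file encoding does not matter.
        params.put("title", "T\u00e4st it\u20acm");
        // prints /rest/api/content?type=page&spaceKey=MS&title=T%C3%A4st%20it%E2%82%ACm
        System.out.println(buildUrl("/rest/api/content", params));
    }
}
```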

@candid The solution you suggested does not do the encoding correctly. In fact the result is a double encoded title. The URL I get does not return the right results even when I try it in the browser.
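The double encoding can be reproduced without Spring. Once a value is percent-encoded, any further encoding pass escapes each "%" as "%25", and the server then sees a different title. The sketch below uses URLEncoder for both passes purely as a stand-in; Spring's own second pass differs in detail (for example in how it treats "+"), and the class name is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; URLEncoder stands in for any encoder applied twice.
final class DoubleEncodingDemo {

    // One form-encoding pass over the value, using UTF-8.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String title = "T\u00e4st it\u20acm"; // a-umlaut and euro sign

        String once = encode(title);
        String twice = encode(once);

        // prints T%C3%A4st+it%E2%82%ACm
        System.out.println(once);
        // prints T%25C3%25A4st%2Bit%25E2%2582%25ACm
        System.out.println(twice);
    }
}
```

If the URL in the logs contains "%25C3" where "%C3" was expected, the value has been encoded twice.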

I just tried it for a page with an umlaut in the title and it worked to access it in the browser. So I suspect that the problem lies somewhere else.

Would you mind sharing the exact URL that you are using? Also, try from the dev console to fetch it like this: fetch(url).then((res) => res.json()).then((res) => { console.log(res); }).catch((err) => { console.error(err); }). This has sometimes shown an error message to me that I was otherwise not able to see (not sure why).

I’m also wondering what kind of content type you are trying to fetch, and in case it is custom content, whether you have set the preventDuplicateTitle option on the custom content module definition.

The request I am doing with the hardcoded values would look like this:

URI uri = UriComponentsBuilder.fromPath("/rest/api/content")
                    .queryParam("type", "page")
                    .queryParam("spaceKey", "MS")
                    .queryParam("title", "Täst it€m")
                    .build().toUri();
atlassianHostRestClients.authenticatedAsAddon().getForObject(uri, Object.class); 

A duplicate title is not the issue: my Confluence instance has only one page with that title.
The encoding you suggested earlier does double-encode the title, so I have to call .build(true) to stop the builder from encoding the already encoded query parameter. Even then I get the same error: 400 Bad Request.
I have also tried the single-argument java.net.URI constructor, passing the fully encoded URL that works in the browser. Still the same error: 400 Bad Request.
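For reference, a small JDK-only sketch of how java.net.URI treats such a URL, which can help confirm that the URI object itself carries the expected encoding before it reaches the REST client. The class name and the relative URL are taken from the example in this thread or are hypothetical; nothing here touches the Atlassian Connect framework.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical inspection helper using only java.net.URI.
final class UriInspection {

    static final String ENCODED =
            "/rest/api/content?type=page&spaceKey=MS&title=T%C3%A4st%20it%E2%82%ACm";

    // The multi-argument constructor quotes illegal characters (such as the
    // space) itself, so it expects the raw, unencoded query.
    static URI fromRawParts(String path, String rawQuery) throws URISyntaxException {
        return new URI(null, null, path, rawQuery, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        // The single-argument form parses the string as given; nothing is re-encoded.
        URI parsed = URI.create(ENCODED);
        System.out.println(parsed.getRawQuery()); // the query exactly as written
        System.out.println(parsed.getQuery());    // the query with escapes decoded as UTF-8

        // Building from unencoded parts; toASCIIString() escapes the non-ASCII characters.
        URI built = fromRawParts("/rest/api/content",
                "type=page&spaceKey=MS&title=T\u00e4st it\u20acm");
        System.out.println(built.toASCIIString());
    }
}
```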

Ah, I misunderstood your message earlier. So the request does work in the browser, but it does not work in the app.

Since you say that it works with non-umlaut page titles, I guess it cannot be an authentication issue.

In that case I am out of ideas. I don’t have experience with the Spring Boot Atlassian Connect framework. In our code we manually implemented JWT authentication, and we don’t have this problem. So I would guess that it might be an encoding bug in that framework.

I quickly browsed the source code of the framework and discovered something suspicious here, where it uses Charset.defaultCharset() to decode URI parameters. Now I didn’t look into the details when that method is ever called and what it does exactly, so this is just a shot in the dark, but maybe try starting your JVM with -Dfile.encoding=UTF-8 (according to this that sets the default charset).
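To illustrate why the default charset could matter, here is a JDK-only sketch of what happens when UTF-8 percent-escapes are decoded with a different charset. This is not the framework's code, only a hypothetical demonstration of the failure mode; the class name is made up. On Java 18 and later the default charset is already UTF-8 unless overridden.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: the same escaped title decoded with two charsets.
final class CharsetMismatchDemo {

    // Decodes a form-encoded value, interpreting the escaped bytes in the given charset.
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        String encoded = "T%C3%A4st+it%E2%82%ACm"; // UTF-8 escapes for the test title

        System.out.println("default charset: " + Charset.defaultCharset());
        // Decoding as UTF-8 restores the original title.
        System.out.println(decode(encoded, StandardCharsets.UTF_8));
        // Decoding as ISO-8859-1 turns each UTF-8 byte into its own character,
        // so the title no longer matches the one that was sent.
        System.out.println(decode(encoded, StandardCharsets.ISO_8859_1));
    }
}
```

If a component decodes and then re-encodes the query with a non-UTF-8 charset, for instance while computing a request signature, the value it works with would differ from the one Confluence receives.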

@candid I tried starting my JVM with -Dfile.encoding=UTF-8 , but it didn’t solve my issue. Anyway, thank you very much for all your suggestions so far.

Does anyone from the Atlassian team who reads these posts know more about what could be going wrong here?