If you ever want to retrieve one or multiple pages by title using the Confluence REST API and thought that would be a matter of one simple call against the CQL or Content REST API, you have been fooled by a collection of bugs in the REST endpoints of Confluence Cloud and Server. In this post I am investigating how different types of page titles behave when used against the different endpoints and am suggesting a solution for app developers how to fetch pages by titles that also works for eccentric titles, as well as making some suggestions to Atlassians how to make our lives easy again.
Our app provides users the option to create pages. To avoid title conflicts, we are validating the title that they have typed in by querying Confluence for existing pages with the same title. As reported here, there is a bug in the Confluence REST API where page titles are handled as case-sensitive, even though they are not. As a workaround, we have to use the fuzzy search (using the ~
operator).
It seems that there have been some changes to the search API on Cloud, causing our app to crash due to a new type of error: com.atlassian.confluence.api.service.exceptions.BadRequestException: CQL was parsed but the search manager was unable to execute the search. CQL: 'title~\"\\\"\"'. Error message: com.atlassian.confluence.search.searchnsmarts.rest.SSSearchClientException: There was an illegal request passed to S&S Content Search API : HTTP/1.1 400
.
Investigation
I have done some detailed investigation to find out which special characters in page titles behave in which way. I have created a bunch of test pages with a title like Test!page
, where the !
is one of these characters: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
and tried to find them using different ways. Note that there might be other special characters (non-printable or non-ASCII) that cause problems and that I haven’t tested yet. If you come across such a case, please let me know.
- Using
/rest/api/content?spaceKey=TEST&title=Test!page
, I could find all of them correctly. On Cloud this is case-sensitive, while on Server it isn’t. - Using
/rest/api/search?cql=title="Test!page"
, I had the following results:- For
"
(using?cql=title="Test%5C"page"
),Test^page
andTest`page
were returned on Cloud, but notTest"page
. On Server, nothing was returned. - For
\
, it worked when using?cql=title="Test%5C%5Cpage"
. When using?cql=title="Test%5Cpage"
, it returnedTest^page
andTest`page
on Cloud, but notTest\page
, and nothing on Server. - For
^
and`
, bothTest^page
andTest`page
were returned. Escaping it with a backslash didn’t change anything. - All other characters found the right page
- If a page title ends with
\
, I have to escape that as\\\"
, otherwise it fails withEscaped end quote in phrase
. It does not make sense but it seems to work. - Note that on Cloud this is case-insensitive, while on Server it is case-sensitive.
- For
- Using
/rest/api/search?cql=title~"Test%20page"
found all pages exceptTest'page
,Test.page
,Test_page
andTest:page
- Using
/rest/api/search?cql=title~"Test!page"
, I had the following results:- For
!:<>=\
it returned no results - For
"
I received aBadRequestException
(which makes sense, because it should be escaped) - For
()/[]^{}
, I received anSSSearchClientException
on Cloud. On Server, these returned all pages exceptTest'page
,Test.page
andTest:page
. - For
*?
, it returnedTest'page
,Test.page
,Test_page
andTest:page
- For
+,-;@`|~
, it returned all pages exceptTest'page
,Test.page
,Test_page
andTest:page
- For
'._
, it returned the right page - The special case
title~""
ortitle~" "
throws an Internal Server Error
- For
- Using
/rest/api/search?cql=title~"Test%5C!page"
(escaping the character with a\
), I had the following results:- For
"
, it returned no results - For
'.:_
, it returned the right page - For
!#$%&()*+,-/;<=>?@[\]^`{|}~
, it returned all pages exceptTest'page
,Test.page
,Test_page
andTest:page
- When searching for pages that only contain such special characters, for example a page called
:
, Cloud returns all pages from the space, while Server returns no results.
- For
So I have found out the following things about the internals of CQL:
- Searching for a page by its title works, except when the title contains a
"
. But it is case-sensitive on Server. - In a fuzzy search, the characters
*
and?
have a special meaning as documented in the Confluence docs. They can be escaped using a\
. - The characters
'._:
are seen as a regular character, while the characters!"#$%&()*+,-/;<=>?@[\]^`{|}~
are seen as word separators. When used in a fuzzy search query, the latter ones are equal to each other and to a space. - The characters
!:<>=\"()/[]^{}
have a special meaning and have to be escaped using a\
. All other characters can be escaped but don’t have to.
Solution
I have come up with these methods to fetch pages by title that should work for any kind of title. Since I’m getting lost in all these rules myself, it could be that I’ve made a mistake somewhere.
Update: Check the comments below for a more convenient method to fetch pages by title.
Finding a single page by title
- For Server: Use the
/rest/api/content?spaceKey=SPACE&title=Page+title
endpoint. - For Cloud:
- If the page title doesn’t contain a
"
, use the/rest/api/search?cql=space="SPACE"+and+title="Page+title"
endpoint. Escape\
as\\
in the beginning or middle of the title and as\\\"
at the end of the title. Manually filter the results and only accept that with the right title (case-insensitive comparison). Make sure to quote the space key, as some space keys don’t work without quotes. - If the page title contains a
"
but no characters that are relevant for case sensitivity, use the/rest/api/content?spaceKey=SPACE&title=Page+title
REST endpoint. - If the page title contains both
"
and characters that are relevant for case sensitivity, use/rest/api/search?cql=space="SPACE"+and+title~"Page+title"
and look if any page has the title that you are looking for (case-insensitive comparison). In the page titles, replace the characters!<>=\"()/[]^{}*?
with spaces and escape the character:
as\:
. Make sure to fetch all result pages, as some fuzzy searches might return a lot of results.
- If the page title doesn’t contain a
Finding multiple pages by title
- For Cloud:
- For all pages not containing a
"
, like this:title in ("Page title 1","Page title 2")
- For all pages containing a
"
, like this:title~"Page title")
, where you replace the characters!<>=\"()/[]^{}
with spaces and escape the character:
as\:
. - For all pages containing a
"
and only consisting of the characters!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
, you have to fetch them using the/rest/api/content?spaceKey=SPACE&title=Page+title
endpoint or by fetching all pages from the space (slow for large spaces) and finding your desired page in the list. - Always filter the results and only keep those whose title you are looking for (case-insensitive comparison)
- For all pages not containing a
- For Server:
- For titles that don’t contain any letters (so case-insensitivity doesn’t matter) or
"
quotes use CQL as intitle in ("Page title 1","Page title 2")
- For titles only consisting of the characters
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
and containing at least one"
, you have to fetch them manually using the/rest/api/content?spaceKey=SPACE&title=Page+title
API. - For all other pages, use the CQL like this:
title~"Page+title"
. In the page titles, replace the characters!<>=\"()/[]^{}*?
with spaces and escape the character:
as\:
. Filter the results and only use those that have a matching title (case-insensitive comparison).
- For titles that don’t contain any letters (so case-insensitivity doesn’t matter) or
To find multiple pages using CQL, some using exact matching and some using fuzzy matching, do your query like this: space="SPACE" and (title in ("Page title 1","Page title 2") or title~"Page title 3" or title~"Page title 4")
. Make sure to put the space in quotes, as some space keys don’t work without quotes. Make sure to fetch all result pages, as some fuzzy searches might return a lot of results.
Example code
/**
* Decide whether to fetch a page with a particular title using the Content API, CQL (exact match) or CQL (fuzzy match).
* @param title The title of the page to fetch
* @param forMultiple Whether we would like to fetch the page as part of a batch request trying to fetch multiple pages at once
*/
const getTitleMethod = (title, forMultiple) => {
const isCased = title.toUpperCase() !== title.toLowerCase();
if (!forMultiple && (isServer || !isCased)) {
// When fetching a single item, the Content API can be used for all titles on Server and for titles without case-sensitive characters on Cloud
return TITLE_METHOD.CONTENT;
} else if ((isCloud || !isCased) && !title.includes('"')) {
// CQL exact matching can be used for all titles on Cloud and and for uncased titles on Server, except for titles with a quotation mark in them
return TITLE_METHOD.CQL;
} else if (!title.match(/^[!"#$%&'()*+,-./:;<=>?@[\\\]^_`{|}~]+$/)) {
// Fuzzy matching can be used for all titles, except the ones that only consist of special characters
return TITLE_METHOD.CQL_FUZZY;
} else {
// Pages that cannot be fetched through CQL have to be fetched using the content API, even when multiple pages should be fetched at once
return TITLE_METHOD.CONTENT;
}
};
/**
* Put a string in quotes and escape some characters so that it can be used in CQL.
*/
export const quoteCql = (value) => (
`"${value
// Quotes should be escaped as \", but since there is no way to match quotes in CQL, let's remove them to avoid crashes
.replace(/"/g, ' ')
// Backslashes should be escaped as \\. At the end of the string, they need to be escaped as \\\"
.replace(/\\/g, (m, o) => o === value.length - 1 ? '\\\\\\"' : '\\\\')
}"`
);
/**
* Put a string in quotes and escape some characters so that it can be used on the right side of a ~ fuzzy match operator in CQL.
* @param value
*/
export const quoteFuzzyCql = (value) => (
`"${value
.replace(/[!<>=\\"()/[\]^{}*?]/g, ' ')
.replace(/:/g, '\\:')
}"`
);
/**
* Create a CQL query to fetch the specified pages. Only works for titles that can be fetched through CQL, titles that have to be fetched through the Content API have to be fetched separately.
*/
export const makeCqlQuery = (spaceKey, titles) => (
const parts = titles
.filter((title) => getTitleMethod(title, true) === TITLE_METHOD.CQL_FUZZY)
.map((title) => `title~${quoteFuzzyCql(title)}`);
const byCql = titles.filter((title) => getTitleMethod(title, true) === TITLE_METHOD.CQL);
if (byCql.length > 0) {
parts.push(`title IN (${byCql.map((title) => quoteCql(title)).join(',')})`);
}
return `type="page" and space=${quoteCql(spaceKey) and (${parts.join(' OR ')})`;
);
What Atlassian can do
- Announce publicly that the search API has changed on Cloud and that
()/[]^{}
need to be escaped now in fuzzy search terms, otherwise the search crashes. It has not been like this before. - Make the
/rest/api/content?spaceKey=SPACE&title=Page+title
API case-insensitive on Cloud. It just does not make any sense that it is case-sensitive, when page titles are case insensitive. - Make the
title
CQL field case-insensitive on Server. (Note that I have tested on 7.0, it might have changed in more recent versions.) - Fix the handling of
"
in page titles in CQL on both Cloud and Server. Right now pages with a"
in the title cannot be found using atitle="Page title"
query. - Fix the handling of
\
at the end of a string in CQL on both Cloud and Server. Right now the string"\\"
throws an error, while the string"\\\""
represents\
, which does not make any sense. - Update the CQL documentation:
- Mention that characters can be escaped using
\
. - Mention that
!:<>=\"()/[]^{}
have a special meaning in fuzzy search terms and need to be escaped. - Mention the bugs that will not be fixed.
- Mention that characters can be escaped using