Performance: REST vs. GraphQL

klaussner · May 9, 2023, 11:24am

I’m currently porting an app to the new REST v2 and GraphQL APIs. In this post, I want to share the results of some performance tests that I ran. The app uses a lot of expansions, and since they are not supported anymore in the new API, I wanted to find out which of the new alternatives (REST v2 or GraphQL) performs better. Unfortunately, both new APIs are significantly slower than REST v1 for the tested use case.

Scenario

I fetched ten pages, their ancestors (two for each page), and a content property for each page. For REST v1, I made ten parallel requests with two expansions, and for GraphQL, I used the pages field to fetch all pages in a single request. For REST v2, I tested two different implementations: one that fetches ancestors and content properties sequentially for a particular page, and one that parallelizes requests as much as possible.
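For readers who want to reproduce the two REST v2 variants, here is a minimal sketch of how they might differ. It is not the code used in the test. The `fetch` callable is a hypothetical transport that takes a path and returns decoded JSON, the `parentId` field name is an assumption about the page response, and running the per-page chains concurrently across pages in both variants is only one reading of the description above.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

# Hypothetical transport: takes a request path, returns the decoded JSON body.
Fetch = Callable[[str], Awaitable[dict]]

PAGE_PATH = "/wiki/api/v2/pages/{id}"
PROPS_PATH = "/wiki/api/v2/pages/{id}/properties?key=property"


async def fetch_chain_sequential(fetch: Fetch, page_id: str, depth: int = 2) -> List[dict]:
    """Fetch a page and up to `depth` ancestors, one request at a time."""
    out: List[dict] = []
    current: Optional[str] = page_id
    for _ in range(depth + 1):
        if current is None:
            break
        page = await fetch(PAGE_PATH.format(id=current))
        props = await fetch(PROPS_PATH.format(id=current))
        out.append({"page": page, "properties": props})
        current = page.get("parentId")  # assumed field name
    return out


async def fetch_chain_parallel(fetch: Fetch, page_id: str, depth: int = 2) -> List[dict]:
    """Same result, but each properties request overlaps with the walk up the chain."""
    pages: List[dict] = []
    prop_tasks = []
    current: Optional[str] = page_id
    for _ in range(depth + 1):
        if current is None:
            break
        # The properties request only needs the ID, so start it right away.
        prop_tasks.append(asyncio.ensure_future(fetch(PROPS_PATH.format(id=current))))
        # The parent ID is only known once the page itself has arrived.
        page = await fetch(PAGE_PATH.format(id=current))
        pages.append(page)
        current = page.get("parentId")  # assumed field name
    props = await asyncio.gather(*prop_tasks)
    return [{"page": p, "properties": q} for p, q in zip(pages, props)]


async def fetch_all(fetch: Fetch, page_ids: List[str], chain=fetch_chain_parallel) -> List[List[dict]]:
    """Run one chain per page; the chains themselves run concurrently."""
    return list(await asyncio.gather(*(chain(fetch, pid) for pid in page_ids)))
```

Even in the parallel variant, the walk up the ancestor chain stays serial, because each parent ID comes from the previous response. That dependency is one plausible reason why parallelism alone does not close the gap to a single REST v1 request with expansions.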

Request details:

REST v1:

/wiki/rest/api/content/[id]?expand=ancestors.metadata.properties.property,metadata.properties.property

REST v2:

/wiki/api/v2/pages/[id]
/wiki/api/v2/pages/[id]/properties?key=property

GraphQL:

query GetPages {
  confluence {
    pages(ids: [...]) {
      pageId
      title
      ancestors {
        pageId
        title
        properties(keys: "property") {
          value
        }
      }
      properties(keys: "property") {
        value
      }
    }
  }
}

Results

The following box plot shows the results of the four tests (REST v1, REST v2 with sequential requests, REST v2 with parallel requests, and GraphQL):

REST v1 performs significantly better than the alternatives. Even with maximal parallelism, the overhead of multiple REST v2 requests is so big that fetching the pages takes three times as long as the REST v1 implementation. Even a single GraphQL request takes almost twice as long. Additionally, GraphQL performance is less predictable because the variance of request durations is much higher compared to both REST APIs.

Conclusion

I very much prefer the GraphQL API over REST v1 in terms of developer experience and I know that the REST v2 and GraphQL APIs are still new. But hopefully there’s still room for performance improvements before REST v1 is shut down. I don’t know if there’s anything else we can do to make sure that the performance of our apps doesn’t suffer from the transition to the new APIs.

SimonKliewer · May 16, 2023, 2:34pm

Hi @klaussner -

Thank you for the detailed feedback around the Confluence REST API v2. This is exactly the kind of feedback we are looking for, feedback that helps ensure we are working towards the best possible experience for our consumers.

I see that to achieve what you have outlined above with the V2 API, you need to make 6x as many calls with V2 as with V1. For each page, you need to fetch that page’s parent and the parent’s parent, and then fetch the corresponding content properties for each of those pages in a separate call.
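The arithmetic behind the 6x figure can be written out as a small sketch. The function names are illustrative only, and the count assumes no ancestor is shared between pages (shared ancestors could be fetched once and reused).

```python
def v2_calls_per_page(ancestors: int) -> int:
    """One page request plus one properties request for the page and each ancestor."""
    pages_in_chain = 1 + ancestors
    return pages_in_chain * 2


def total_calls(pages: int, ancestors: int) -> tuple:
    """Return (REST v2 calls, REST v1 calls) for the scenario in this thread."""
    v1_calls = pages  # one request with expansions per page
    v2_calls = pages * v2_calls_per_page(ancestors)
    return v2_calls, v1_calls


print(v2_calls_per_page(2))  # 6
print(total_calls(10, 2))    # (60, 10)
```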

We are working on a few new endpoints that will help out here. They are:

- Get all ancestors of page → given a page ID, will return a list of that page’s ancestors
- Get content properties for pages → will return all content properties that correspond to the page type, and will include an array type page-id filter so the response can be narrowed down to a set of pages
  - (there will also be Get content properties for <type> for other content types)

Once the above endpoints are released, the count of required calls from the V2 API will drop from 6 to 2. While we cannot promise this will be faster than V1, hopefully this will address your point: “Even with maximal parallelism, the overhead of multiple REST v2 requests is so big that fetching the pages takes three times as long as the REST v1 implementation.”

Please let us know what you think!

There will be examples outside of this one where sometimes the count of calls required to fetch the same set of content will increase from the V1 to V2 API. We are moving forward with this tradeoff for the reasons outlined in our blogpost here. However, we are confident this is a step in the right direction, and are excited to be working with the community in order to make this transition as smooth and performant as possible.

Thank you again!

marc · May 16, 2023, 4:19pm

Hi @SimonKliewer ,
Will there be an endpoint where we can get a content property by name? Currently (I believe) we must get all content properties of e.g. a page and filter out the one with the right name.

klaussner · May 19, 2023, 7:28pm

Thank you, @SimonKliewer!

Reducing the number of requests with these new APIs will definitely help.

I assume that it will still be slower than REST v1 because we have to retrieve all pages first and wait for the slowest request before we can retrieve the properties. But I will run another performance test when the new APIs are available and post the results here.

riku · May 23, 2023, 10:24am

Hi @SimonKliewer and thanks @klaussner for sharing your test results!

Adding the “Get content properties for pages” that can be filtered to an array of page ids sounds promising for our app.

I just wanted to point out that it would be great to have the same filters for all the endpoints (where those filters make sense).

For our app, it would be great to have e.g. “Get labels of pages”, “Get attachments of pages”, or even “Get ancestors of pages” endpoints that could be limited to an array of page IDs.

Another important filter for our use case would be to limit all of these “of pages” calls to descendants of a given ancestor page.

Btw, it seems that Get pages can already be limited to an array of IDs. I believe this is not possible with REST v1 (without using CQL). This new feature could be a concrete improvement for a few use cases in our app, and could also reduce the number of REST calls needed in the test case discussed here, especially if there were a “Get ancestors of pages” endpoint.

andreas1 · May 25, 2023, 11:28am

I can only confirm what @riku said.

We also need the ability to feed multiple IDs into one v2 call, especially when it comes to client-side calls. On the server side I could parallelize such calls, but on the client side we are limited to 6 parallel requests.

@SimonKliewer: to give you a concrete example of what no customer would accept: in one of our apps we need to collect the labels of all ancestors. If there are 1000 ancestors, this would take 167 rounds and, depending on the response time of the v2 API, 25-40 seconds. In v1 we collect the labels of 250 ancestors in one single call. This means 4 rounds and <= 1 second.
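The round counts above follow from a simple ceiling division. The helper below is only an illustration with made-up names. It assumes one item per v2 request with 6 requests in flight, and 250 items per v1 request issued one at a time.

```python
import math


def rounds(items: int, per_request: int, parallel_requests: int) -> int:
    """Number of request rounds needed to cover `items`."""
    return math.ceil(items / (per_request * parallel_requests))


v2_rounds = rounds(1000, per_request=1, parallel_requests=6)
v1_rounds = rounds(1000, per_request=250, parallel_requests=1)
print(v2_rounds)  # 167
print(v1_rounds)  # 4

# Per-round latency implied by the 25-40 second estimate for v2.
low, high = 25 / v2_rounds, 40 / v2_rounds
print(f"{low * 1000:.0f}-{high * 1000:.0f} ms per round")
```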

SimonKliewer · May 25, 2023, 7:53pm

Hi @marc - thanks for the feedback! I think it would be great for the content property bulk fetch endpoints to have both id and key filters that accept multiple values. This would help cut down on the # of calls required. I’ll look into getting this added.

Sounds good @klaussner - thanks for the help here!

Hi @riku and @andreas1! Thanks for the feedback, it makes total sense! I think for resources where we have this hierarchical relationship (attachments on page(s), labels on page(s)), it would be great to have endpoints that support fetching these resources for multiple “parent” entities in one call.

I see how this is especially important when it comes to client side requests.

We are looking for ways to make concrete improvements like these. I’ll bring this to the team to get prioritized. Thanks so much!