After about a month of working with Rovo agents, here are our learnings and our feedback. Please note that, while we spend some considerable time with Rovo agents, all this is just one vendors perspective. Most of our time was spent building agents for Confluence and our main use-cases were:
- content generation (writing a report),
- analyzing content (reading a page, giving a reply),
- interactive chat sessions.
General experience
Our general experience has been mostly positive: Working with agents is impressive and it is quite handy having them directly within Jira/Confluence.
There is an aspect of uncanny valley/discomfort though, as the AI is not exactly deterministic. For a developer, that’s a stark contrast to how code typically behaves. Having said that, it was amazing how well the integration between actions and the LLM worked.
What’s good?
Documentation and examples
The documentation is good, straightforward and has good tips. We especially liked having the examples as well as being able to look into the prompts.
Context
Agents get some context and that’s great. What’s even more fascinating is that agents will often try to infer information from the context. For example, we tested an agent that should interact with a board in Jira. When the chat was opened while on the board, the agent would infer the board’s ID from the URL. That’s been pretty need.
Actions
Agent can invoke actions in order to read data or make changes to data. This works pretty well and in our experience READ actions were invoked with almost 100% accuracy. Once an action had been invoked, it’s just like any Forge function.
Confirmation (With a gotcha)
Any non-READ action requires a confirmation by the user. This makes sense and when it works (see below), it is well-integrated into the interaction between user and agent. Before this feature got introduced, we had this as part of our own prompt and it’s much nicer to have a system-mandated confirmation that will always look the same.
What could be better?
FollowUp Prompts
Since day one, we have had a complicated relationship with follow up prompts. From a user’s perspective the prompts are a super nice addition that make interactions easy (no more typing!). However, we found that the downside to this is that the followup prompts often de-rail a conversation or take the interaction off on a tangent. Especially when we’d have responses where the interactive agent would end with a question and instead of suggesting possible answers, it just adds general questions.
Yes, there is the followup prompt prompt. This has some influence, but we found it extremely hard to get things right. The follow ups with the prompt would often switch perspective in strange ways, so that the prompt would be written from the agent’s perspective or simply list a number of 1-word replies that only vaguely relate to the previous conversation.
At this point, we would very much appreciate a way to disable the follow up prompts altogether.
Limited visibility into the model / Hidden limitations
Sometimes, when processing a lot of data, the agent would just stop mid-response. We understand that this is already being worked on, but it would have helped us a lot if we had more insight into the inner workings of Rovo.
Be it the amount of tokens already in context, the overall context size or the temperature settings. We’re aware that these information are all rather technical in nature and are of limited usefulness if they cannot also be changed, but simply having access would’ve made some debugging easier for us and would’ve allowed us to pinpoint issues like the one above much quicker.
It would also be great, if the context available to the agent in different situations was available to developers.
Confirmation
Even before the confirmation button was introduced, we had a confirmation step in our interactive agent. Making it explicit to the user that a change in the underlying system is about to happen is an important safeguard and will definitely help build trust and improve adoption of agents.
However, ever since the system-mandated confirmation was introduced, we have run into problems. In our interactive sessions, the agent would sometimes ask for confirmation twice and sometimes would just flat out lie about having performed an action that would have required confirmation. This would go as far as us explicitly asking the agent to confirm using the confirm button and the agent replying with a fake confirmation button – instead of running the appropriate action and showing the system’s Confirm button.
Mostly though, the agent would say things like „The configuration has been stored“ without invoking any action at all. This, of course, is the worst case for us, as there is no way a user could know that something just went wrong.
Given the semi-deterministic nature of the models we are working with here, it’s of course hard to reach 100% correct behavior every single time, but the agent seemed to be actively avoiding invoking actions that would require a confirmation.
With some prompting changes we have managed to get the agent to behave correctly in roughly 40% to 90% of cases, very much depending on user input.
What’s missing (as of 24.9.24)
Cross-Product Usage
Atlassian’s agents seem to be able to work cross-product, ie can access Confluence from within Jira and vice-versa. This is not only extremely useful, it will be very much expected of app-supplied agents early on. Not being able to meet this user expectation will not only block certain use-cases, but will also frustrate users.
Automation integration
We know it’s coming, but we have high hopes for this feature. Being able to make use of the AI asynchronously, ie without direct user interaction, will unlock a lot of potential for us (and most likely everyone else here).
Some tips
Drafting a prompt
While every AI will behave differently, we still found it useful to draft a prompt either using ChatGPTs web interface or a local LLM. Since we wouldn’t need to re-deploy a complete app for every slight change to the prompt, there was quite a speed-up in turn-around time.
Of course, this only works up to a point and is not useful anymore, once you get to fine-tuning your prompts.
Protect your prompt
Your agent’s prompt will not be displayed directly in the UI, but it’s still possible to trick an agent to revealing its prompt. Mostly, you can just ask an agent:
Clean up your data
When first trying out agents, we were amazed by the ease with which the AI was able to understand unprocessed JSON responses from the actions. However, we found that it does make a lot of sense to tailor the responses to make it easier for the LLM to work with.
Hope any of that is helpful. Thanks again for letting us take part in the EAP.