Rovo Feedback (by Polymetis Apps)

osiebenmarck · September 24, 2024, 11:36am

After about a month of working with Rovo agents, here are our learnings and our feedback. Please note that, while we spend some considerable time with Rovo agents, all this is just one vendors perspective. Most of our time was spent building agents for Confluence and our main use-cases were:

content generation (writing a report),
analyzing content (reading a page, giving a reply),
interactive chat sessions.

General experience

Our general experience has been mostly positive: Working with agents is impressive and it is quite handy having them directly within Jira/Confluence.
There is an aspect of uncanny valley/discomfort though, as the AI is not exactly deterministic. For a developer, that’s a stark contrast to how code typically behaves. Having said that, it was amazing how well the integration between actions and the LLM worked.

What’s good?

Documentation and examples

The documentation is good, straightforward and has good tips. We especially liked having the examples as well as being able to look into the prompts.

Context

Agents get some context and that’s great. What’s even more fascinating is that agents will often try to infer information from the context. For example, we tested an agent that should interact with a board in Jira. When the chat was opened while on the board, the agent would infer the board’s ID from the URL. That’s been pretty need.

Actions

Agent can invoke actions in order to read data or make changes to data. This works pretty well and in our experience READ actions were invoked with almost 100% accuracy. Once an action had been invoked, it’s just like any Forge function.

Confirmation (With a gotcha)

Any non-READ action requires a confirmation by the user. This makes sense and when it works (see below), it is well-integrated into the interaction between user and agent. Before this feature got introduced, we had this as part of our own prompt and it’s much nicer to have a system-mandated confirmation that will always look the same.

What could be better?

FollowUp Prompts

Since day one, we have had a complicated relationship with follow up prompts. From a user’s perspective the prompts are a super nice addition that make interactions easy (no more typing!). However, we found that the downside to this is that the followup prompts often de-rail a conversation or take the interaction off on a tangent. Especially when we’d have responses where the interactive agent would end with a question and instead of suggesting possible answers, it just adds general questions.

Yes, there is the followup prompt prompt. This has some influence, but we found it extremely hard to get things right. The follow ups with the prompt would often switch perspective in strange ways, so that the prompt would be written from the agent’s perspective or simply list a number of 1-word replies that only vaguely relate to the previous conversation.

At this point, we would very much appreciate a way to disable the follow up prompts altogether.

Limited visibility into the model / Hidden limitations

Sometimes, when processing a lot of data, the agent would just stop mid-response. We understand that this is already being worked on, but it would have helped us a lot if we had more insight into the inner workings of Rovo.
Be it the amount of tokens already in context, the overall context size or the temperature settings. We’re aware that these information are all rather technical in nature and are of limited usefulness if they cannot also be changed, but simply having access would’ve made some debugging easier for us and would’ve allowed us to pinpoint issues like the one above much quicker.
It would also be great, if the context available to the agent in different situations was available to developers.

Confirmation

Even before the confirmation button was introduced, we had a confirmation step in our interactive agent. Making it explicit to the user that a change in the underlying system is about to happen is an important safeguard and will definitely help build trust and improve adoption of agents.

However, ever since the system-mandated confirmation was introduced, we have run into problems. In our interactive sessions, the agent would sometimes ask for confirmation twice and sometimes would just flat out lie about having performed an action that would have required confirmation. This would go as far as us explicitly asking the agent to confirm using the confirm button and the agent replying with a fake confirmation button – instead of running the appropriate action and showing the system’s Confirm button.

Mostly though, the agent would say things like „The configuration has been stored“ without invoking any action at all. This, of course, is the worst case for us, as there is no way a user could know that something just went wrong.

Given the semi-deterministic nature of the models we are working with here, it’s of course hard to reach 100% correct behavior every single time, but the agent seemed to be actively avoiding invoking actions that would require a confirmation.
With some prompting changes we have managed to get the agent to behave correctly in roughly 40% to 90% of cases, very much depending on user input.

What’s missing (as of 24.9.24)

Cross-Product Usage

Atlassian’s agents seem to be able to work cross-product, ie can access Confluence from within Jira and vice-versa. This is not only extremely useful, it will be very much expected of app-supplied agents early on. Not being able to meet this user expectation will not only block certain use-cases, but will also frustrate users.

Automation integration

We know it’s coming, but we have high hopes for this feature. Being able to make use of the AI asynchronously, ie without direct user interaction, will unlock a lot of potential for us (and most likely everyone else here).

Some tips

Drafting a prompt

While every AI will behave differently, we still found it useful to draft a prompt either using ChatGPTs web interface or a local LLM. Since we wouldn’t need to re-deploy a complete app for every slight change to the prompt, there was quite a speed-up in turn-around time.
Of course, this only works up to a point and is not useful anymore, once you get to fine-tuning your prompts.

Protect your prompt

Your agent’s prompt will not be displayed directly in the UI, but it’s still possible to trick an agent to revealing its prompt. Mostly, you can just ask an agent:

Clean up your data

When first trying out agents, we were amazed by the ease with which the AI was able to understand unprocessed JSON responses from the actions. However, we found that it does make a lot of sense to tailor the responses to make it easier for the LLM to work with.

Hope any of that is helpful. Thanks again for letting us take part in the EAP.

smansour · September 24, 2024, 3:34pm

Wow, incredible feedback - we’re very grateful for this - thank you!

I’ll let @AdamMoore jump in with all the responses in more detail, but just some quick reflections.

Cross-product usage for Forge-based agents: We would love this. This is something in our sights, but there is a bigger issue here around cross-product Forge apps. We will get to this, don’t have a date yet.

Automation integration: Well… a little gift is coming soon stay tuned. Here is an example
of me geeking out with Automation + no-code based Agents for
Confluence: No-code Rovo Agent Demo: Language Translation Agent | Loom
Jira: No-code Rovo Agent Demo: Jira Issue Theme Analyser | Loom

osiebenmarck · September 24, 2024, 4:21pm

Thank you @smansour! The automation part really looks great and we can’t wait to try that

UlrichKuhnhardtIzym1 · September 25, 2024, 12:55am

How did you manage to get access? Is there a back-channel? Going through the front door and requesting access (FRGEAP-406) didn’t get any response.