Rovo + Forge SQL: A Secure Pattern for Natural-Language Analytics in Forge Apps

vzakharchenko · November 16, 2025, 11:15am

While working on a Forge app that stores structured metadata in Forge SQL, I explored how Rovo could be used not only for documentation/explanations but also for natural-language analytics.
(For context: this comes from the Codegeist 2025 project Forge Secure Notes for Jira — built with forge-sql-orm)

Rovo can generate SQL from natural language with high accuracy, but executing LLM-generated SQL requires a carefully controlled environment.

This post describes a reproducible pattern for using Rovo with Forge SQL safely and predictably - with strict query validation, enforced single-table scope, row-level security, metadata checks, and EXPLAIN-based join detection.

Problem: Safe Execution of LLM-Generated SQL

Executing arbitrary SQL from an AI model introduces risks:
• ensuring the query is strictly SELECT
• ensuring it only targets the intended table
• preventing hidden joins or subqueries
• preventing column spoofing
• enforcing row-level security
• ensuring context variables cannot be manipulated

Rovo must never be trusted to enforce constraints.

All protections belong in the backend executor.

Solution: The “Guide + Guard” Pattern

Two independent layers:

Guide - the Rovo prompt in manifest.yml
Defines strict boundaries and shapes expected SQL.
Guard - backend validator in RovoService.ts
Enforces safety regardless of the generated SQL.

The Guide improves accuracy.
The Guard guarantees security.

1. Guide Layer (manifest.yml)

On the “guide” side, the rovo:agent prompt inside manifest.yml defines strict boundaries for what Rovo is allowed to generate.
This ensures the SQL surface is small, predictable, and secure.
The pattern follows one core rule:
One Rovo Action → One table → No joins → Single SELECT statement

Step 1. Hard constraints enforced by the agent

The prompt defines non-negotiable SQL restrictions:

Use only read-only SELECT statements
Query only the designated table (security_notes in this case)
Never generate any join:
- no INNER / LEFT / RIGHT / FULL / CROSS JOINs
- no implicit joins
Never reference other tables:
- not in FROM
- not in JOINs
- not in subqueries
- not in CTEs
No DML or DDL:
- no INSERT, UPDATE, DELETE
- no ALTER, DROP, CREATE
All execution must go through the backend action
Permissions and row-level filtering are enforced server-side, not by Rovo itself

These restrictions ensure that all generated SQL falls inside a safe, enforceable envelope.

Step 2. Mandatory security columns

Every query must include exactly as raw table fields:

* created_by
* target_user_id

These fields allow the backend to apply strict row-level security and to validate the column origins.
The agent is forbidden from:

aliasing
renaming
duplicating
wrapping in expressions
generating constants
pulling them from derived tables or subqueries

The backend enforces this via metadata (orgTable) returned by Forge SQL.

Step 3. Context variables

The agent may insert these placeholders, which the backend later replaces with authenticated Forge context values:

:currentUserId
:currentIssueKey
:currentProjectKey

The prompt also defines natural-language mappings:

“my notes” → (created_by = :currentUserId OR target_user_id = :currentUserId)
“notes I created” → created_by = :currentUserId
“this issue” → issue_key = :currentIssueKey
“last week” → created_at >= DATE_SUB(NOW(), INTERVAL 1 WEEK)

This gives Rovo deterministic, controlled semantics.

Step 4. Tools / actions

At the end of every interaction, Rovo must:

Build a valid SELECT query following all rules
Execute it strictly via the action:

{ "sql": "<your select query>" }

After receiving the backend result, the agent must:

summarize the results clearly
highlight useful counts or trends
optionally present a small table of values

This gives users friendly natural-language insights while preserving backend-enforced safety.

2. Guard Layer (RovoService.ts)

This function rewrites and validates SQL generated by Rovo.

Step 1. SQL normalization

All queries are normalized:

• newlines removed
• tabs and multiple spaces collapsed
• trailing semicolon removed
• user URI prefixes removed

This prevents multiline or formatting-based evasion.


sql
.replace(/\[\\n\\r\\t\]+/g, " ")
.replace(/\\s+/g, " ")
.replace(/\\s\*;\\s\*$/, “”)
.trim();

Step 2. Static analysis: single SELECT via AST
Before touching the forge sql, the query is parsed into an AST using node-sql-parser.
This guarantees that:

the query is a single statement
the statement type is strictly SELECT (no INSERT/UPDATE/DELETE/DDL)
multiple statements or mixed types are immediately rejected

const parser = new Parser();
let ast;

try {
  ast = parser.astify(normalizedQuery);
} catch (err) {
  throw new Error(
    `SQL parsing error: ${err.message || "Invalid SQL syntax"}. ` +
    "Please check your query syntax."
  );
}

// AST may be a single node or an array of statements
if (Array.isArray(ast)) {
  if (ast.length !== 1 || ast[0].type !== "select") {
    throw new Error(
      "Only a single SELECT query is allowed. " +
      "Multiple statements or non-SELECT statements are not permitted."
    );
  }
  ast = ast[0];
} else if (!ast || ast.type !== "select") {
  throw new Error("Only SELECT queries are allowed.");
}

Step 3. AST-based table scope + + EXPLAIN-based join detection
Two protections work together:

Step 3.1. The query must reference only the allowed table

const tablesInQuery = this.extractTables(ast);      // walks FROM, JOIN, subqueries
const uniqueTables = [...new Set(tablesInQuery)];
const invalidTables = uniqueTables.filter(
  (table) => table !== "SECURITY_NOTES"
);

if (invalidTables.length > 0) {
  throw new Error(
    "Security violation: query references table(s) other than 'security_notes': " +
      invalidTables.join(", ") +
      ". Only queries against the security_notes table are allowed. " +
      "JOINs, subqueries, or references to other tables are not permitted."
  );
}

Because the check is AST-based, it also catches:
• tables used inside scalar subqueries
• tables referenced in nested expressions
• tables hidden behind aliases

Step 3.2. EXPLAIN plan must not contain join operators

This prevents hidden joins, optimizer-rewritten joins, or subquery-based joins.

const explainRows =
await FORGE_SQL_ORM.analyze().explainRaw(normalizedQuery, [ ]);
const hasJoin = explainRows.some(row => {
    const op = (row.operatorInfo || “”).toUpperCase();
    return (
       op.includes(“JOIN”) ||
       op.includes(“HASH JOIN”) ||
       op.includes(“NESTED LOOP”) ||
       op.includes(“CARTESIAN”)
    );
});
if (hasJoin) {
    throw new Error(“JOIN operations are not allowed for this Rovo action.”);
}

Step 3.3. EXPLAIN plan must not contain window functions

To keep the model simple and safe, we completely block any window functions at the EXPLAIN level.

Window functions like COUNT(*) OVER(…) can leak information about the total number of rows that exist beyond the current user’s slice of data.
This ensures the query executes strictly against a single table.

const hasWindow = explainRows.some((row) => {
  const id = row.id.toUpperCase();
  const info = (row.operatorInfo ?? "").toUpperCase();
  return id.includes("WINDOW") || info.includes(" OVER(") || info.includes(" OVER()");
});

if (hasWindow) {
   throw new Error(
     "Window functions (for example COUNT(*) OVER(...)) are not allowed in Rovo SQL for this app. " +
          "Please rephrase your question so that it uses regular aggregates instead of window functions.",
    );
}

Step 3.4. EXPLAIN plan must not reference other tables

To guarantee that this Rovo action truly works only on security_notes, we validate the execution plan itself.

We scan the accessObject column from EXPLAIN and reject any plan that touches a table other than security_notes:

// Detect any table access other than `security_notes`
const tablesInPlan = explainRows.filter(
  (row) =>
    row.accessObject?.startsWith("table:") &&
    row.accessObject !== "table:security_notes",
);

if (tablesInPlan.length > 0) {
  throw new Error(
    "Security violation: query plan references tables other than 'security_notes'. " +
      "This Rovo action only allows analytics over the security_notes table. " +
      "Please remove JOINs, subqueries, or references to other tables and try again.",
  );
}

This ensures the query executes strictly against a single table.

Step 4. Replace context variables

Placeholders are replaced with authenticated values:

:currentUserId
:currentIssueKey
:currentProjectKey

This prevents Rovo from injecting or modifying identity context.

Step 5. Row-Level Security (non-admin users)

Non-admin queries are wrapped with enforced filtering:

SELECT * FROM (<normalized_rovo_query>) AS t
       WHERE t.created_by = ‘currentUserId’ OR t.target_user_id = ‘currentUserId’;

This guarantees users can only see:
• notes they created
• notes shared with them
even if the model produces an incorrect or unsafe WHERE clause.

Step 6. Post-execution metadata validation

Forge SQL exposes orgTable metadata for each column.
The executor validates:

created_by.orgTable = “security_notes”
target_user_id.orgTable = “security_notes”

If the columns originate from:
• constants
• expressions
• subqueries
• other tables

the query is rejected.
Metadata validation prevents column spoofing and ensures RLS correctness.

Additionally, we validate that every column that has an orgTable defined must also come from:

orgTable === "security_notes"

This provides a third layer of protection by ensuring that no fields in the result set were sourced from other tables via joins or nested queries — even if the SQL text tried to hide it.

Metadata validation fully closes the loop and ensures the backend executes exactly what was intended:
a safe, single-table query over security_notes(Computed fields remain allowed) with correct RLS boundaries.

Step 7. Most Important, platform guarantees that strengthen the pattern

In addition to app-level validation, Forge SQL provides two important safety properties:

• Single-statement execution:

Forge SQL only allows one SQL statement per call.

No multi-statements, no chaining — making traditional SQL injection impossible.

• Tenant isolation:

Forge SQL’s architecture ensures strict separation between tenants.

Even incorrectly formed queries cannot access data outside the current tenant’s schema.

Architectural Note: One Table per Rovo Action

The security guarantees in this pattern rely on the fact that each Rovo Action operates on exactly one table.

This constraint keeps the execution predictable:

• ensures stable SQL structure
• simplifies EXPLAIN-based join detection
• makes metadata validation (orgTable) reliable
• ensures enforceable row-level security
• eliminates ambiguity in column origin
If analytics require data from multiple datasets, there are several possible approaches that remain compatible with the “one table per action” pattern.

Option A — Use a consolidated analytics table

A process (scheduled trigger, background sync, or nightly aggregation) can populate a dedicated table containing the combined fields needed for analytics.
Rovo then queries that single table.

Option B — Use a separate rovo:action per table

If the application exposes multiple datasets, each dataset can have its own Rovo Action with its own validation logic — each still limited to one table.
This keeps the security properties of the pattern intact.
Both approaches preserve the core property:
each Rovo Action should operate on one table to ensure strict, deterministic validation.

⸻

Result

With a structured Guide in the manifest and a strict Guard in the executor:
• natural-language analytics becomes possible
• all LLM-generated SQL is sandboxed
• row-level security is always enforced
• joins and cross-table access are safely blocked
• SQL injection is prevented by design
• users get a powerful analytics interface without compromising security

This pattern provides a safe execution envelope for Rovo-driven SQL in Forge apps.
Rovo
⸻

Full Source Code

Rovo Prompt manifest.yml

Secure Executor RovoService.ts

PabloBeltran · November 17, 2025, 2:02am

Hi @vzakharchenko,

It’s great to see you participating in Codegeist 2025. We are as well, with a focus on implementing standard SQL support in Rovo. An interesting alignment

SQL indeed has a strong future. In this interview excerpt, the Microsoft CEO explains that AI agents will update databases directly across repositories, moving business logic to the AI layer and collapsing traditional applications.

This perspective has been quite influential for us. However, comments on the full interview highlight concerns: organizations have invested billions in middleware for business logic and are unlikely to replace it entirely with AI agents.

Why not combine both approaches? A Virtual SQL database could serve as an intermediary:

Agent → Virtual DB → Business Logic → Physical DB.

This aligns with Satya Nadella’s vision of direct database interaction while preserving existing logic and permissions, preventing unintended disruptions.

This is the solution we have developed, currently demonstrating it as a POC with Rovo and there are more details posted in Codegeist 2025.

I’m pleased to see several of us exploring ways to connect natural language with databases!

I like your approach and am willing to read people’s comments.

vzakharchenko · November 17, 2025, 6:16pm

PabloBeltran:

Hi @vzakharchenko,

It’s great to see you participating in Codegeist 2025. We are as well, with a focus on implementing standard SQL support in Rovo. An interesting alignment

SQL indeed has a strong future. In this interview excerpt, the Microsoft CEO explains that AI agents will update databases directly across repositories, moving business logic to the AI layer and collapsing traditional applications.

Microsoft CEO’s vison of future of SaaS impacted by AI Agents

This perspective has been quite influential for us. However, comments on the full interview highlight concerns: organizations have invested billions in middleware for business logic and are unlikely to replace it entirely with AI agents.

Why not combine both approaches? A Virtual SQL database could serve as an intermediary:

Agent → Virtual DB → Business Logic → Physical DB.

This aligns with Satya Nadella’s vision of direct database interaction while preserving existing logic and permissions, preventing unintended disruptions.

This is the solution we have developed, currently demonstrating it as a POC with Rovo and there are more details posted in Codegeist 2025.

I’m pleased to see several of us exploring ways to connect natural language with databases!

I like your approach and am willing to read people’s comments.

Thanks a lot. I really appreciate what you’re building.

The Virtual DB concept is impressive, and I genuinely like the direction you’re taking with it. It solves the problem on a much deeper, architectural level.

My pattern is much simpler - something any Forge developer can adopt quickly without additional complexity. Two different approaches, but both useful depending on the scenario.

Glad we’re both exploring how natural language and SQL can work together in real apps.

vzakharchenko · November 17, 2025, 6:29pm

just a quick follow-up. I’ve slightly improved the “Guard” pattern by adding three independent validation layers to ensure the query truly touches only the intended table:

1. Fast Pre-Check (Static AST Analysis): Before doing anything, the query is parsed via node-sql-parser, ensuring it is a single SQL statement, strictly a SELECT, references only one table (security_notes), and contains no scalar subqueries inside SELECT columns.
2. EXPLAIN-based Table Verification: Before running the query, I check the EXPLAIN plan. I verify that every accessObject is exactly table:security_notes. This blocks any hidden joins, subqueries, or optimizer rewrites that might try to access other tables.
3. Post-Execution Metadata Check: After the query runs, I use the metadata Forge SQL provides. I validate that all returned fields that have an origin must come from orgTable === “security_notes”. This is the final guarantee against any cross-table data leakage.

The best part is that all aggregate functions (like COUNT, SUM, AVG, etc.) still work perfectly, as long as they operate on this single, validated table.