Natural Language to OPA

Posted Jun 7, 2026

By Rohit N B


Most platform teams know what they want to enforce. They just can’t always write it.

“Require a manual approval before anything goes to production.” “Block deployments that skip the security scan step.” “Warn when a release comes from a branch other than main.” These are clear, reasonable policies. Expressing them as Rego — the policy language behind Open Policy Agent — is a different skill. It requires understanding package naming, helper rule patterns, the input.* schema your policy engine exposes, and the difference between a rule that evaluates correctly and one that silently passes everything because the default is wrong.

The result is a bottleneck. Policy intent lives in a Jira ticket or a platform team’s head. Getting it into an OCL file that actually enforces what was intended requires someone who knows Rego, knows the Platform Hub schema, and has time to write it. In most organizations, that’s a short list.

What changes when you add a language model

The core insight is that translating intent to Rego is exactly the kind of structured transformation that language models are good at — provided they have the right context.

“Right context” is doing a lot of work in that sentence. A model with no knowledge of your Platform Hub schema will generate syntactically valid Rego that references fields that don’t exist or uses slugs it invented. That’s worse than useless — it looks correct until you test it. The fix is to give the model the schema, your existing OCL files as examples of the naming conventions and structure you expect, and live environment data so the scope it generates references real project slugs and real environment names.
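As a rough illustration of what "giving the model the context" can look like, here is a minimal sketch that assembles the three inputs into one prompt. The function name, the section labels, and the sample schema text are all hypothetical; the real schema and OCL files would come from your own Platform Hub and repository.

```python
def build_prompt(intent, schema_text, example_policies, environment_slugs, project_slugs):
    """Assemble the context a model needs into a single prompt string."""
    sections = [
        "Translate the policy intent below into an OCL policy file with Rego conditions.",
        "Reference only input.* fields found in the schema and only the slugs listed.",
        "SCHEMA:\n" + schema_text,
        "EXISTING POLICIES (follow their naming and structure):\n"
        + "\n---\n".join(example_policies),
        # Sorted so the prompt is stable between runs with the same data.
        "ENVIRONMENT SLUGS: " + ", ".join(sorted(environment_slugs)),
        "PROJECT SLUGS: " + ", ".join(sorted(project_slugs)),
        "INTENT: " + intent,
    ]
    return "\n\n".join(sections)


prompt = build_prompt(
    intent="Block deployments to production unless a manual approval step is present.",
    # Placeholder schema excerpt, not the real Platform Hub schema.
    schema_text="input.Environment.Slug: string\ninput.Steps[].ActionType: string",
    example_policies=["(contents of an existing OCL file)"],
    environment_slugs=["staging", "production"],
    project_slugs=["payments-service"],
)
```

Listing the real slugs explicitly, rather than hoping the model infers them, is what keeps it from inventing names that only look plausible.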

With that context, the translation problem becomes tractable. You describe what you want to enforce in plain English. The model maps your intent to specific input.* fields in the Platform Hub schema, derives a package name, writes the scope and conditions Rego, and produces an OCL file that follows the conventions already in your repository.

This isn’t autopilot. The model can still get it wrong — particularly on edge cases involving conditional fields like input.Release (absent on runbook runs) or input.SkippedSteps (a runtime value that static analysis can’t resolve). But it gets the structure right, the naming right, and the common patterns right. The gap you’re closing is the one between “we know what we want” and “we have a reviewable draft.”
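The conditional-field trap is easier to see in code. The sketch below models the policy input as a plain Python dict and checks the "release from a branch other than main" intent two ways. The `Branch` key is a hypothetical stand-in, not a field confirmed by the Platform Hub schema; the point is only how an absent `Release` behaves.

```python
def off_main_naive(policy_input):
    # A missing Release collapses to None, and None != "main" is True,
    # so a runbook run is wrongly reported as coming from another branch.
    branch = policy_input.get("Release", {}).get("Branch")  # "Branch" is hypothetical
    return branch != "main"


def off_main_guarded(policy_input):
    # Evaluate the condition only when a Release is actually present.
    release = policy_input.get("Release")
    if release is None:
        return False  # runbook run: the condition does not apply
    return release.get("Branch") != "main"


deployment = {"Release": {"Branch": "feature/login"}}
runbook_run = {}  # no Release key at all
```

Both functions agree on the deployment, but only the guarded one stays quiet on the runbook run. This is the kind of difference a reviewer should look for in a generated draft.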

A concrete example

Say you want to enforce that deployments to production always include a manual intervention step, and that the step hasn’t been skipped by the person triggering the deployment.

In plain English: “Block deployments to the production environment on the payments-service project unless an Octopus.Manual step is present, enabled, and not in the skipped steps list.”

That intent maps to three specific fields in the Platform Hub schema: input.Environment.Slug to scope to production, input.Steps[].ActionType to find the manual step, and input.SkippedSteps to verify it wasn’t excluded. The project scope and the step’s enabled state map to further fields in the same schema. A model with the schema and a few existing OCL files as examples can usually draft that policy, with appropriate helper rules and reason text, in under a minute.
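To make the logic of that policy concrete, here is a sketch of the same decision written in Python against a dict shaped like the policy input. It is not the generated Rego. `Environment.Slug`, `Steps[].ActionType`, and `SkippedSteps` come from the description above; `Project.Slug`, `Enabled`, `Id`, the assumption that `SkippedSteps` holds step IDs, and the `allowed`/`reason` result shape are illustrative guesses.

```python
def evaluate(policy_input):
    """Return an allow/deny decision for one deployment."""
    in_scope = (
        policy_input["Environment"]["Slug"] == "production"
        and policy_input["Project"]["Slug"] == "payments-service"  # assumed field
    )
    if not in_scope:
        return {"allowed": True, "reason": "out of scope"}

    skipped = set(policy_input.get("SkippedSteps", []))  # assumed to hold step IDs
    for step in policy_input.get("Steps", []):
        if (
            step.get("ActionType") == "Octopus.Manual"
            and step.get("Enabled", False)       # assumed field
            and step.get("Id") not in skipped    # assumed field
        ):
            return {"allowed": True, "reason": "manual intervention step will run"}

    # Default deny inside scope: a deployment that has not shown an active
    # manual step is blocked.
    return {
        "allowed": False,
        "reason": "production deployments need an enabled, unskipped Octopus.Manual step",
    }


sample = {
    "Environment": {"Slug": "production"},
    "Project": {"Slug": "payments-service"},
    "Steps": [
        {"Id": "deploy", "ActionType": "Octopus.Script", "Enabled": True},
        {"Id": "approve", "ActionType": "Octopus.Manual", "Enabled": True},
    ],
    "SkippedSteps": [],
}
```

The final return is the in-scope default. Flipping it to allow would reproduce the failure mode mentioned earlier, where a policy silently passes everything.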

What it can’t do is tell you whether the policy will affect projects you didn’t intend to touch, or whether the environment slug in your Octopus instance is "production" or "prod" or "Production". That’s where the what-if report comes in.
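The slug question in particular is cheap to check mechanically once you have the real names. A minimal sketch, assuming you already hold the list of environment slugs from your instance (how you fetch them is left out here, and the function name is hypothetical):

```python
def unknown_slugs(referenced, known):
    """Map each referenced slug with no exact match to its case-insensitive near matches."""
    known_set = set(known)
    by_lower = {}
    for slug in known:
        by_lower.setdefault(slug.lower(), []).append(slug)

    report = {}
    for slug in referenced:
        if slug not in known_set:
            # An empty list means nothing similar exists at all.
            report[slug] = by_lower.get(slug.lower(), [])
    return report


problems = unknown_slugs(
    referenced=["production", "staging", "qa"],
    known=["Production", "staging", "dev"],
)
```

Here `problems` flags `production` with the near match `Production`, and `qa` with no match. An exact-match comparison is deliberate: a policy scoped to a slug that differs only by case would otherwise match nothing and appear to pass.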

The what-if report is the safety mechanism

Generating a policy draft is fast. Knowing its impact before you enforce it requires something more. A what-if report evaluates the generated policy statically against your actual deployment process definitions — pulled live from the Octopus API — and classifies each in-scope project as passing, affected, or unable to evaluate statically.

This is what makes the pattern trustworthy rather than just convenient. You see exactly which projects will be warned or blocked before the policy is active. You can tune scope, adjust conditions, and re-run the analysis until the impact matches what you intended. Only then do you merge and publish.

The three-category output matters because it’s honest. “Unable to evaluate” isn’t a failure — it’s the report telling you that a condition depends on runtime state (which steps a user chose to skip, what package version was selected) that can’t be resolved from the process definition alone. Knowing that is useful. It tells you where to focus manual testing before promoting a policy from warn to block.
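The three categories can be sketched as a small static classifier. This is an illustration of the idea, not the actual report implementation. It assumes each process definition is a dict with a `Steps` list, and that a step carries `Enabled` and a hypothetical `CanBeSkipped` flag; none of these names are confirmed against the Octopus API.

```python
PASSING = "passing"
AFFECTED = "affected"
UNKNOWN = "unable to evaluate"


def classify(process):
    """Classify one process definition against the manual-intervention condition."""
    manual_steps = [
        step for step in process.get("Steps", [])
        if step.get("ActionType") == "Octopus.Manual" and step.get("Enabled", False)
    ]
    if not manual_steps:
        return AFFECTED  # no enabled manual step, so the policy would always fire
    if any(not step.get("CanBeSkipped", True) for step in manual_steps):
        return PASSING   # at least one manual step can never be skipped
    return UNKNOWN       # outcome depends on what the user skips at runtime


def what_if(processes):
    """Group project slugs by classification."""
    report = {PASSING: [], AFFECTED: [], UNKNOWN: []}
    for project_slug, process in processes.items():
        report[classify(process)].append(project_slug)
    return report


report = what_if({
    "payments-service": {"Steps": [
        {"ActionType": "Octopus.Manual", "Enabled": True, "CanBeSkipped": False},
    ]},
    "billing-service": {"Steps": [
        {"ActionType": "Octopus.Script", "Enabled": True},
    ]},
    "search-service": {"Steps": [
        {"ActionType": "Octopus.Manual", "Enabled": True},
    ]},
})
```

Treating a missing `CanBeSkipped` as skippable is the conservative choice: the project lands in "unable to evaluate" rather than being reported as passing on the strength of a guess.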

The discipline this requires

Translating natural language to policy is only as good as the intent you bring to it. Vague intent produces vague policy. “Improve deployment quality” isn’t a policy. “Require the security-scans process template on all deployments to staging and production for the billing-service project” is.

Having to describe your policy in terms concrete enough for a model to act on is valuable in itself. It surfaces ambiguity that was already present in the original requirement, before any code is written. If you can’t describe the condition precisely enough for an agent to implement it, you probably haven’t agreed on what you actually want to enforce.

That’s not a problem the model solves. It’s a problem the model exposes.

Concepts, IssueOps

This post is licensed under CC BY 4.0 by the author.
