Use OPA for a unified toolset and framework for policy across the cloud native stack.
Use OPA to decouple policy from the service’s code so you can release, analyze, and review policies without sacrificing availability or performance.
Declarative
Express policy in a high-level, declarative language that promotes safe, performant, fine-grained controls.
Use a language purpose-built for policy in a world where JSON is pervasive.
Iterate, traverse hierarchies, and apply 150+ built-ins like string manipulation and JWT decoding to declare the policies you want enforced.
Context-aware
Leverage external information to write the policies you really care about.
Instead, write logic that adapts to the world around it and attach that logic to the systems that need it.
Architectural Flexibility
Daemon - Sidecar
Deploy OPA as a separate process on the same host as your service.
Integrate OPA by changing your service’s code, importing an OPA-enabled library, or using a network proxy integrated with OPA.
Library
Embed OPA policies into your service.
Integrate OPA as a Go library that evaluates policy, or integrate a WebAssembly runtime and use OPA to compile policy to WebAssembly instructions.
Policy as Code
Introduction
Overview
OPA provides a high-level declarative language that lets you specify policy as code and simple APIs to offload policy decision-making from your software.
OPA decouples policy decision-making from policy enforcement.
When your software needs to make policy decisions it queries OPA and supplies structured data (e.g., JSON) as input.
OPA accepts arbitrary structured data as input.
OPA generates policy decisions by evaluating the query input against policies and data.
OPA and Rego are domain-agnostic so you can describe almost any kind of invariant in your policies.
Policy decisions are not limited to simple yes/no or allow/deny answers.
Like query inputs, your policies can generate arbitrary structured data as output.
Rego
OPA policies are expressed in a high-level declarative language called Rego.
Rego is purpose-built for expressing policies over complex hierarchical data structures.
When OPA evaluates policies it binds data provided in the query to a global variable called input. You can refer to data in the input using the . (dot) operator.
To refer to array elements you can use the familiar square-bracket syntax
1 2 3 4 5
input.servers[0].protocols[1]
// ---
"ssh"
You can use the same square bracket syntax if keys contain other than [a-zA-Z0-9_]. E.g., input["foo~bar"].
1 2 3 4 5
input.servers[0]["protocols"][0]
// ---
"https"
If you refer to a value that does not exist, OPA returns undefined. Undefined means that OPA was not able to find any results.
Expressions - AND
To produce policy decisions in Rego you write expressions against input and other data.
1 2 3 4 5
input.servers[0].protocols[0] == "https"
// ---
true
OPA includes a set of built-in functions you can use to perform common operations like string manipulation, regular expression matching, arithmetic, aggregation, and more.
1 2 3
count(input.servers[0].ports) >= 3
true
Multiple expressions are joined together with the ; (AND) operator.
For queries to produce results, all of the expressions in the query must be true or defined. The order of expressions does not matter.
If any of the expressions in the query are not true (or defined) the result is undefined.
Variables
You can store values in intermediate variables using the := (assignment) operator. Variables can be referenced just like input
1 2 3 4 5 6 7 8 9 10 11 12
s := input.servers[0] s.id == "app" p := s.protocols[0] p == "https"
// ---
+---------+-------------------------------------------------------------------+ | p | s | +---------+-------------------------------------------------------------------+ | "https" | {"id":"app","ports":["p1","p2","p3"],"protocols":["https","ssh"]} | +---------+-------------------------------------------------------------------+
When OPA evaluates expressions, it finds values for the variables that make all of the expressions true. If there are no variable assignments that make all of the expressions true, the result is undefined.
1 2 3
s := input.servers[0] s.id == "app" s.protocols[1] == "telnet"
Variables are immutable. OPA reports an error if you try to assign the same variable twice.
OPA must be able to enumerate the values for all variables in all expressions. If OPA cannot enumerate the values of a variable in any expression, OPA will report an error.
1 2
x := 1 x != y # y has not been assigned a value
Iteration
Like other declarative languages (e.g., SQL), iteration in Rego happens implicitly when you inject variables into expressions.
There are explicit iteration constructs to express FOR ALL and FOR SOME
Now the query asks for values of i that make the overall expression true.
When you substitute variables in references, OPA automatically finds variable assignments that satisfy all of the expressions in the query. Just like intermediate variables, OPA returns the values of the variables.
1 2 3 4 5 6 7 8 9 10
some i; input.networks[i].public == true
// ---
+---+ | i | +---+ | 2 | | 3 | +---+
You can substitute as many variables as you want.
1 2 3 4 5 6 7 8 9
some i, j; input.servers[i].protocols[j] == "http"
// ---
+---+---+ | i | j | +---+---+ | 3 | 0 | +---+---+
If variables appear multiple times the assignments satisfy all of the expressions.
1 2 3 4 5 6 7 8 9 10 11 12
some i, j id := input.ports[i].id input.ports[i].network == input.networks[j].id input.networks[j].public
// ---
+---+------+---+ | i | id | j | +---+------+---+ | 1 | "p2" | 2 | +---+------+---+
If you only refer to the variable once, you can replace it with the special _ (wildcard variable) operator. Conceptually, each instance of _ is a unique variable.
1 2 3 4 5
input.servers[_].protocols[_] == "http"
// ---
true
If OPA is unable to find any variable assignments that satisfy all of the expressions, the result is undefined.
1
some i; input.servers[i].protocols[i] == "ssh"
backwards-compatibility
In the first stage, users can opt-in to using the new keywords via a special import: import future.keywords.every introduces the every keyword described here.
Importing every means also importing in without an extra import statement.
At some point in the future, the keyword will become standard, and the import will become a no-op that can safely be removed.
FOR SOME
some ... in ... is used to iterate over the collection (its last argument), and will bind its variables (key, value position) to the collection items.
It introduces new bindings to the evaluation of the rest of the rule body.
1 2 3 4 5 6 7 8 9 10 11
public_network contains net.id if { some net in input.networks # some network exists and.. net.public # it is public. }
// ---
[ "net3", "net4" ]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
shell_accessible contains server.id if { some server in input.servers "telnet" in server.protocols }
shell_accessible contains server.id if { some server in input.servers "ssh" in server.protocols }
// ---
[ "app", "busybox" ]
FOR ALL
every allows us to succinctly express that a condition holds for all elements of a domain.
no_telnet_exposed if { every server in input.servers { every protocol in server.protocols { "telnet" != protocol } } }
// ---
true
1 2 3 4 5 6 7 8 9
no_telnet_exposed_alt if { every server in input.servers { not "telnet" in server.protocols } }
// ---
true
1 2 3 4 5 6 7 8 9 10 11 12
any_telnet_exposed if { some server in input.servers "telnet" in server.protocols }
no_telnet_exposed_not_any if { not any_telnet_exposed }
// ---
true
Rules
Rego lets you encapsulate and re-use logic with rules. Rules are just if-then logic statements.
Complete Rules
Complete rules are if-then statements that assign a single value to a variable.
Every rule consists of a head and a body. In Rego we say the rule head is true if the rule body is true for some set of variable assignments.
1 2 3 4 5 6 7 8
any_public_networks := trueif { some net in input.networks # some network exists and.. net.public # it is public. }
// ---
true
any_public_networks := true is the head
some net in input.networks; net.public is the body
All values generated by rules can be queried via the global data variable. The path of a rule is always: data.<package-path>.<rule-name>.
1 2 3 4 5
data.example.rules.any_public_networks
// ---
true
If you omit the = <value> part of the rule head the value defaults to true.
1 2 3 4
any_public_networks if { some net in input.networks net.public }
To define constants, omit the rule body. When you omit the rule body it defaults to true. Since the rule body is true, the rule head is always true/defined.
1 2 3
package example.constants
pi := 3.14
1 2 3 4 5
pi > 3
// ---
true
If OPA cannot find variable assignments that satisfy the rule body, we say that the rule is undefined.(which is not the same as false.)
Lastly, you can check if a value exists in the set using the same syntax
1 2 3 4 5
public_network["net3"]
// ---
"net3"
Logical OR
When you join multiple expressions together in a query you are expressing logical AND. To express logical OR in Rego you define multiple rules with the same name.
package example import future.keywords.every # "every" implies "in"
allow := true { # allow is trueif... count(violation) == 0 # there are zero violations. }
violation[server.id] { # a server is in the violation set if... some server in public_servers # it exists in the 'public_servers' set and... "http" in server.protocols # it contains the insecure "http" protocol. }
violation[server.id] { # a server is in the violation set if... some server in input.servers # it exists in the input.servers collection and... "telnet" in server.protocols # it contains the "telnet" protocol. }
public_servers[server] { # a server exists in the public_servers set if... some server in input.servers # it exists in the input.servers collection and...
some port in server.ports # it references a port in the input.ports collection and... some input_port in input.ports port == input_port.id
some input_network in input.networks # the port references a network in the input.networks collection and... input_port.network == input_network.id input_network.public # the network is public. }
allow := true { # allow is trueif... count(violation) == 0 # there are zero violations. }
violation[server.id] { # a server is in the violation set if... some server public_server[server] # it exists in the 'public_server' set and... server.protocols[_] == "http" # it contains the insecure "http" protocol. }
violation[server.id] { # a server is in the violation set if... server := input.servers[_] # it exists in the input.servers collection and... server.protocols[_] == "telnet" # it contains the "telnet" protocol. }
public_server[server] { # a server exists in the public_server set if... some i, j server := input.servers[_] # it exists in the input.servers collection and... server.ports[_] == input.ports[i].id # it references a port in the input.ports collection and... input.ports[i].network == input.networks[j].id # the port references a network in the input.networks collection and... input.networks[j].public # the network is public. }
To integrate with OPA you can run it as a server and execute queries over HTTP.(-s / --server) By default OPA listens for HTTP connections on 0.0.0.0:8181.
1 2
$ opa run -s example.rego {"addrs":[":8181"],"diagnostic-addrs":[],"level":"info","msg":"Initializing server. OPA is running on a public (0.0.0.0) network interface. Unless you intend to expose OPA outside of the host, binding to the localhost interface (--addr localhost:8181) is recommended. See https://www.openpolicyagent.org/docs/latest/security/#interface-binding","time":"2022-10-14T22:32:11+08:00"}
When you query the /v1/data HTTP API you must wrap input data inside of a JSON object
By default data.system.main is used to serve policy queries without a path. When you execute queries without providing a path, you do not have to wrap the input. If the data.system.main decision is undefined it is treated as an error.
1 2 3 4 5 6 7 8 9 10
$ curl -i 127.1:8181 -d @input.json -H 'Content-Type: application/json' HTTP/1.1404 Not Found Content-Type: application/json Date: Sat,14 Oct 202214:43:27 GMT Content-Length:86
You can restart OPA and configure to use any decision as the default decision - default_decision
1 2 3 4 5 6 7 8 9 10
$ opa run -s --set=default_decision=example/allow example.rego {"addrs":[":8181"],"diagnostic-addrs":[],"level":"info","msg":"Initializing server. OPA is running on a public (0.0.0.0) network interface. Unless you intend to expose OPA outside of the host, binding to the localhost interface (--addr localhost:8181) is recommended. See https://www.openpolicyagent.org/docs/latest/security/#interface-binding","time":"2022-10-14T22:45:16+08:00"}
// Construct a Rego object that can be prepared or evaluated. r := rego.New( rego.Query(os.Args[2]), rego.Load([]string{os.Args[1]}, nil))
// Create a prepared query that can be evaluated. query, err := r.PrepareForEval(ctx) if err != nil { log.Fatal(err) }
// Load the input document from stdin. var input interface{} dec := json.NewDecoder(os.Stdin) dec.UseNumber() if err := dec.Decode(&input); err != nil { log.Fatal(err) }
// Execute the prepared query. rs, err := query.Eval(ctx, rego.EvalInput(input)) if err != nil { log.Fatal(err) }
// Do something with the result. fmt.Println(rs) }
1 2
$ go run main.go example.rego 'data.example.violation' < input.json [{[[busybox ci]] map[]}]
Philosophy
A policy is a set of rules that governs the behavior of a software service.
OPA helps you decouple any policy using any context from any software system.
Policy Decoupling
Software services should allow policies to be specified declaratively, updated at any time without recompiling or redeploying, and enforced automatically.
What is OPA?
OPA is a lightweight general-purpose policy engine that can be co-located with your service. You can integrate OPA as a sidecar, host-level daemon, or library.
Services offload policy decisions to OPA by executing queries.
OPA evaluates policies and data to produce query results (which are sent back to the client).
Policies are written in a high-level declarative language and can be loaded dynamically into OPA remotely via APIs or through the local filesystem.
Why use OPA?
OPA is a full-featured policy engine that offloads policy decisions from your software.
Without OPA, you need to implement policy management for your software from scratch. That’s a lot of work.
Document Model
OPA policies (written in Rego) make decisions based on hierarchical structured data.
Importantly, OPA policies can make decisions based on arbitrary structured data.
OPA itself is not tied to any particular domain model.
Similarly, OPA policies can represent decisions as arbitrary structured data.
Data can be loaded into OPA from outside world using push or pull interfaces that operate synchronously or asynchronously with respect to policy evaluation.
We refer to all data loaded into OPA from the outside world as base documents.
These base documents almost always contribute to your policy decision-making logic.
However, your policies can also make decisions based on each other.
Policies almost always consist of multiple rules that refer to other rules (possibly authored by different groups).
In OPA, we refer to the values generated by rules (a.k.a., decisions) as virtual documents.
The term virtual in this case just means the document is computed by the policy, i.e., it’s not loaded into OPA from the outside world.
Base and virtual documents can represent the exact same kind of information.
Moreover, with Rego, you can refer to both base and virtual documents using the exact same dot/bracket-style reference syntax.
Consistency across the types of values that can be represented and the way those values are referenced means that policy authors only need to learn one way of modeling and referring to information that drives policy decision-making.
Additionally, since there is no conceptual difference in the types of values or the way you refer to those values in base and virtual documents
Rego lets you refer to both base and virtual documents through a global variable called data.
Similarly, OPA lets you query for both base and virtual documents via the /v1/data HTTP API.
location
Since base documents come from outside of OPA, their location under data is controlled by the software doing the loading.
On the other hand, the location of virtual documents under data is controlled by policies themselves using the package directive in the language.
Base documents can be pushed or pulled into OPA asynchronously by replicating data into OPA when the state of the world changes.
This can happen periodically or when some event (like a database change notification) occurs.
Base documents loaded asynchronously are always accessed under the data global variable.
On the other hand, base documents can also be pushed or pulled into OPA synchronously when your software queries OPA for policy decisions.
We refer to base documents pushed synchronously as input. Policies can access these inputs under the input global variable.
To pull base documents during policy evaluation, OPA exposes (and can be extended with custom) built-in functions like http.send.
Built-in function return values can be assigned to local variables and surfaced in virtual documents.
Data loaded synchronously is kept outside of data to avoid naming conflicts.
Summarizes the different models for loading base documents into OPA, how they can be referenced inside of policies, and the actual mechanism(s) for loading.
Asynchronous - data
Model
How to access in Rego
How to integrate with OPA
Asynchronous Push
The data global variable
Invoke OPA’s API(s), e.g., PUT /v1/data
Asynchronous Pull
The data global variable
Configure OPA’s Bundle feature
Synchronous Push
The input global variable
Provide data in policy query, e.g., inside the body of POST /v1/data
Synchronous Pull
The built-in functions, e.g., http.send
N/A
Data loaded asynchronously into OPA is cached in-memory so that it can be read efficiently during policy evaluation.
Similarly, policies are also cached in-memory to ensure high-performance and high-availability.
Data pulled synchronously can also be cached in-memory.
base
API request information pushed synchronously located under input.
Entitlements data pulled asynchronously and located under data.entitlements.
Resource data pulled synchronously during policy evaluation using the http.send built-in function.
virtual
The entitlements and resource information is abstracted by rules that generate virtual documents named data.iam.user_has_role and data.acme.user_is_assigned respectively.
External Data
OPA was designed to let you make context-aware authorization and policy decisions by injecting external data that describes what is happening in the world and then writing policy using that data.
OPA has a cache or replica of that data, just as OPA has a cache/replica of policy; OPA is not designed to be the source of truth for either.
This document describes options for replicating data into OPA.
The content of the data does not matter, but the size, frequency of update, and consistency constraints all do impact which kind of data replication to employ.
You should prefer earlier options in the list to later options, but in the end the right choice depends on your situation.
JWT Tokens
JSON Web Tokens (JWTs) allow you to securely transmit JSON data between software systems and are usually produced during the authentication process.
You can set up authentication so that when the user logs in you create a JWT with that user’s attributes (or any other data as far as OPA is concerned).
Then you hand that JWT to OPA and use OPA’s specialized support for JWTs to extract the information you need to make a policy decision.
Flow
User logs in to an authentication system, e.g. LDAP/AD/etc.
The user is given a JWT token encoding group membership and other user attributes stored in LDAP/AD
The user provides that JWT token to an OPA-enabled software system for authentication
The OPA-enabled software system includes that token as part of the usual input to OPA.
OPA decodes the JWT token and uses the contents to make policy decisions.
Updates
The JWT only gets refreshed when the user authenticates; how often that happens is up to the TTL included in the token.
When user-attribute information changes, those changes will not be seen by OPA until the user authenticates and gets a new JWT.
Size Limitations
JWTs have a limited size in practice, so if your organization has too many user attributes you may not be able to fit all the required information into a JWT.
Security
OPA includes primitives to verify the signature of JWT tokens.
OPA let’s you check the TTL.
OPA has support for making HTTP requests during evaluation, which could be used to check if a JWT has been revoked.
Overload input
For example, suppose your policy says that only a file’s owner may delete it.
The authentication system does not track resource-ownership, but the system responsible for files certainly does.
The file-ownership system may be the one that is asking for an authorization decision from OPA.
It already knows which file is being operated on and who the owner is, so it can hand OPA the file-owner as part of OPA’s input.
This can be dangerous in that it ties the integration of OPA to the policy, but often it’s sufficient to have the file-ownership system hand over all the file’s metadata.
Flow
OPA-enabled software gathers relevant metadata (and caches it for subsequent requests)
OPA-enabled software sends input to OPA including the external data
Policy makes decisions based on external data included in input
Updates
External data gets updated as frequently as the OPA-enabled software updates it.
Often some of that data is local to the OPA-enabled software, and sometimes it is remote.
The remote data is usually cached for performance and hence is as updated as the caching strategy allows.
Size Limitations
Size limitations are rarely a problem for OPA in this approach because it only sees the metadata for 1 request at a time.
However, the cache of remote data that the OPA-enabled service creates will have a limit that the developer controls.
Security
This approach is as secure as the connection between the OPA-enabled service and OPA itself, under the assumption that the OPA-enabled service gathers the appropriate metadata securely.
That is, using external data with this approach is as secure as using OPA in the first place.
Recommended usage
Local, Dynamic data
This approach is valuable when the data changes fairly frequently and/or when the cost of making decisions using stale data is high.
It works especially well when the external data is local to the system asking for authorization decisions.
It can work in the case of remote data as well, but there is more coupling of the system to OPA because the system is hardcoded to fetch the data needed by the policy (and only that data).
Bundle API
When external data changes infrequently and can reasonably be stored in memory all at once, you can replicate that data in bulk via OPA’s bundle feature.
The bundle feature periodically downloads policy bundles from a centralized server, which can include data as well as policy.
Every time OPA gets updated policies, it gets updated data too.
You must implement the bundle server and integrate your external data into the bundle server
OPA does NOT help with that–but once it is done, OPA will happily pull the data (and policies) out of your bundle server.
Flow
Three things happen independently with this kind of data integration.
A. OPA-enabled software system asks OPA for policy decisions
B. OPA downloads new policy bundles including external data
C. Bundle server replicates data from source of truth
Updates
The lag between a data update and OPA having the update is the sum of the lag for an update between data replication and the central bundle server and the lag for an update between the central bundle server and OPA.
So if data replication happens every 5 minutes, and OPA pulls a new bundle every 2 minutes, then the total maximum lag is 7 minutes.
Size limitations
OPA stores the entire datasource at once in memory.
Obviously this can be a problem with large external data sets.
Because the centralized server handles both policy and data it can prune data to just that which is needed for the policies.
Recommended usage
Static, Medium-sized data
This approach is more flexible than the JWT and input cases above
because you can include an entirely new data source at the bundle server without changing the authentication service or the OPA-enabled service.
You are also guaranteed that the policy and its corresponding data always arrive at the same time, making the policy-data consistency perfect.
The drawback is that the consistency of the data with the source of truth is worse than the input case and could be better or worse than the consistency for the JWT case (because JWTs only get updated on login).
One feature currently under design is a delta-based bundle protocol, which could improve the data consistency model significantly by lowering the cost of frequent updates.
But as it stands this approach is ideal when the data is relatively static and the data fits into memory.
Push Data
Another way to replicate external data in its entirety into OPA is to use OPA’s API for injecting arbitrary JSON data.
You can build a replicator that pulls information out of the external data source and pushes that information in OPA through its API.
This approach is similar in most respects to the bundle API, except it lets you optimize for update latency and network traffic.
Flow
Three things happen independently with this kind of data replication.
A. OPA-enabled software system asks OPA for policy decisions
B. Data replicator pushes data into OPA
C. Data replicator replicates data from source of truth
Depending on the replication scheme, B and C could be tied together so that every update the data replicator gets from the source of truth it pushes into OPA, but in general those could be decoupled depending on the desired network load, the changes in the data, and so on.
Updates
The total lag between the external data source being updated and OPA being updated is the sum of the lag for an update between the data source and the synchronizer plus the lag for an update between the synchronizer and OPA.
Size limitations
The entirety of the external data source is stored in memory, which can obviously be a problem with large external data sources.
But unlike the bundle API, this approach does allow updates to data.
Recommended usage
Dynamic, Medium-sized data
This approach is very similar to the bundle approach except it updates the data stored in OPA with deltas instead of an entire snapshot at a time.
Because the data is updated as deltas, this approach is well-suited for data that changes frequently.
It assumes the data can fit entirely in memory and so is well-suited to small and medium-sized data sets.
Pull Data during Evaluation
OPA includes functionality for reaching out to external servers during evaluation.
This functionality handles those cases where there is too much data to synchronize into OPA, or policy requires information that must be as up to date as possible.
That functionality is implemented using built-in functions such as http.send.
Current limitations
Credentials needed for the external service can either be hardcoded into policy or pulled from the environment.
The built-in functions do not implement any retry logic.
Flow
The key difference here is that every decision requires contacting the external data source. If that service or the network connection is slow or unavailable, OPA may not be able to return a decision.
OPA-enabled service asks OPA for a decision
During evaluation OPA asks the external data source for additional information
Updates
External data is perfectly fresh.
There is no lag between an update to the external data and when OPA sees that update.
Size limitations
Only the data actually needed by the policy is pulled from the external data source.
There is no need for a replicator to figure out what data the policy will need before execution.
Performance and Availability
Latency and availability of decision-making are dependent on the network.
This approach may still be superior to running OPA on a remote server entirely
because a local OPA can make some decisions without going over the network
those decisions that do not require information from the remote data server.
Recommended usage
Highly Dynamic or Large-sized data
If the data is too large to fit into memory, or it changes too frequently to cache it inside of OPA, the only real option is to fetch the data on demand.
The input approach fetches data on demand as well
but puts the burden on the OPA-enabled service to fetch the necessary data (and to know what data is necessary).
The downside to pulling data on demand is reduced performance and availability because of the network, which can be mitigated via caching.
In the input case, caching is under the control of the OPA-enabled service and can therefore be tailored to fit the properties of the data.
In the http.send case, caching is largely under the control of the remote service that sets HTTP response headers to indicate how long the response can be cached for.
It is crucial in this approach for the OPA-enabled service to handle the case when OPA returns no decision.
Summary
Approach
Perf/Avail
Limitations
Recommended Data
JWT
High
Updates only when user logs back in
User attributes
Input
High
Coupling between service and OPA
Local, dynamic
Bundle
High
Updates to policy/data at the same time. Size an issue.
Static, medium
Push
High
Control data refresh rate. Size an issue.
Dynamic, medium
Evaluation Pull
Dependent on network
Perfectly up to date. No size limit
Dynamic or large
Policy Language
What is Rego?
Rego was inspired by Datalog, which is a well understood, decades old query language.
Rego extends Datalog to support structured document models such as JSON.
Rego queries are assertions on data stored in OPA.
These queries can be used to define policies that enumerate instances of data that violate the expected state of the system.
Why use Rego?
Use Rego for defining policy that is easy to read and write.
Rego focuses on providing powerful support for referencing nested documents and ensuring that queries are correct and unambiguous.
Rego is declarative so policy authors can focus on what queries should return rather than how queries should be executed.
These queries are simpler and more concise than the equivalent in an imperative language.
Like other applications which support declarative query languages, OPA is able to optimize queries to improve performance.
The Basics
The simplest rule is a single expression and is defined in terms of a Scalar Value
Rules define the content of documents.
1
pi := 3.14159
Rules can also be defined in terms of Composite Values
1
rect := {"width": 2, "height": 4}
You can compare two scalar or composite values, and when you do so you are checking if the two values are the same JSON value
1 2
rect := {"width": 2, "height": 4} same if rect == {"height": 4, "width": 2} # true
You can define a new concept using a rule.
1 2
# undefined v if "hello" == "world"
Expressions that refer to undefined values are also undefined. This includes comparisons such as !=.
1 2 3
v if "hello" == "world" # undefined a if v == true # undefined b if v != true # undefined
We can define rules in terms of Variables as well The formal syntax uses the semicolon character ; to separate expressions. Rule bodies can separate expressions with newlines and omit the semicolon
1
t if {x := 42; y := 41; x > y} # true
1 2 3 4 5
t if { x := 42 y := 41 x > y } # true
if is optional.
1 2 3 4 5 6 7 8 9
v { "hello" == "world" } # undefined
t { x := 42 y := 41 x > y } # true
When evaluating rule bodies, OPA searches for variable bindings that make all of the expressions true. There may be multiple sets of bindings that make the rule body true. The rule body can be understood intuitively as
1
expression-1 AND expression-2 AND ... AND expression-N
The rule itself can be understood intuitively as If the value is omitted, it defaults to true.
When Rego values are converted to JSON non-string object keys are marshalled as strings, because JSON does not support non-string object keys.
Sets
In addition to arrays and objects, Rego supports set values. Sets are unordered collections of unique values. Just like other composite values, sets can be defined in terms of scalars, variables, references, and other composite values.
Set documents are collections of values without keys.
OPA represents set documents as arrays when serializing to JSON or other formats that do not support a set data type.
The important distinction between sets and arrays or objects is that sets are unkeyed while arrays and objects are keyed
i.e., you cannot refer to the index of an element within a set.
When comparing sets, the order of elements does not matter
1 2 3
same_set if { {1, 2, 3} == {3, 1, 2} } # true
Because sets share curly-brace syntax with objects, and an empty object is defined with {}, an empty set has to be constructed with a different syntax: set()
1
size := count(set()) # 0
Variables
Variables are another kind of term in Rego. They appear in both the head and body of rules.
Variables appearing in the head of a rule can be thought of as input and output of the rule.
In Rego a variable is simultaneously an input and an output.
If a query supplies a value for a variable, that variable is an input
and if the query does not supply a value for a variable, that variable is an output
q contains name if { some site in sites name := site.name }
query - q[x] In this case, we evaluate q with a variable x (which is not bound to a value). As a result, the query returns all of the values for x and all of the values for q[x], which are always the same because q is a set.
Query - q["dev"] On the other hand, if we evaluate q with an input value for name we can determine whether name exists in the document defined by q:
1
"dev"
Variables appearing in the head of a rule must also appear in a non-negated equality expression within the same rule.
This property ensures that if the rule is evaluated and all of the expressions evaluate to true for some set of variable bindings, the variable in the head of the rule will be defined.
In the reference above, we effectively used variables named i and j to iterate the collections. If the variables are unused outside the reference, we prefer to replace them with an underscore (_) character.
The underscore is special because it cannot be referred to by other parts of the rule, e.g., the other side of the expression, another expression, etc.
The underscore can be thought of as a special iterator. Each time an underscore is specified, a new iterator is instantiated.
Under the hood, OPA translates the _ character to a unique variable name that does not conflict with variables and rules that are in scope.
Composite Keys
References can include Composite Values as keys if the key is being used to refer into a set. Composite keys may not be used in refs for base data documents, they are only valid for references into virtual documents.
This is useful for checking for the presence of composite values within a set, or extracting all values within a set matching some pattern.
Rules are often written in terms of multiple expressions that contain references to documents.
1 2 3 4 5 6 7
apps_and_hostnames[[name, hostname]] { some i, j, k name := apps[i].name server := apps[i].servers[_] sites[j].servers[k].name == server hostname := sites[j].servers[k].hostname }
Several variables appear more than once in the body.
When a variable is used in multiple locations, OPA will only produce documents for the rule with the variable bound to the same value in all expressions.
The rule is joining the apps and sites documents implicitly.
In Rego (and other languages based on Datalog), joins are implicit.
Self-Joins
Using a different key on the same array or object provides the equivalent of self-join in SQL.
1 2 3 4 5 6 7 8 9
same_site[apps[k].name] { some i, j, k apps[i].name == "mysql" server := apps[i].servers[_] server == sites[j].servers[_].name other_server := sites[j].servers[_].name server != other_server other_server == apps[k].servers[_] }
Comprehensions provide a concise way of building Composite Values from sub-queries.
Like Rules, comprehensions consist of a head and a body.
The body of a comprehension can be understood in exactly the same way as the body of a rule
that is, one or more expressions that must all be true in order for the overall body to be true.
When the body evaluates to true, the head of the comprehension is evaluated to produce an element in the result.
The body of a comprehension is able to refer to variables defined in the outer body.
1 2
region := "west" names := [name | sites[i].region == region; name := sites[i].name]
In the above query, the second expression contains an Array Comprehension that refers to the region variable. The region variable will be bound in the outer body.
When a comprehension refers to a variable in an outer body, OPA will reorder expressions in the outer body so that variables referred to in the comprehension are bound by the time the comprehension is evaluated.
Rules define the content of Virtual Documents in OPA. When OPA evaluates a rule, we say OPA generates the content of the document that is defined by the rule.
Rule definitions can be more expressive when using the future keywordscontains and if.
Generating Sets
The following rule defines a set containing the hostnames of all servers
1 2 3
hostnames contains name if { name := sites[_].servers[_].hostname }
First, the rule defines a set document where the contents are defined by the variable name.
We know this rule defines a set document because the head only includes a key.
Second, the sites[_].servers[_].hostname fragment selects the hostname attribute from all of the objects in the servers collection.
From reading the fragment in isolation we cannot tell whether the fragment refers to arrays or objects.
We only know that it refers to a collections of values.
Third, the name := sites[_].servers[_].hostname expression binds the value of the hostname attribute to the variable name, which is also declared in the head of the rule.
All rules have the following form (where key, value, and body are all optional)
1
<name> <key>? <value>? <body>?
Generating Objects
Rules that define objects are very similar to rules that define sets
1 2 3 4 5 6 7
apps_by_hostname[hostname] := app if { some i server := sites[_].servers[_] hostname := server.hostname apps[i].servers[_] == server.name app := apps[i].name }
The rule above defines an object that maps hostnames to app names.
The main difference between this rule and one which defines a set is the rule head: in addition to declaring a key, the rule head also declares a value for the document.
In addition to rules that partially define sets and objects, Rego also supports so-called complete definitions of any type of document. Rules provide a complete definition by omitting the key in the head. Complete definitions are commonly used for constants:
1
pi := 3.14159
Rego allows authors to omit the body of rules. If the body is omitted, it defaults to true.
Documents produced by rules with complete definitions can only have one value at a time. If evaluation produces multiple values for the same document, an error will be returned.
In some cases, having an undefined result for a document is not desirable. In those cases, policies can use the Default Keyword to provide a fallback value.
Note that the (future) keyword if is optional here.
This module defines two complete rules data.play.fruit.apple.seeds + data.play.fruit.orange.color
Variables in Rule Head References
Example
Any term, except the very first, in a rule head’s reference can be a variable. These variables can be assigned within the rule, just as for any other partial rule, to dynamically construct a nested collection of objects.
# A partial object rule that converts a list of users to a mapping by "role" and then "id". users_by_role[role][id] := user if { some user in input.users id := user.id role := user.role }
# Partial rule with an explicit "admin" key override users_by_role.admin[id] := user if { some user in input.admins id := user.id }
# Leaf entries can be partial sets users_by_country[country] contains user.id if { some user in input.users country := user.country }
The first variable declared in a rule head’s reference divides the reference in a leading constant portion and a trailing dynamic portion. Other rules are allowed to overlap with the dynamic portion (dynamic extent) without causing a compile-time conflict.
1 2 3 4 5 6 7 8
# R1 p[x].r := y { x := "q" y := 1 }
# R2 p.q.r := 2
In the above example, rule R2 overlaps with the dynamic portion of rule R1’s reference ([x].r), which is allowed at compile-time, as these rules aren’t guaranteed to produce conflicting output. However, as R1 defines x as "q" and y as 1, a conflict will be reported at evaluation-time.
1
eval_conflict_error: object keys must be unique
1 2 3 4 5 6 7 8
# R1 p[x].r := y if { x := "q" y := 1 }
# R2 p.q.r := 1
output.json
1 2 3 4 5 6 7
{ "p":{ "q":{ "r":1 } } }
Conflicts are detected at compile-time, where possible, between rules even if they are within the dynamic extent of another rule.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
package play
import future.keywords
# R1 p[x].r := y if { x := "foo" y := 1 }
# R2 p.q.r := 2
# R3 p.q.r.s := 3
1
rego_type_error: rule data.play.p.q.r conflicts with [data.play.p.q.r.s]
Above, R2 and R3 are within the dynamic extent of R1, but are in conflict with each other, which is detected at compile-time.
Rules are not allowed to overlap with object values of other rules.
1 2 3 4 5 6 7
# R1 p.q.r := {"s": 1}
# R2 p[x].r.t := 2 { x := "q" }
1
eval_conflict_error: object keys must be unique
In the above example, R1 is within the dynamic extent of R2 and a conflict cannot be detected at compile-time. However, at evaluation-timeR2 will attempt to inject a value under key t in an object value defined by R1. This is a conflict, as rules are not allowed to modify or replace values defined by other rules.
1 2 3 4 5 6 7
# R1 p.q.r.s := 1
# R2 p[x].r.t := 2if { x := "q" }
output.json
1 2 3 4 5 6 7 8 9 10
{ "p":{ "q":{ "r":{ "s":1, "t":2 } } } }
As R1 is now instead defining a value within the dynamic extent of R2’s reference, which is allowed.
Functions
Rego supports user-defined functions that can be called with the same semantics as Built-in Functions. They have access to both the the data Document and the input Document
1 2 3 4 5 6 7 8 9 10
package play
import future.keywords
trim_and_split(s) := x if { t := trim(s, " ") x := split(t, ".") }
r := trim_and_split(" foo.bar ")
output.json
1 2 3 4 5 6
{ "r":[ "foo", "bar" ] }
Note that the (future) keyword if is optional here.
1 2 3 4
trim_and_split(s) := x { t := trim(s, " ") x := split(t, ".") }
Functions may have an arbitrary number of inputs, but exactly one output. Function arguments may be any kind of term.
If you need multiple outputs, write your functions so that the output is an array, object or set containing your results. If the output term is omitted, it is equivalent to having the output term be the literal true. Furthermore, if can be used to write shorter definitions.
1 2 3 4 5 6 7
f(x) { x == "foo" } f(x) if { x == "foo" } f(x) if x == "foo"
f(x) := true { x == "foo" } f(x) := trueif { x == "foo" } f(x) := trueif x == "foo"
The outputs of user functions have some additional limitations, namely that they must resolve to a single value. If you write a function that has multiple possible bindings for an output variable, you will get a conflict error
1 2 3 4 5
p(x) := y if { y := x[_] }
r := p([1, 2, 3])
1
eval_conflict_error: functions must not produce multiple outputs for same inputs
It is possible in Rego to define a function more than once, to achieve a conditional selection of which function to execute: Functions can be defined incrementally.
A given function call will execute all functions that match the signature given.
1 2 3 4 5 6 7 8 9
r(1, x) := y if { y := x }
r(x, 2) := y if { y := x * 4 }
r1 := r(1, 2)
1
eval_conflict_error: functions must not produce multiple outputs for same inputs
On the other hand, if a call matches no functions, then the result is undefined.
1 2 3 4 5
s(x, 2) := y if { y := x * 4 }
r := s(5, 3) # undefined
Function overloading
Rego does not currently support the overloading of functions by the number of parameters. If two function definitions are given with the same function name but different numbers of parameters, a compile-time type error is generated.
1 2 3 4 5 6 7
r(x) := result if { result := 2 * x }
r(x, y) := result if { result := (2 * x) + (3 * y) }
1
rego_type_error: conflicting rules data.play.r found
The error can be avoided by using different function names.
1 2 3 4 5 6 7 8 9
r_1(x) := result if { result := 2 * x }
r_2(x, y) := result if { result := (2 * x) + (3 * y) }
r := [r_1(10), r_2(10, 1)]
output.json
1 2 3 4 5 6
{ "r":[ 20, 23 ] }
In the unusual case that it is critical to use the same name, the function could be made to take the list of parameters as a single array.
1 2 3 4 5 6 7 8 9 10 11
r(params) := result if { count(params) == 1 result := 2 * params[0] }
r(params) := result if { count(params) == 2 result := (2 * params[0]) + (3 * params[1]) }
x := [r([10]), r([10, 1])]
output.json
1 2 3 4 5 6
{ "x": [ 20, 23 ] }
Negation
To generate the content of a Virtual Document, OPA attempts to bind variables in the body of the rule such that all expressions in the rule evaluate to True. This generates the correct result when the expressions represent assertions about what states should exist in the data stored in OPA. In some cases, you want to express that certain states should not exist in the data stored in OPA. In these cases, negation must be used.
For safety, a variable appearing in a negated expression must also appear in another non-negated equality expression in the rule.
OPA will reorder expressions to ensure that negated expressions are evaluated after other non-negated expressions with the same variables.
OPA will reject rules containing negated expressions that do not meet the safety criteria described above.
The simplest use of negation involves only scalar values or variables and is equivalent to complementing the operator
1 2 3 4
t if { greeting := "hello" not greeting == "goodbye" }
output.json
1 2 3
{ "t":true }
Negation is required to check whether some value does not exist in a collection. That is, complementing the operator in an expression such as p[_] == "foo" yields p[_] != "foo". However, this is not equivalent tonot p["foo"].
The most expressive way to state this in Rego is using the every keyword
1 2 3 4 5
no_bitcoin_miners_using_every if { every app in apps { app.name != "bitcoin-miner" } } # true
Variables in Rego are existentially quantified by default
1 2
arr := ["one", "two", "three"] x := i if arr[i] == "three" # 2
Define a rule that finds if there exists a bitcoin-mining app (which is easy using the some keyword). And then you use negation to check that there is NO bitcoin-mining app. Technically, you’re using 2 negations and an existential quantifier, which is logically the same as a universal quantifier.
1 2 3 4 5 6
any_bitcoin_miners if { some app in apps app.name == "bitcoin-miner" } # undefined
no_bitcoin_miners_using_negation if not any_bitcoin_miners # true
1 2 3 4 5
no_bitcoin_miners_using_negation with apps as [{"name": "web"}]
// ---
true
1 2 3 4 5
no_bitcoin_miners_using_negation with apps as [{"name": "bitcoin-miner"}, {"name": "web"}]
// ---
undefined
The undefined result above is expected because we did not define a default value for no_bitcoin_miners_using_negation. Since the body of the rule fails to match, there is no value generated.
1 2 3 4
no_bitcoin_miners if { app := apps[_] app.name != "bitcoin-miner" # THIS IS NOT CORRECT. }
1 2 3 4
no_bitcoin_miners if { some app in apps app.name != "bitcoin-miner" }
The reason the rule is incorrect is that variables in Rego are existentially quantified. This means that rule bodies and queries express FOR ANY and not FOR ALL. To express FOR ALL in Rego complement the logic in the rule body (e.g., != becomes ==) and then complement the check using negation FOR ALL = Not FOR ANY
Alternatively, we can implement the same kind of logic inside a single rule using Comprehensions.
1 2 3 4
no_bitcoin_miners_using_comprehension if { bitcoin_miners := {app | some app in apps; app.name == "bitcoin-miner"} count(bitcoin_miners) == 0 } # true
Whether you use negation, comprehensions, or every to express FOR ALL is up to you. The every keyword should lend itself nicely to a rule formulation that closely follows how requirements are stated, and thus enhances your policy’s readability.
The comprehension version is more concise than the negation variant, and does not require a helper rule while the negation version is more verbose but a bit simpler and allows for more complex ORs.
Modules
In Rego, policies are defined inside modules. Modules consist of:
Exactly one Package declaration.
Zero or more Import statements.
Zero or more Rule definitions.
Modules are typically represented in Unicode text and encoded in UTF-8.
Comments
Comments begin with the # character and continue until the end of the line.
Packages
Packages group the rules defined in one or more modules into a particular namespace.
Because rules are namespaced they can be safely shared across projects.
Modules contributing to the same package do not have to be located in the same directory.
The rules defined in a module are automatically exported.
That is, they can be queried under OPA’s Data API provided the appropriate package is given.
1 2 3
package opa.examples
pi := 3.14159
The pi document can be queried via the Data API
1
GET https://example.com/v1/data/opa/examples/pi HTTP/1.1
Valid package names are variables or references that only contain string operands. valid package names
# allow alice to perform any operation. allow if user == "alice"
# allow bob to perform read-only operations. allow if { user == "bob" method == "GET" }
# allows users assigned a "dev" role to perform read-only operations. allow if { method == "GET" input.user in data.roles["dev"] }
# allows user catherine access on Saturday and Sunday allow if { user == "catherine" day := time.weekday(time.now_ns()) day in ["Saturday", "Sunday"] }
input.json
1 2 3 4
{ "user":"alice", "method":"GET" }
Imports can include an optional as keyword to handle namespacing issues
1 2 3 4 5 6 7 8 9
package opa.examples import future.keywords
import data.servers as my_servers
http_servers contains server if { some server in my_servers "http" in server.protocols }
Future Keywords
To ensure backwards-compatibility, new keywords (like every) are introduced slowly. In the first stage, users can opt-in to using the new keywords via a special import
import future.keywords introduces all future keywords, and
import future.keywords.xonly introduces the x keyword – see below for all known future keywords.
Using import future.keywords to import all future keywords means an opt-out of a safety measure:
With a new version of OPA, the set of all future keywords can grow, and policies that worked with the previous version of OPA stop working.
This cannot happen when you selectively import the future keywords as you need them.
At some point in the future, the keyword will become standard, and the import will become a no-op that can safely be removed.
Note that some future keyword imports have consequences on pretty-printing If contains or if are imported, the pretty-printer will use them as applicable when formatting the modules.
in
More expressive membership and existential quantification keyword
1 2 3 4 5 6 7
deny { some x in input.roles # iteration x == "denylisted-role" } deny { "denylisted-role" in input.roles # membership check }
in was introduced in v0.34.0.
every
Expressive universal quantification keyword
1 2 3 4 5 6 7
allowed := {"customer", "admin"}
allow { every role in input.roles { role.name in allowed } }
There is no need to also import future.keywords.in, that is implied by importing future.keywords.every.
every was introduced in v0.38.0.
if
This keyword allows more expressive rule heads
1
deny if input.token != "secret"
if was introduced in v0.42.0.
contains
This keyword allows more expressive rule heads for partial set rules
1
deny contains msg { msg := "forbidden" }
contains was introduced in v0.42.0.
Some Keyword
The some keyword allows queries to explicitly declare local variables. Use the some keyword in rules that contain unification statements or references with variable operands if variables contained in those statements are not declared using := .
Statement
Example
Variables
Unification
input.a = [["b", x], [y, "c"]]
x and y
Reference with variable operands
data.foo[i].bar[j]
i and j
For example, the following rule generates tuples of array indices for servers in the “west” region that contain “db” in their name.
1 2 3 4 5 6
tuples contains [i, j] if { some i, j sites[i].region == "west" server := sites[i].servers[j] # note: 'server' is local because it's declared with := contains(server.name, "db") }
output.json
1 2 3 4 5 6 7 8 9 10 11 12
... "tuples":[ [ 1, 2 ], [ 2, 1 ] ] ...
Since we have declared i, j, and server to be local, we can introduce rules in the same package without affecting the result above:
1 2
# Define a rule called 'i' i := 1
If we had not declared i with the some keyword, introducing the i rule above would have changed the result of tuples because the i symbol in the body would capture the global value. removing some i, j
1 2 3 4 5 6 7 8 9
# Define a rule called 'i' i := 1
tuples contains [i, j] if { # some i, j sites[i].region == "west" server := sites[i].servers[j] # note: 'server' is local because it's declared with := contains(server.name, "db") }
output.json
1 2 3 4 5 6 7 8
... "tuples":[ [ 1, 2 ] ] ...
The some keyword is not required but it’s recommended to avoid situations like the one above where introduction of a rule inside a package could change behaviour of other rules.
Every Keyword
1 2 3 4 5 6 7 8
names_with_dev if { some site in sites site.name == "dev"
every server in site.servers { endswith(server.name, "-dev") } } # true
The every keyword takes an (optional) key argument, a value argument, a domain, and a block of further queries, its “body”.
The keyword is used to explicitly assert that its body is true for any element in the domain.
It will iterate over the domain, bind its variables, and check that the body holds for those bindings.
If one of the bindings does not yield a successful evaluation of the body, the overall statement is undefined.
If the domain is empty, the overall statement is true.
Evaluating every does not introduce new bindings into the rule evaluation.
Used with a key argument, the index, or property name (for objects), comes into the scope of the body evaluation
1 2 3 4 5 6 7 8 9 10 11 12 13 14
array_domain if { every i, x in [1, 2, 3] { x - i == 1 } # array domain } # true
object_domain if { every k, v in {"foo": "bar", "fox": "baz"} { # object domain startswith(k, "f") startswith(v, "b") } } # true
set_domain if { every x in {1, 2, 3} { x != 4 } # set domain } # true
Semantically, every x in xs { p(x) } is equivalent to, but shorter than, a “not-some-not” construct using a helper rule
1 2 3 4 5 6 7 8 9 10 11 12 13 14
xs := [2, 2, 4, 8]
larger_than_one(x) := x > 1
rule_every if { every x in xs { larger_than_one(x) } }
less_or_equal_one if { some x in xs not larger_than_one(x) }
Negating every is forbidden. If you desire to express not every x in xs { p(x) } please use some x in xs; not p(x) instead.
With Keyword
The with keyword allows queries to programmatically specify values nested under the input Document or the data Document, or built-in functions.
For example, given the simple authorization policy in the Imports section, we can write a query that checks whether a particular request would be allowed
1 2 3 4 5
allow with input as {"user": "alice", "method": "POST"}
// ---
true
1 2 3 4 5
allow with input as {"user": "bob", "method": "GET"}
// ---
true
1 2 3 4 5
not allow with input as {"user": "bob", "method": "DELETE"}
// ---
true
1 2 3 4 5
allow with input as {"user": "charlie", "method": "GET"} with data.roles as {"dev": ["charlie"]}
// ---
true
1 2 3 4 5
not allow with input as {"user": "charlie", "method": "GET"} with data.roles as {"dev": ["bob"]}
// ---
true
1 2 3 4 5 6 7
allow with input as {"user": "catherine", "method": "GET"} with data.roles as {"dev": ["bob"]} with time.weekday as "Sunday" // ---
true
The with keyword acts as a modifier on expressions. A single expression is allowed to have zero or morewith modifiers. The with keyword has the following syntax The <target>s must be references to values in the input document (or the input document itself) or data document, or references to functions (built-in or not).
1
<expr> with <target-1> as <value-1> [with <target-2> as <value-2> [...]]
When applied to the data document, the <target> must not attempt to partially define virtual documents. For example, given a virtual document at path data.foo.bar, the compiler will generate an error if the policy attempts to replace data.foo.bar.baz.
The with keyword only affects the attached expression. Subsequent expressions will see the unmodified value.
1 2 3 4 5 6 7 8 9 10 11 12 13
inner := [x, y] if { # {"foo": 100, "bar": 300} x := input.foo y := input.bar }
middle := [a, b] if { # {"foo": 200, "bar": 300} a := inner with input.foo as 100 b := input # foo still as 200 }
outer := result if { result := middle with input as {"foo": 200, "bar": 300} }
When <target> is a reference to a function, like http.send, then its <value> can be any of the following:
a value: with http.send as {"body": {"success": true }}
a reference to another function: with http.send as mock_http_send
a reference to another (possibly custom) built-in function: with custom_builtin as less_strict_custom_builtin
a reference to a rule that will be used as the value.
When the replacement value is a function, its arity needs to match the replaced function’s arity; and the types must be compatible.
1 2 3 4 5 6 7
f(x) := count(x)
mock_count(x) := 0 if "x" in x mock_count(x) := count(x) if not "x" in x
f([1, 2, 3]) with count as mock_count # 3 f(["x", "y", "z"]) with count as mock_count # 0
Default Keyword
The default keyword allows policies to define a default value for documents produced by rules with Complete Definitions. The default value is used when all of the rules sharing the same name are undefined.
1 2 3 4 5 6 7 8
default allow := false # false
allow if { input.user == "bob" input.method == "GET" }
allow if input.user == "alice"
When the allow document is queried, the return value will be either true or false Without the default definition, the allow document would simply be undefined for the same input.
When the default keyword is used, the rule syntax is restricted to
1
default <name> := <term>
The term may be any scalar, composite, or comprehension value but it may not be a variable or reference. If the value is a composite then it may not contain variables or references. Comprehensions however may, as the result of a comprehension is never undefined.
Similar to rules, the default keyword can be applied to functions as well.
1 2 3 4 5 6 7 8 9 10 11
package play
import future.keywords
default clamp_positive(_) := 0
clamp_positive(x) = x if { x > 0 }
r := clamp_positive(-1)
When clamp_positive is queried, the return value will be either the argument provided to the function or 0.
output.json
1 2 3
{ "r":0 }
The value of a default function follows the same conditions as that of a default rule. In addition, a default function satisfies the following properties:
same arity as other functions with the same name
arguments should only be plain variables ie. no composite values
argument names should not be repeated
Else Keyword
The else keyword is a basic control flow construct that gives you control over rule evaluation order. Rules grouped together with the else keyword are evaluated until a match is found. Once a match is found, rule evaluation does not proceed to rules further in the chain.
The else keyword is useful if you are porting policies into Rego from an order-sensitive system like IPTables.
1 2 3 4 5 6
authorize := "allow" if { input.user == "superuser" # allow 'superuser' to perform any operation. } else := "deny" if { input.path[0] == "admin" # disallow 'admin' operations... input.source_network == "external" # from external networks. } # ... more rules
In the example below, evaluation stops immediately after the first rule even though the input matches the second rule as well.
The else keyword may be used repeatedly on the same rule and there is no limit imposed on the number of else clauses on a rule.
Operators
Membership and iteration: in
Membership and iteration
The membership operator in lets you check if an element is part of a collection (array, set, or object). It always evaluates to true or false
1 2 3 4 5
p := [x, y, z] if { x := 3 in [1, 2, 3] # array y := 3 in {1, 2, 3} # set z := 3 in {"foo": 1, "bar": 3} # object }
1 2 3 4 5 6 7
{ "p": [ true, true, true ] }
When providing two arguments on the left-hand side of the in operator, and an object or an array on the right-hand side, the first argument is taken to be the key (object) or index (array), respectively
1 2 3 4
p := [x, y] if { x := "foo", "bar" in {"foo": "bar"} # key, val with object y := 2, "baz" in ["foo", "bar", "baz"] # key, val with array }
output.json
1 2 3 4 5 6
{ "p":[ true, true ] }
that in list contexts, like set or array definitions and function arguments, parentheses are required to use the form with two left-hand side arguments
1 2 3 4 5 6 7
p := x if { x := { 0, 2 in [2] } }
q := x if { x := { (0, 2 in [2]) } }
output.json
1 2 3 4 5 6 7 8 9
{ "p":[ true, 0 ], "q":[ true ] }
1 2 3 4 5 6 7 8 9 10
g(x) := sprintf("one function argument: %v", [x]) f(x, y) := sprintf("two function arguments: %v, %v", [x, y])
w := x if { x := g((0, 2 in [2])) }
z := x if { x := f(0, 2 in [2]) }
output.json
1 2 3 4
{ "w":"one function argument: true", "z":"two function arguments: 0, true" }
Combined with not, the operator can be handy when asserting that an element is not member of an array
1 2 3 4 5
deny if not "admin" in input.user.roles
test_deny if { deny with input.user.roles as ["operator", "user"] }
output.json
1 2 3
{ "test_deny":true }
that expressions using the in operator always return true or false, even when called in non-collection arguments
1 2 3
q := x if { x := 3 in "three" } # false
Using the some variant, it can be used to introduce new variables based on a collections’ items
Any argument to the some variant can be a composite, non-ground value
1 2 3 4 5 6 7
p[x] = y if { some x, {"foo": y} in [{"foo": 100}, {"bar": 200}] }
p[x] = y if { some {"bar": x}, {"foo": y} in {{"bar": "b"}: {"foo": "f"}} }
output.json
1 2 3 4 5 6
{ "p":{ "0":100, "b":"f" } }
Equality: Assignment, Comparison, and Unification
Rego supports three kinds of equality: assignment (:=), comparison (==), and unification=. We recommend using assignment (:=) and comparison (==) whenever possible for policies that are easier to read and write.
Assignment :=
The assignment operator (:=) is used to assign values to variables. Variables assigned inside a rule are locally scoped to that rule and shadow global variables.
1 2 3 4 5 6
x := 100
p if { x := 1 # declare local variable 'x' and assign value 1 x != 100 # true because 'x' refers to local variable } # true
Assigned variables are not allowed to appear before the assignment in the query.
1 2 3 4 5 6 7 8 9
p if { x != 100 x := 1 # error because x appears earlier in the query. } # rego_compile_error: var x referenced above
q if { x := 1 x := 2 # error because x is assigned twice. } # rego_compile_error: var x assigned above
A simple form of destructuring can be used to unpack values from arrays and assign them to variables
in_london if { [_, _, city, country] := address city == "London" country == "England" } # true
Comparison ==
Comparison checks if two values are equal within a rule. If the left or right hand side contains a variable that has not been assigned a value, the compiler throws an error.
1 2 3 4
p if { x := 100 x == 100 # true because x refers to the local variable } # true
1 2 3 4
y := 100 q if { y == 100 # true because y refers to the global variable } # true
1 2 3
r if { z == 100 # compiler error because z has not been assigned a value } # rego_unsafe_var_error: var z is unsafe
Unification =
Unification (=) combines assignment and comparison. Rego will assign variables to values that make the comparison true. Unification lets you ask for values for variables that make an expression true.
1 2 3
p[x] = y if { [x, "world"] = ["hello", y] }
output.json
1 2 3 4 5
{ "p":{ "hello":"world" } }
As opposed to when assignment (:=) is used, the order of expressions in a rule does not affect the document’s content.
1 2 3 4 5
s if { x > y y = 41 x = 42 } # true
Best Practices for Equality
Here is a comparison of the three forms of equality.
Equality
Applicable
Compiler Errors
Use Case
:=
Everywhere
Var already assigned
Assign variable
==
Everywhere
Var not assigned
Compare values
=
Everywhere
Values cannot be computed
Express query
Best practice is to use assignment:= and comparison== wherever possible. The additional compiler checks help avoid errors when writing policy, and the additional syntax helps make the intent clearer when reading policy.
Under the hood := and == are syntactic sugar for =, local variable creation, and additional compiler checks.
Comparison Operators
The following comparison operators are supported:
1 2 3 4 5 6
a == b # `a` is equal to `b`. a != b # `a` is not equal to `b`. a < b # `a` is less than `b`. a <= b # `a` is less than or equal to `b`. a > b # `a` is greater than `b`. a >= b # `a` is greater than or equal to `b`.
None of these operators bind variables contained in the expression. As a result, if either operand is a variable, the variable must appear in another expression in the same rule that would cause the variable to be bound, i.e., an equality expression or the target position of a built-in function.
Built-in Functions
Built-ins can be easily recognized by their syntax. All built-ins have the following form Built-ins usually take one or more input values and produce one output value. Unless stated otherwise, all built-ins accept values or variables as output arguments.
1
<name>(<arg-1>, <arg-2>, ..., <arg-n>)
If a built-in function is invoked with a variable as input, the variable must be safe, i.e., it must be assigned elsewhere in the query.
Built-ins can include . characters in the name. This allows them to be namespaced. If you are adding custom built-ins to OPA, consider namespacing them to avoid naming conflicts, e.g., org.example.special_func.
Errors
By default, built-in function calls that encounter runtime errors evaluate to undefined (which can usually be treated as false) and do not halt policy evaluation. This ensures that built-in functions can be called with invalid inputs without causing the entire policy to stop evaluating.
In most cases, policies do not have to implement any kind of error handling logic. If error handling is required, the built-in function call can be negated to test for undefined.
reason contains "invalid JWT supplied as input" if { not io.jwt.decode(input.token) }
input.json
1 2 3
{ "token":"a poorly formatted token" }
output.json
1 2 3 4 5
{ "reason":[ "invalid JWT supplied as input" ] }
If you wish to disable this behaviour and instead have built-in function call errors treated as exceptions that halt policy evaluation enable strict built-in errors in the caller
Metadata
The package and individual rules in a module can be annotated with a rich set of metadata.
1 2 3 4 5 6 7 8 9
# METADATA # title: My rule # description: A rule that determines if x is allowed. # authors: # - John Doe <[email protected]> # entrypoint: true allow { ... }
Annotations are grouped within a metadata block, and must be specified as YAML within a comment block that must start with # METADATA. Also, every line in the comment block containing the annotation must start at Column 1 in the module/file, or otherwise, they will be ignored.
Strict Mode
The Rego compiler supports strict mode, where additional constraints and safety checks are enforced during compilation.
Compiler rules that will be enforced by future versions of OPA, but will be a breaking change once introduced, are incubated in strict mode.
This creates an opportunity for users to verify that their policies are compatible with the next version of OPA before upgrading.
Compiler Strict mode is supported by the check command, and can be enabled through the -S flag.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
$ opa help check Check Rego source files for parse and compilation errors.
If the 'check' command succeeds in parsing and compiling the source file(s), no output is produced. If the parsing or compiling fails, 'check' will output the errors and exit with a non-zero exit code.
Usage: opa check <path> [path [...]] [flags]
Flags: -b, --bundle load paths as bundle files or root directories --capabilities string set capabilities version or capabilities.json file path -f, --format {pretty,json} set output format (default pretty) -h, --help help for check --ignore strings set file and directory names to ignore during loading (e.g., '.*' excludes hidden files) -m, --max-errors int set the number of errors to allow before compilation fails early (default 10) -s, --schema string set schema file path or directory path -S, --strict enable compiler strict mode
Policy Testing
To help you verify the correctness of your policies, OPA also gives you a framework that you can use to write tests for your policies. By writing tests for your policies you can speed up the development process of new rules and reduce the amount of time it takes to modify rules as requirements evolve.
Getting Started
The file below implements a simple policy that allows new users to be created and users to access their own profile.
example.rego
1 2 3 4 5 6 7 8 9 10 11 12
package authz import future.keywords
allow if { input.path == ["users"] input.method == "POST" }
Tests are expressed as standard Rego rules with a convention that the rule name is prefixed with test_.
1 2 3 4 5 6
package mypackage import future.keywords
test_some_descriptive_name if { # test logic }
Test Discovery
The opa test subcommand runs all of the tests (i.e., rules prefixed with test_) found in Rego files passed on the command line.
If directories are passed as command line arguments, opa test will load their file contents recursively.
Specifying Tests to Run
The opa test subcommand supports a --run/-rregex option to further specify which of the discovered tests should be evaluated. The option supports re2 syntax
allow if { input.path == ["users"] input.method == "POST" }
allow if { input.path == ["users", input.user_id] input.method == "GET" } # This test will pass. test_ok if true # This test will fail. test_failure if 1 == 2 # This test will error. test_error if 1 / 0 # This test will be skipped. todo_test_missing_implementation if { allow with data.roles as ["not", "implemented"] }
By default, OPA prints the test results in a human-readable format. If you need to consume the test results programmatically, use the JSON output format.
test_allow_with_data if { allow with input as {"user": "alice", "role": "admin"} with data.policies as policies # [{"name": "test_policy"}] with data.roles as roles # {"admin": ["alice"]} }
In simple cases, a function can also be replaced with a value, as in Every invocation of the function will then return the replacement value, regardless of the function’s arguments.
authz_test.rego
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
package authz import future.keywords
mock_decode_verify("my-jwt", _) := [true, {}, {}] mock_decode_verify(x, _) := [false, {}, {}] if x != "my-jwt"
test_allow if { allow with input.headers["x-token"] as "my-jwt" with data.jwks.cert as "mock-cert" with io.jwt.decode_verify as mock_decode_verify }
test_allow_value if { allow with input.headers["x-token"] as "my-jwt" with data.jwks.cert as "mock-cert" with io.jwt.decode_verify as [true, {}, {}] }
In addition to reporting pass, fail, and error results for tests, opa test can also report coverage for the policies under test.
The coverage report includes all of the lines evaluated and not evaluated in the Rego files provided on the command line. When a line is not covered it indicates one of two things
If the line refers to the head of a rule, the body of the rule was never true.
If the line refers to an expression in a rule, the expression was never evaluated.
It is also possible that rule indexing has determined some path unnecessary for evaluation, thereby affecting the lines reported as covered.
If we run the coverage report on the original example.rego file without test_get_user_allowed from example_test.rego.
OPA exposes domain-agnostic APIs that your service can call to manage and enforce policies.
When integrating with OPA there are two interfaces to consider
Evaluation
OPA’s interface for asking for policy decisions.
Integrating OPA is primarily focused on integrating an application, service, or tool with OPA’s policy evaluation interface.
This integration results in policy decisions being decoupled from that application, service, or tool.
Management
OPA’s interface for deploying policies, understanding status, uploading logs, and so on.
This integration is typically the same across all OPA instances, regardless what software the evaluation interface is integrated with.
Distributing policy, retrieving status, and storing logs in the same way across all OPAs provides a unified management plane for policy across many different software systems.
This page focuses predominantly on different ways to integrate with OPA’s policy evaluation interface and how they compare
Management Interface
Desc
Bundle API
distributing policy and data to OPA
Status API
collecting status reports on bundle activation and agent health
Decision Log API
collecting a log of policy decisions made by agents
Health API
checking agent deployment readiness and health
Prometheus API endpoint
obtain insight into performance and errors
Evaluating Policies
OPA supports different ways to evaluate policies.
The REST API returns decisions as JSON over HTTP.
The Go API (GoDoc) returns decisions as simple Go types (bool, string, map[string]interface{}, etc.)
WebAssembly compiles Rego policies into Wasm instructions so they can be embedded and evaluated by any WebAssembly runtime.
Custom compilers and evaluators may be written to parse evaluation plans in the low-level Intermediate Representation format, which can be emitted by the opa build command
The SDK provides high-level APIs for obtaining the output of query evaluation as simple Go types (bool, string, map[string]interface{}, etc.)
Integrating with the REST API
To integrate with OPA outside of Go, we recommend you deploy OPA as a host-level daemon or sidecar container.
When your application or service needs to make policy decisions it can query OPA locally via HTTP.
Running OPA locally on the same host as your application or service helps ensure policy decisions are fast and highly-available.
Named Policy Decisions
Use the Data API to query OPA for named policy decisions
1 2
POST /v1/data/<path> Content-Type: application/json
1 2 3
{ "input": <the input document> }
The <path> in the HTTP request identifies the policy decision to ask for. In OPA, every rule generates a policy decision.
In the example below there are two decisions: example/authz/allow and example/authz/is_admin.
You can request specific decisions by querying for <package path>/<rule name>. For example to request the allow decision execute the following HTTP request
1 2
POST /v1/data/example/authz/allow Content-Type: application/json
1 2 3
{ "input": <the input document> }
The body of the request specifies the value of the input document to use during policy evaluation.
OPA returns an HTTP 200 response code if the policy was evaluated successfully. Non-HTTP 200 response codes indicate configuration or runtime errors. The policy decision is contained in the "result" key of the response message body.
If the requested policy decision is undefined OPA returns an HTTP 200 response without the "result" key. For example, the following request for is_admin is undefined because there is no default value for is_admin and the input does not satisfy the is_admin rule body:
The SDK package contains high-level APIs for embedding OPA inside of Go programs and obtaining the output of query evaluation.
A typical workflow when using the sdk package would involve first creating a new sdk.OPA object by calling sdk.New and then invoking its Decision method to fetch the policy decision. The sdk.New call takes the sdk.Options object as an input which allows specifying the OPA configuration, console logger, plugins, etc.
Use the low-levelgithub.com/open-policy-agent/opa/rego package to embed OPA as a library inside services written in Go, when only policy evaluation and no other capabilities of OPA, like the management features — are desired.
If you’re unsure which one to use, the SDK is probably the better option.
1
import"github.com/open-policy-agent/opa/rego"
The rego package exposes different options for customizing how policies are evaluated. Through the rego package you can supply policies and data, enable metrics and tracing, toggle optimizations, etc. In most cases you will
Use the rego package to construct a prepared query.
Execute the prepared query to produce policy decisions.
Interpret and enforce the policy decisions.
Preparing queries in advance avoids parsing and compiling the policies on each query and improves performance considerably. Prepared queries are safe to share across multiple Go routines.
Comparison
A comparison of the different integration choices are summarized below.
Dimension
REST API - sidecar
Go Lib
Wasm
Evaluation
Fast
Faster
Fastest
Language
Any
Only Go
Any with Wasm
Operations
Update just OPA
Update entire service
Update service rarely
Security
Must secure API
Enable only what is needed
Enable only what is needed
REST API
Integrating OPA via the REST API is the most common, at the time of writing.
OPA is most often deployed either as a sidecar or less commonly as an external service.
Operationally this makes it easy to upgrade OPA and to configure it to use its management services (bundles, status, decision logs, etc.).
Because it is a separate process it requires monitoring and logging (though this happens automatically for any sidecar-aware environment like Kubernetes).
OPA’s configuration and APIs must be secured according to the security guide.
Go API
Integrating OPA via the Go API only works for Go software.
Updates to OPA require re-vendoring and re-deploying the software.
Evaluation has less overhead than the REST API because all the communication happens in the same operating-system process.
All of the management functionality (bundles, decision logs, etc.) must be either enabled or implemented.
Security concerns are limited to those management features that are enabled or implemented.
Wasm
Wasm policies are embeddable in any programming language that has a Wasm runtime.
Evaluation has less overhead than the REST API (because it is evaluated in the same operating-system process)
and should outperform the Go API (because the policies have been compiled to a lower-level instruction set).
Each programming language will need its own SDKs that implement the management functionality and the evaluation interface.
Typically new OPA language features will not require updating the service since neither the Wasm runtime nor the SDKs will be impacted.
Updating the SDKs will require re-deploying the service.
Security is analogous to the Go API integration: it is mainly the management functionality that presents security risks.