Important
This documentation is a work in progress, and while the hope is that it provides an adequate reference of the functionality the Roto engine provides, it does not teach you the language. For the time being, look at the See Also section for examples to learn from.
This reference documents the iocaine 2.5.1 release. Some of the functionality is not available in earlier versions.
Overview
Starting with iocaine 2.2.0, it is possible to configure custom request handlers in multiple languages. The first and most performant one is Roto, a statically typed, compiled language. It is fast and efficient, but also more limited than the other options, and certainly less familiar.
The main entry point of the script is a decide function, which takes a single argument, the incoming request, and returns a verdict with the Outcome. A minimal implementation that accepts everything to serve it garbage (mimicking the default behaviour without a custom request handler) looks like this:
function decide(request: Request) -> Verdict[Outcome, Outcome] {
    accept Outcome.garbage()
}
It is also possible to run a function once, on startup, for example to pre-compile patterns and regexes, so that this work does not need to be performed for every request. To facilitate this, iocaine looks for an init function:
function init() -> Verdict[Unit, String] {
    accept
}
As Roto does not support global variables, anything you compile at init() time and want to keep needs to be put into one of the variables the Roto engine makes available for scripts - see below!
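As a minimal sketch of how the two entry points fit together, the following script compiles a pattern set once in init() and consults it in decide(). It uses only functions documented further down in this reference; the key name "bad-agents" and the patterns themselves are made up for illustration.

```roto
# Runs once at startup: compile the patterns and store them under a key.
function init() -> Verdict[Unit, String] {
    let agents = MutableStringList.new();
    agents.push("ExampleBot").push("ExampleCrawler"); # hypothetical patterns
    iocaine_patterns.insert_patterns("bad-agents", agents);
    accept
}

# Runs for every request: look the finder up by key and use it.
function decide(request: Request) -> Verdict[Outcome, Outcome] {
    if iocaine_patterns.get("bad-agents").is_match(request.header("user-agent")) {
        accept Outcome.garbage()
    }
    reject Outcome.not_for_us()
}
```

The expensive step (building the finder) happens in init(), so decide() only pays for a map lookup and the match itself.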
Types and variables available in the runtime
There are a number of types and variables exposed in the request handler runtime, largely matching the core features provided by all engines. These are documented below.
Request
Each request iocaine handles will pass through the decide function, which takes a single parameter, of the Request type. This is a read-only representation of the incoming request, and has the following methods:
- request.method() returns the HTTP method used for the request.
- request.path() returns the request path, without the leading /.
- request.header(<name>) returns the value of the <name> HTTP header, if it exists, or an empty string if it doesn’t. The name is matched case insensitively.
- request.cookie(<name>) returns the value of the <name> cookie, if set, or an empty string if it is not. (Available since iocaine 2.3.0)
- request.query(<name>) returns the value of the <name> query parameter if set, or an empty string if it is not. (Available since iocaine 2.3.0)
Additionally, for logging purposes, the Request type can convert its headers, cookies, and query parameters into metadata using the request.headers_into_metadata(<metadata>, <prefix>), request.cookies_into_metadata(<metadata>, <prefix>), and request.queries_into_metadata(<metadata>, <prefix>) functions. When inserting into the metadata map, each header, cookie, or query name will be prefixed by <prefix>.
if request.query("log-me") == "yes" {
    request.headers_into_metadata(metadata, "request.header.");
}
The above snippet will add request headers to the metadata map, each header prefixed with request.header.. As such, the user-agent header will, in this case, end up as request.header.user-agent.
Outcome
An Outcome - combined with the verdict whether to accept or reject a request - is the result of the decision the request handler makes, and is used as the return value of the decide() function. It can be Outcome.garbage(), Outcome.challenge(), or Outcome.not_for_us().
The former two are to be used with an accept verdict, while the latter with reject. Using accept with Outcome.not_for_us(), or reject with anything other than that results in an internal server error. Don’t do that.
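The valid pairings can be sketched in a single decide() function. The rules below are hypothetical and only serve to show each outcome next to the verdict it belongs with; the query parameter name "trap" is an invented example.

```roto
function decide(request: Request) -> Verdict[Outcome, Outcome] {
    # Hypothetical rule: a request without a user-agent gets the challenge.
    if request.header("user-agent") == "" {
        accept Outcome.challenge()
    }
    # Hypothetical rule: a marker query parameter gets garbage.
    if request.query("trap") == "yes" {
        accept Outcome.garbage()
    }
    # Everything else is rejected, paired with the only outcome valid for reject.
    reject Outcome.not_for_us()
}
```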
Metadata
The Metadata type is a key-value map, primarily used during logging. This makes it possible to attach arbitrary key-value pairs to log messages.
To construct them, two functions are available:
- Metadata.new() creates an empty map.
- Metadata.with(<key>, <value>) creates a new map, with a single key-value pair already inserted.
To add new keys to the map, use metadata.with(<key>, <value>) - which will return the metadata instance itself, so it can be chained. However, chaining will eventually exhaust the stack, so only do that if you have a handful of pairs to add.
Should you wish to retrieve the value of a key, metadata.get(<key>) will let you do that. It returns the value if the key is found, or an empty string if it is not.
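A short sketch of building a map and reading from it, meant to be placed inside a function body. The keys and values are placeholders, not names iocaine itself uses.

```roto
# Construct a map with one pair, then chain a second pair onto it.
let metadata = Metadata.with("ruleset", "example").with("matched", "user-agent");

# Returns the value stored under the key.
let ruleset = metadata.get("ruleset");

# The key was never inserted, so this returns an empty string.
let missing = metadata.get("no-such-key");
```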
MutableStringList
Because Roto does not support lists, nor variadic arguments, if any function needs to receive a list of arguments, those arguments need to be constructed first, using a specific type of their own. For a list of strings, that’s MutableStringList. It is not advised to use these outside of init()-time setup.
For construction purposes, the following functions can be used:
- MutableStringList.new() creates a new, mutable string list, and returns it.
- list.push(<string>) pushes the given <string> into the list, and returns the list itself. This allows chaining pushes, but be aware that doing so will eventually exhaust the stack, so only chain when there are just a few pushes.
let list = MutableStringList.new();
list.push("foo").push("bar");
It is also possible to join the elements of a string list with a separator: list.join(<separator>) will do just that:
let list = MutableStringList.new();
list.push("foo").push("bar");
let result = list.join(", "); # result == "foo, bar"
iocaine_patterns
To match substrings effectively, the Roto engine provides access to the iocaine_patterns variable. This - under the hood - is a name-value map, allowing you to collect patterns into a “variable”: Roto does not support global variables, so to be able to refer to things we compiled at init() time, we have to put them into an iocaine-provided variable, such as iocaine_patterns.
Under the hood, each key has an associated PatternFinder, which does the bulk of the work, the actual matching. To insert finders into the map, you can use either of these two functions:
- iocaine_patterns.insert(<key>, <pattern-finder>) will insert a PatternFinder instance directly.
- iocaine_patterns.insert_patterns(<key>, <string-list>) will turn the MutableStringList into a PatternFinder, and then insert it.
Which one to use depends on the circumstances. These are generally run at init() time, and the performance is the same, whether you turn a string list into a finder, or iocaine does so under the hood. Use whichever feels more natural.
To retrieve a key, one should use iocaine_patterns.get(<key>). If a key is not found, this returns a finder that will always fail.
PatternFinder
A PatternFinder is the AhoCorasick instance that does the actual pattern matching. An instance of it can be directly constructed from a MutableStringList with PatternFinder.new(<string-list>), or indirectly through iocaine_patterns.insert_patterns(). Once constructed, or retrieved via iocaine_patterns.get(), it provides one method only:
- finder.is_match(<string>) returns true if <string> matches any of the patterns in the finder, false otherwise.
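For illustration, the direct route (constructing the finder yourself, then inserting it) might look like the sketch below. The key "suspicious-paths" and the patterns are hypothetical.

```roto
function init() -> Verdict[Unit, String] {
    let paths = MutableStringList.new();
    paths.push("wp-login").push(".env"); # hypothetical patterns
    # Build the finder explicitly, then store it under a key.
    let finder = PatternFinder.new(paths);
    iocaine_patterns.insert("suspicious-paths", finder);
    accept
}

function decide(request: Request) -> Verdict[Outcome, Outcome] {
    if iocaine_patterns.get("suspicious-paths").is_match(request.path()) {
        accept Outcome.garbage()
    }
    reject Outcome.not_for_us()
}
```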
iocaine_regexes
For regular expression matching, the Roto engine provides access to the iocaine_regexes variable. This - under the hood - is a name-value map, allowing you to collect compiled regexes into a “variable”: Roto does not support global variables, so to be able to refer to things we compiled at init() time, we have to put them into an iocaine-provided variable, such as iocaine_regexes.
Note that this variable allows inserting a single regexp into the map! If you wish to match multiple, possibly overlapping regexes against the same string, use iocaine_regexsets instead, which is much more efficient. This variable should only be used if you want to extract capture groups, or there’s only a single expression in the set.
Under the hood, each key has an associated RegexFinder, which does the bulk of the work, the actual matching, and capture group extraction, if need be. To insert finders into the map, you can use either of these two functions:
- iocaine_regexes.insert(<key>, <regex-finder>) will insert a RegexFinder instance directly.
- iocaine_regexes.insert_regex(<key>, <string>) will compile the expression in <string> into a RegexFinder, and then insert that.
Which one to use depends on the circumstances. These are generally run at init() time, and the performance is the same, whether you compile the expression into a finder yourself, or iocaine does so under the hood. Use whichever feels more natural.
To retrieve a key, one should use iocaine_regexes.get(<key>). If a key is not found, this returns a finder that will always fail.
RegexFinder
A RegexFinder is the Regex instance that does the actual pattern matching. An instance of it can be directly constructed from a string with RegexFinder.new(<string>), or indirectly through iocaine_regexes.insert_regex(). Once constructed, or retrieved via iocaine_regexes.get(), it provides the following methods:
- finder.is_match(<string>) returns true if the regex matches <string>, false otherwise.
- finder.capture(<string>, <group>) returns the part of <string> captured by the named capture group <group>.
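A sketch of capture group extraction follows. It rests on a few assumptions: that the expression syntax accepts named groups written as (?P<name>...), that capture() returns a string that can be compared with ==, and that the key "firefox-version" and the rule itself are purely illustrative. The behaviour of capture() on a string that does not match is not described here, so the sketch checks is_match() first.

```roto
function init() -> Verdict[Unit, String] {
    # Compile once at startup; "major" is the named capture group.
    iocaine_regexes.insert_regex("firefox-version", "Firefox/(?P<major>[0-9]+)");
    accept
}

function decide(request: Request) -> Verdict[Outcome, Outcome] {
    let user_agent = request.header("user-agent");
    let finder = iocaine_regexes.get("firefox-version");
    if finder.is_match(user_agent) {
        # Hypothetical rule: treat one specific, made-up version as suspicious.
        if finder.capture(user_agent, "major") == "1" {
            accept Outcome.garbage()
        }
    }
    reject Outcome.not_for_us()
}
```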
iocaine_regexsets
Available since iocaine 2.5.0.
Similar to iocaine_regexes, iocaine_regexsets provides an efficient way to match multiple regexes against a single string. The major difference between the two is that iocaine_regexsets allows matching multiple regexes in a single pass, while iocaine_regexes can match only one - but it can also extract capture groups, which regex sets can’t.
This is most useful when you don’t need to extract captures, and you have multiple sets of regular expressions you wish to match.
The variable provides two methods:
- iocaine_regexsets.insert(<key>, <string-list>) compiles all the expressions in the given <string-list> into regular expressions, and stores the resulting set in the map under <key>.
- iocaine_regexsets.get(<key>) will return the finder associated with <key>, or one that will always return false if <key> is not found.
The finder has a single method:
- finder.is_match(<string>) returns true if <string> matches any of the regexes in the set, false otherwise.
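Putting the two methods together, a regex set could be used roughly as sketched below. The key "php-probes" and the expressions are made-up examples; note that the first expression relies on request.path() not including the leading /.

```roto
function init() -> Verdict[Unit, String] {
    let exprs = MutableStringList.new();
    exprs.push("^wp-(admin|login)").push("[.]php$"); # hypothetical expressions
    iocaine_regexsets.insert("php-probes", exprs);
    accept
}

function decide(request: Request) -> Verdict[Outcome, Outcome] {
    # One pass over the path checks every expression in the set.
    if iocaine_regexsets.get("php-probes").is_match(request.path()) {
        accept Outcome.garbage()
    }
    reject Outcome.not_for_us()
}
```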
iocaine_networks
Available since iocaine 2.3.0.
The iocaine_networks variable allows one to store named network sets they can later use to compare IP addresses against. This makes it possible to efficiently check whether an IP address falls into a given range, or range sets (for example, an entire ASN!).
The variable has two methods:
- iocaine_networks.insert(<key>, <cidr-list>) to insert a list of IP ranges in CIDR notation (<cidr-list>) under <key>.
- iocaine_networks.get(<key>) to return a finder associated with <key>, or one that will always return false if <key> is not found.
To match an IP address against a set of networks, retrieve the finder via iocaine_networks.get(), and use its finder.contains(<ip-addr>) to do the matching. This function takes an IP address, and returns true if the address is part of any of the networks contained within the finder, false otherwise.
Roto natively supports IP addresses, so one can use them as literals:
if iocaine_networks.get("local").contains(127.0.0.1) {
    reject Outcome.not_for_us()
}
PrefixList
Because Roto does not support lists, nor variadic arguments, if any function needs to receive a list of arguments, those arguments need to be constructed first, using a specific type of their own. For a list of IP ranges in CIDR notation, this specific type is PrefixList. It is not advised to use these outside of init()-time setup.
For construction purposes, the following functions can be used:
- PrefixList.new() creates a new, mutable prefix list, and returns it.
- list.push(<cidr>) compiles the prefix given in <cidr> (in CIDR notation) into an efficient representation, and pushes it to the end of the list, returning the list itself. Returning the list allows chaining, but chaining eventually exhausts the stack, so it is not recommended unless you only have a handful of prefixes to insert.
let list = PrefixList.new();
list.push("192.168.0.0/24").push("10.0.0.0/16");
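To connect PrefixList with iocaine_networks, an init() function along the following lines could register a named network set. This is only a sketch: the key "local" and the ranges are examples chosen for illustration.

```roto
function init() -> Verdict[Unit, String] {
    let local = PrefixList.new();
    local.push("127.0.0.0/8").push("10.0.0.0/8"); # example ranges
    # Store the set under a key, so decide() can look it up later.
    iocaine_networks.insert("local", local);
    accept
}
```

With this in place, iocaine_networks.get("local").contains(127.0.0.1) returns true, because 127.0.0.1 falls within 127.0.0.0/8.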
metrics
Available since iocaine 2.3.0.
The metrics variable provides access to the iocaine_request_handler_hits metric, and has a single method: .inc(<id>).
This will increase the metric with the <id> label by one. The intended use is to create script-specific metrics, for example, to count how many times a given ruleset was hit.
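As a sketch of per-rule counting, each branch of a decide() function can bump its own label. The label names are invented for this example.

```roto
function decide(request: Request) -> Verdict[Outcome, Outcome] {
    if request.header("user-agent") == "" {
        metrics.inc("rule::empty-user-agent"); # hypothetical label
        accept Outcome.garbage()
    }
    metrics.inc("rule::default"); # hypothetical label
    reject Outcome.not_for_us()
}
```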
Logger
Logger provides a primitive framework for logging. It lets one build a map of key-value pairs, and output them in JSON format to standard output. It also contains a message, and information about the outcome of evaluation.
There are multiple ways of constructing a Logger object:
- Logger.new() will create one with empty values.
- Logger.with_message(<message>) will create one with a custom message. (Available since iocaine 2.5.0)
- Logger.with_verdict(<verdict>, <outcome>) creates a logger with <verdict> (“accept” or “reject”) set as the verdict type, and <outcome> as the verdict outcome. It also sets the message based on the value of <verdict>.
- Logger.with_metadata(<metadata>) creates a logger with custom metadata.
An existing Logger instance has the following methods:
- logger.with_metadata(<metadata>) replaces the metadata of the instance with the one provided.
- logger.with_verdict(<verdict>, <outcome>) replaces the verdict, the parameters are the same as in the constructor’s case.
- logger.with_message(<message>) replaces the message property. (Available since iocaine 2.5.0)
- logger.stdout() serializes the data contained within the logger into JSON, and prints it to standard output.
- logger.to_json() serializes the data contained within the logger into JSON, and returns it as a string. This is intended for testing purposes only.
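A minimal logging sketch, meant to be placed inside a function body, could combine Metadata with one of the constructors above. The key-value pair is a placeholder, and the exact shape of the emitted JSON is not shown here, since it is not specified in this reference.

```roto
# Build the metadata first, then hand it to the logger and print it.
let metadata = Metadata.with("ruleset", "example"); # placeholder pair
Logger.with_metadata(metadata).stdout();
```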
Env
With Env.get(<name>), a request handler script can retrieve the value of the <name> environment variable (or an empty string, if the variable isn’t set). This can be used to allow some limited configuration of a reusable request handler, through environment variables.
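For example, a reusable handler could switch its behaviour based on an environment variable, as in the sketch below. HANDLER_MODE is a made-up variable name, not one iocaine defines.

```roto
function decide(request: Request) -> Verdict[Outcome, Outcome] {
    # An unset variable yields an empty string, so the default branch is taken.
    if Env.get("HANDLER_MODE") == "challenge" {
        accept Outcome.challenge()
    }
    accept Outcome.garbage()
}
```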
JSON parsing
The Roto engine provides very limited support for loading and working with JSON files: you can load a file, and extract the keys from it into a string list:
let ai_robots_txt = Json.load_file("/some/path/robots.json").get_keys();
iocaine_patterns.insert_patterns("ai.robots.txt", ai_robots_txt);
Other extensions to the Roto language
There are a few small things the Roto runtime within iocaine provides for types native to Roto: for example, it is possible to split a string into a list with the split_by(<delimiter>) method:
let list = "foo/bar".split_by("/"); # list = ["foo", "bar"]
Since iocaine 2.5.0, it is also possible to try and parse a string as a Sec-CH-UA header:
let secchua = request.header("sec-ch-ua").as_secchua();
if secchua.contains_item("Chrome") {
    accept Outcome.garbage()
}
As can be seen in the example, if the string is successfully parsed, the returned object will have a .contains_item(<key>) method, which does what its name suggests.
Testing
Important
Documenting the testing support is a work in progress. Please look at examples in the See Also section for more information for the time being.
Roto supports embedding tests right in the Roto scripts using the following pattern:
test name_of_the_test {
    # do some checks
    accept
}
The tests should accept when passing, and reject when failing.
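As a small, hedged illustration of the pattern, the test below checks the join() behaviour described in the MutableStringList section. It assumes the runtime types are available inside test blocks; the test name is arbitrary.

```roto
test join_uses_separator {
    let list = MutableStringList.new();
    list.push("foo").push("bar");
    # Accept when the joined string is what we expect, reject otherwise.
    if list.join(", ") == "foo, bar" {
        accept
    } else {
        reject
    }
}
```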
See also
For a complete example, see Nam-Shub of Enki, the bot detection & classification system of iocaine’s author, which is used in production with great success.
For smaller examples, there are Roto tests in the source code, too. Very basic, not particularly great as a starting point, but they perhaps provide an overview of the language.
There’s also a short Getting Started guide, with a complete request handler example in Roto. More useful than the tests as a starting point, less complex and featureful than Nam-Shub of Enki.