Rule Engine (Conditional Logic) User Guide

What this is

Think of this as conditional logic for validation data:

If this condition is true
Then create this result data

In this package, each "then create this data" output is written as a RuleEngineResult record.

Technically this is powered by SHACL-AF (sh:rule / sh:SPARQLRule), but users can work with the simpler mental model: condition -> created result record.

Typical examples:

classify equipment as Critical / High / Normal
flag overdue work orders
compute KPI labels used by dashboards or automations

Unlike normal validation failures, which are written as DataQualityValidationRecord records, Rule Engine outputs are written as RuleEngineResult records.

When to use it

Use Rule Engine when you need derived state written as records that downstream systems can consume.

Use standard validation constraints for pure pass/fail checks without derived outputs.

Output streams

Validation runs can produce two output streams:

DataQualityValidationRecord for pass/fail validation outcomes
RuleEngineResult for inference outputs produced by SHACL-AF CONSTRUCT rules

Both are produced in the same validation pass.

User-first mental model

From a user perspective, each rule has three parts:

Input: Which data points are used (for example status fields, timestamps, counts, source links).
Calculation: The logic that evaluates those inputs (thresholds, worst-case aggregation, conditional checks).
Output: A derived result that users can consume (for example Green/Yellow/Red, Overdue, Critical).

You can think of:

Rule -> the IF condition and evaluation logic
Indicator -> a tracked rule+subject combination over time
Indication -> one "THEN create this data" output at a point in time

In cognite-data-quality, each indication is persisted as a RuleEngineResult record.
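
The relationship between indicators and indications can be sketched in Python. This is a minimal illustration and not part of cognite-data-quality: it assumes records arrive as plain dictionaries keyed by the property names used in this guide, and that producedAt is an ISO 8601 UTC string. The function name group_into_indicators and all sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records. Keys follow the RuleEngineResult property names
# in this guide; the dict shape and the ISO 8601 producedAt format are assumptions.
indications = [
    {"ruleId": "workorder_timeliness_v1", "focusNode": "ex:wo-1",
     "resultValue": "OnTime", "producedAt": "2024-05-01T08:00:00Z"},
    {"ruleId": "workorder_timeliness_v1", "focusNode": "ex:wo-1",
     "resultValue": "Stale", "producedAt": "2024-05-02T08:00:00Z"},
    {"ruleId": "workorder_timeliness_v1", "focusNode": "ex:wo-2",
     "resultValue": "OnTime", "producedAt": "2024-05-02T08:00:00Z"},
]


def group_into_indicators(records):
    """Group indications by (ruleId, focusNode), oldest first."""
    indicators = defaultdict(list)
    for record in records:
        indicators[(record["ruleId"], record["focusNode"])].append(record)
    for history in indicators.values():
        # UTC ISO 8601 strings in one format sort chronologically as text.
        history.sort(key=lambda r: r["producedAt"])
    return dict(indicators)


indicators = group_into_indicators(indications)
# The newest indication of an indicator is its current status.
current = indicators[("workorder_timeliness_v1", "ex:wo-1")][-1]
print(current["resultValue"])
```

With the sample data, the indicator for ex:wo-1 holds two indications and its current status is Stale.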

What users get

Rule Engine outputs are designed for operational use, not just technical debugging:

stable ruleId values for dashboards and alerts
machine-readable resultPayload for downstream automation
optional resultValue for simple status displays
lineage via causedBy when one derived result depends on another

This lets users answer questions like:

"What is the current status for this asset/process?"
"Which rule produced this status?"
"What upstream result caused this output?"

When to use Rule Engine

Use Rule Engine when you want to:

derive new state from existing data
keep rule logic declarative in SHACL
persist rule outputs as queryable records
chain inference rules with lineage (dqs:causedBy)

Use regular validation constraints when you only need pass/fail quality checks.

Practical examples (generic)

Timeliness: mark an entity as Stale if last update is older than a configured window.
Completeness score: classify an object as Good/Warning/Critical based on missing required attributes.
Aggregated risk state: derive a top-level state from multiple lower-level checks (for example worst-case rollup).
Cross-source consistency: emit a status when two systems disagree on a key value.

These are all represented as RuleEngineResult outputs and can be queried through the Records API.
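
As an illustration of the worst-case rollup mentioned above, the calculation can be sketched in Python. In a real ruleset this logic would live in the rule's SPARQL; the severity ordering and the function name worst_case are assumptions made for this sketch.

```python
# Assumed severity ordering for the Green/Yellow/Red labels; adjust to your own scale.
SEVERITY = {"Green": 0, "Yellow": 1, "Red": 2}


def worst_case(statuses):
    """Return the most severe status, or None when there is nothing to roll up."""
    return max(statuses, key=SEVERITY.__getitem__, default=None)


print(worst_case(["Green", "Yellow", "Green"]))
```

A single Yellow among the lower-level checks is enough to make the top-level state Yellow.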

Authoring model

A rule typically:

Targets a node shape (sh:NodeShape)
Uses sh:rule with sh:SPARQLRule
CONSTRUCTs the "then create this data" output as a subject typed dqs:RuleEngineResult

Minimum useful fields in the result:

dqs:focusNode (what instance the rule fired on)
dqs:ruleId (rule identifier)
dqs:resultType (usually "Inference")

Optional fields:

dqs:resultValue (scalar label)
dqs:resultPayload (JSON string)
dqs:causedBy (links to upstream RuleEngineResult subjects)

For full SHACL snippets, see Rule sources.

Output record shape (`RuleEngineResult`)

Main properties written per inference:

ruleSetId
ruleSetVersion
ruleId
runId
resultType
focusNode
focusNodeInstance
resultValue (optional)
resultPayload (optional)
causedBy (optional lineage)
dataDomainExternalId (optional)
producedAt

See Records output for schema and query examples.
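
For consumers that read these records in code, the property list can be mirrored as a typed structure. The sketch is an assumption-heavy illustration: the class name RuleEngineResultRecord, the helper missing_required, and every field type (all strings here) are hypothetical. Only the property names and their optional markers come from this guide.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import List, Optional


@dataclass
class RuleEngineResultRecord:
    # Properties listed without "(optional)" in this guide. Types are assumed.
    ruleSetId: str
    ruleSetVersion: str
    ruleId: str
    runId: str
    resultType: str
    focusNode: str
    focusNodeInstance: str
    producedAt: str
    # Properties marked optional in this guide.
    resultValue: Optional[str] = None
    resultPayload: Optional[str] = None  # JSON string
    causedBy: List[str] = field(default_factory=list)
    dataDomainExternalId: Optional[str] = None


def missing_required(raw):
    """Return the names of non-optional properties absent from a raw dict."""
    return [
        f.name
        for f in fields(RuleEngineResultRecord)
        if f.default is MISSING and f.default_factory is MISSING and f.name not in raw
    ]


raw = {
    "ruleSetId": "maintenance", "ruleSetVersion": "1",
    "ruleId": "workorder_timeliness_v1", "runId": "run-001",
    "resultType": "Inference", "focusNode": "https://example.com/ns#wo-1",
    "focusNodeInstance": "wo-1", "producedAt": "2024-05-02T08:00:00Z",
}
record = RuleEngineResultRecord(**raw)
print(missing_required({"ruleId": "x", "focusNode": "y"}))
```

Checking for missing properties before constructing the dataclass gives a clearer error than the TypeError raised by the constructor.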

Runtime behavior

SHACL validation and SHACL-AF inference run together.
One RuleEngineResult record is written per inferred result subject.
Records are posted through the same Records API pipeline as normal validation records.
runId is shared with the validation run for correlation.
Rules can be scheduled like normal validation workflows, so users get repeated status updates over time.
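
Because runId is shared between both output streams, a consumer can line them up per run. The sketch assumes both record types have already been fetched as dictionaries; the outcome key on the validation records and the function name correlate_by_run are hypothetical, and only runId is taken from this guide.

```python
from collections import defaultdict

# Hypothetical records from two runs. Only runId, ruleId, and resultValue are
# names from this guide; "outcome" is an assumed key for illustration.
validation_records = [
    {"runId": "run-001", "outcome": "pass"},
    {"runId": "run-002", "outcome": "fail"},
]
rule_engine_results = [
    {"runId": "run-001", "ruleId": "workorder_timeliness_v1", "resultValue": "OnTime"},
    {"runId": "run-002", "ruleId": "workorder_timeliness_v1", "resultValue": "Stale"},
    {"runId": "run-002", "ruleId": "critical_backlog_v1", "resultValue": "Critical"},
]


def correlate_by_run(validation, inferences):
    """Bucket both output streams under the runId they share."""
    runs = defaultdict(lambda: {"validation": [], "inference": []})
    for record in validation:
        runs[record["runId"]]["validation"].append(record)
    for record in inferences:
        runs[record["runId"]]["inference"].append(record)
    return dict(runs)


runs = correlate_by_run(validation_records, rule_engine_results)
print(len(runs["run-002"]["inference"]))
```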

For cross-run chaining (consume new upstream RuleEngineResult records and trigger downstream stages), see Chained Conditional Logic.

In deployments where filter queries enforce a strict maximum time interval, configure chaining listeners with a bounded first-run watermark bootstrap (see Chained Conditional Logic).

Minimal happy path

Define one sh:NodeShape targeting the relevant class.
Add one sh:rule with sh:SPARQLRule that constructs dqs:RuleEngineResult.
Include dqs:focusNode, dqs:ruleId, and dqs:resultType.
Deploy and verify records in RuleEngineResult.

Concrete SHACL-AF examples

Example 1: Timeliness rule (`OnTime` vs `Stale`)

This example emits a RuleEngineResult for each work order that has an ex:dueDate, classifying it as OnTime while the due date has not passed and as Stale otherwise.

@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dqs: <http://purl.org/cognite/dqs#> .
@prefix ex: <https://example.com/ns#> .

ex:WorkOrderTimelinessShape
    a sh:NodeShape ;
    sh:targetClass ex:WorkOrder ;
    sh:rule [
        a sh:SPARQLRule ;
        sh:construct """
            CONSTRUCT {
              ?result a dqs:RuleEngineResult ;
                      dqs:focusNode ?wo ;
                      dqs:ruleId "workorder_timeliness_v1" ;
                      dqs:resultType "Inference" ;
                      dqs:resultValue ?status ;
                      dqs:resultPayload ?payload .
            }
            WHERE {
              ?wo a ex:WorkOrder ;
                  ex:dueDate ?dueDate .
              # NOW() returns an xsd:dateTime, so ex:dueDate is assumed to be an xsd:dateTime as well.
              BIND(NOW() AS ?now)
              BIND(IF(?now <= ?dueDate, "OnTime", "Stale") AS ?status)
              BIND(IRI(CONCAT("urn:rule-result:workorder-timeliness:", ENCODE_FOR_URI(STR(?wo)))) AS ?result)
              BIND(CONCAT("{\\"status\\":\\"", ?status, "\\"}") AS ?payload)
            }
        """ ;
    ] .
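
To check the expected output of the timeliness rule outside the engine, the same logic can be mirrored in Python. This is a sketch for local reasoning or tests, not how the package evaluates rules. The function name timeliness_result and the result key (standing in for the constructed subject IRI) are hypothetical, and urllib.parse.quote with safe="" is used as a stand-in for SPARQL's ENCODE_FOR_URI.

```python
import json
from datetime import datetime, timezone
from urllib.parse import quote

RESULT_PREFIX = "urn:rule-result:workorder-timeliness:"


def timeliness_result(work_order_iri, due_date, now):
    """Mirror the CONSTRUCT: classify by due date and build a deterministic IRI."""
    status = "OnTime" if now <= due_date else "Stale"
    return {
        # Percent-encode the work order IRI so it can be embedded in the result IRI.
        "result": RESULT_PREFIX + quote(work_order_iri, safe=""),
        "focusNode": work_order_iri,
        "ruleId": "workorder_timeliness_v1",
        "resultType": "Inference",
        "resultValue": status,
        # Compact separators match the payload string built with CONCAT in the rule.
        "resultPayload": json.dumps({"status": status}, separators=(",", ":")),
    }


now = datetime(2024, 5, 2, tzinfo=timezone.utc)
due = datetime(2024, 5, 1, tzinfo=timezone.utc)
overdue = timeliness_result("https://example.com/ns#wo-1", due, now)
print(overdue["result"])
print(overdue["resultPayload"])
```

Because the result IRI depends only on the rule and the work order, each run produces the same subject for the same work order.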

Example 2: Chained rule with lineage (`Critical` if stale and high priority)

This example consumes the Stale results produced by Example 1 and, for work orders with High priority, emits a new Critical result that links back to its upstream result via dqs:causedBy.

@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix dqs: <http://purl.org/cognite/dqs#> .
@prefix ex: <https://example.com/ns#> .

ex:CriticalBacklogShape
    a sh:NodeShape ;
    sh:targetClass ex:WorkOrder ;
    sh:rule [
        a sh:SPARQLRule ;
        sh:order 20 ;
        dqs:ruleId "critical_backlog_v1" ;
        dqs:dependsOn "workorder_timeliness_v1" ;
        sh:construct """
            CONSTRUCT {
              ?result a dqs:RuleEngineResult ;
                      dqs:focusNode ?wo ;
                      dqs:ruleId "critical_backlog_v1" ;
                      dqs:resultType "Inference" ;
                      dqs:resultValue "Critical" ;
                      dqs:causedBy ?upstreamResult .
            }
            WHERE {
              ?upstreamResult a dqs:RuleEngineResult ;
                              dqs:focusNode ?wo ;
                              dqs:ruleId "workorder_timeliness_v1" ;
                              dqs:resultValue "Stale" .
              ?wo ex:priority "High" .
              BIND(IRI(CONCAT("urn:rule-result:critical-backlog:", ENCODE_FOR_URI(STR(?wo)))) AS ?result)
            }
        """ ;
    ] .

Use these as templates: keep dqs:ruleId stable, keep result IRIs deterministic, and populate dqs:causedBy when chaining.
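
The chaining logic of Example 2 can also be mirrored in Python to reason about which results should appear. This is a simplified sketch under stated assumptions: upstream results are dictionaries whose result key holds the subject IRI from Example 1, priorities are supplied as a plain mapping, and the function name critical_backlog is hypothetical.

```python
from urllib.parse import quote


def critical_backlog(upstream_results, priorities):
    """Emit a Critical result for each Stale, high-priority work order."""
    derived = []
    for upstream in upstream_results:
        is_stale = (
            upstream["ruleId"] == "workorder_timeliness_v1"
            and upstream["resultValue"] == "Stale"
        )
        work_order = upstream["focusNode"]
        if is_stale and priorities.get(work_order) == "High":
            derived.append({
                "result": "urn:rule-result:critical-backlog:" + quote(work_order, safe=""),
                "focusNode": work_order,
                "ruleId": "critical_backlog_v1",
                "resultType": "Inference",
                "resultValue": "Critical",
                # Lineage: point back at the upstream result that triggered this one.
                "causedBy": [upstream["result"]],
            })
    return derived


# Hypothetical inputs for illustration.
upstream_results = [
    {"result": "urn:rule-result:workorder-timeliness:wo-1", "focusNode": "wo-1",
     "ruleId": "workorder_timeliness_v1", "resultValue": "Stale"},
    {"result": "urn:rule-result:workorder-timeliness:wo-2", "focusNode": "wo-2",
     "ruleId": "workorder_timeliness_v1", "resultValue": "OnTime"},
]
priorities = {"wo-1": "High", "wo-2": "High"}
derived = critical_backlog(upstream_results, priorities)
print(derived[0]["causedBy"])
```

Only wo-1 yields a Critical result, and its causedBy entry identifies the Stale result it was derived from.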

Best practices

Keep ruleId stable and meaningful for analytics.
Treat resultPayload as machine-readable output for downstream apps.
Use dqs:causedBy for explainability between chained rules.
Use ordered rules (sh:order) and valid dependencies (dqs:dependsOn) for deterministic chaining.
Keep quality constraints and inference rules in the same ruleset only when both belong to the same domain workflow.
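
A quick pre-deployment check of ordering and dependencies could look like the following sketch. It assumes that "valid" means every dqs:dependsOn target exists in the ruleset and has a lower sh:order than the rule that depends on it; the package may apply different or additional checks. The rule metadata is written out by hand here, and check_rule_dependencies is a hypothetical helper.

```python
def check_rule_dependencies(rules):
    """Return a list of problems; an empty list means the ordering looks consistent.

    rules maps ruleId -> {"order": number, "dependsOn": [ruleId, ...]}.
    """
    problems = []
    for rule_id, rule in rules.items():
        for dependency in rule.get("dependsOn", []):
            if dependency not in rules:
                problems.append(f"{rule_id}: unknown dependency {dependency}")
                continue
            dependency_order = rules[dependency]["order"]
            if dependency_order >= rule["order"]:
                problems.append(
                    f"{rule_id}: dependency {dependency} has sh:order "
                    f"{dependency_order}, expected lower than {rule['order']}"
                )
    return problems


# Metadata transcribed from the two examples in this guide. Example 1 declares
# no sh:order, so the SHACL default of 0 is used.
rules = {
    "workorder_timeliness_v1": {"order": 0, "dependsOn": []},
    "critical_backlog_v1": {"order": 20, "dependsOn": ["workorder_timeliness_v1"]},
}
print(check_rule_dependencies(rules))
```

For the two example rules the check returns an empty list, since the timeliness rule runs before the rule that consumes its results.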

Troubleshooting

No Rule Engine records written:
    Ensure the SHACL contains sh:rule blocks.
    Ensure each rule constructs a subject typed dqs:RuleEngineResult (written as "?result a dqs:RuleEngineResult" or "?result rdf:type dqs:RuleEngineResult").
    Ensure each result includes dqs:focusNode and dqs:ruleId.
Dependency errors:
    Check that dqs:dependsOn and sh:order are consistent across the ruleset.
Records not visible in dashboard:
    Query the RuleEngineResult container directly through the Records API to verify ingestion.
