Rule Engine (Conditional Logic) User Guide
What this is
Think of this as conditional logic for validation data:
- If this condition is true
- Then create this result data
In this package, each "then create this data" output is written as a RuleEngineResult record.
Technically this is powered by SHACL-AF (sh:rule / sh:SPARQLRule), but users can work with the simpler mental model: condition -> created result record.
Typical examples:
- classify equipment as Critical/High/Normal
- flag overdue work orders
- compute KPI labels used by dashboards or automations
Unlike normal validation failures, Rule Engine outputs are written as RuleEngineResult records.
When to use it
Use Rule Engine when you need derived state written as records that downstream systems can consume.
Use standard validation constraints for pure pass/fail checks without derived outputs.
What it is
Validation runs can produce two output streams:
- DataQualityValidationRecord for pass/fail validation outcomes
- RuleEngineResult for inference outputs produced by SHACL-AF CONSTRUCT rules
Both are produced in the same validation pass.
User-first mental model
From a user perspective, each rule has three parts:
- Input: Which data points are used (for example status fields, timestamps, counts, source links).
- Calculation: The logic that evaluates those inputs (thresholds, worst-case aggregation, conditional checks).
- Output: A derived result that users can consume (for example Green/Yellow/Red, Overdue, Critical).
You can think of:
- Rule -> the IF condition and evaluation logic
- Indicator -> a tracked rule+subject combination over time
- Indication -> one "THEN create this data" output at a point in time
In cognite-data-quality, each indication is persisted as a RuleEngineResult record.
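The Input / Calculation / Output split can be illustrated outside SHACL as well. The following is a minimal Python sketch, not part of the package: the `Indication` class, the `worst_case_rule` function, the severity ordering, and the sample identifiers are all hypothetical and only mirror the mental model above.

```python
from dataclasses import dataclass

# Assumed severity ordering for a worst-case rollup; higher means worse.
SEVERITY = {"Green": 0, "Yellow": 1, "Red": 2}


@dataclass(frozen=True)
class Indication:
    """One output at a point in time (hypothetical stand-in for a RuleEngineResult)."""
    rule_id: str
    subject: str
    value: str


def worst_case_rule(rule_id, subject, child_states):
    """Input: child states. Calculation: keep the most severe. Output: one Indication."""
    if not child_states:
        raise ValueError("at least one child state is required")
    worst = max(child_states, key=lambda state: SEVERITY[state])
    return Indication(rule_id=rule_id, subject=subject, value=worst)


# The (rule_id, subject) pair plays the role of the indicator tracked over time.
indication = worst_case_rule("pump_health_v1", "pump-7", ["Green", "Yellow", "Green"])
print(indication.value)
```

In this sketch the rule is the function, the indicator is the `(rule_id, subject)` pair, and each call yields one indication.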
What users get
Rule Engine outputs are designed for operational use, not just technical debugging:
- stable ruleId values for dashboards and alerts
- machine-readable resultPayload for downstream automation
- optional resultValue for simple status displays
- lineage via causedBy when one derived result depends on another
This lets users answer questions like:
- "What is the current status for this asset/process?"
- "Which rule produced this status?"
- "What upstream result caused this output?"
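As an illustration of the first question, here is a hedged Python sketch that picks the most recent result for one rule and subject from records that have already been fetched. It assumes each record is a plain dict keyed by the property names used in this guide and that producedAt is an ISO 8601 string. Both are assumptions for the example, not a statement about the Records API response format, and `latest_result` is a hypothetical helper.

```python
from datetime import datetime


def latest_result(records, rule_id, focus_node):
    """Return the newest record for one rule+subject pair, or None if there is none."""
    matches = [
        r for r in records
        if r["ruleId"] == rule_id and r["focusNode"] == focus_node
    ]
    if not matches:
        return None
    # Assumes producedAt is an ISO 8601 timestamp with an explicit UTC offset.
    return max(matches, key=lambda r: datetime.fromisoformat(r["producedAt"]))


# Hypothetical sample data.
records = [
    {"ruleId": "workorder_timeliness_v1", "focusNode": "wo-1",
     "resultValue": "OnTime", "producedAt": "2024-01-01T08:00:00+00:00"},
    {"ruleId": "workorder_timeliness_v1", "focusNode": "wo-1",
     "resultValue": "Stale", "producedAt": "2024-01-02T08:00:00+00:00"},
    {"ruleId": "workorder_timeliness_v1", "focusNode": "wo-2",
     "resultValue": "OnTime", "producedAt": "2024-01-03T08:00:00+00:00"},
]

current = latest_result(records, "workorder_timeliness_v1", "wo-1")
print(current["resultValue"])
```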
When to use Rule Engine
Use Rule Engine when you want to:
- derive new state from existing data
- keep rule logic declarative in SHACL
- persist rule outputs as queryable records
- chain inference rules with lineage (dqs:causedBy)
Use regular validation constraints when you only need pass/fail quality checks.
Practical examples (generic)
- Timeliness: mark an entity as Stale if last update is older than a configured window.
- Completeness score: classify an object as Good/Warning/Critical based on missing required attributes.
- Aggregated risk state: derive a top-level state from multiple lower-level checks (for example worst-case rollup).
- Cross-source consistency: emit a status when two systems disagree on a key value.
These are all represented as RuleEngineResult outputs and can be queried through the Records API.
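To make the completeness example concrete, the calculation step could look roughly like the Python sketch below. The function name, the thresholds (`warn_at`, `critical_at`), and the sample attributes are hypothetical; in a real ruleset this logic would live in the SPARQL WHERE clause of the rule.

```python
def completeness_label(obj, required, warn_at=1, critical_at=3):
    """Classify an object by how many required attributes are missing or empty."""
    missing = sum(1 for key in required if obj.get(key) in (None, ""))
    if missing >= critical_at:
        return "Critical"
    if missing >= warn_at:
        return "Warning"
    return "Good"


# Hypothetical required attributes for an equipment object.
required = ["name", "site", "tag"]
print(completeness_label({"name": "P-1", "site": "A", "tag": None}, required))
```

The thresholds are parameters so the same function can serve several object types with different tolerance for gaps.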
Authoring model
A rule typically:
- Targets a node shape (sh:NodeShape)
- Uses sh:rule with sh:SPARQLRule
- CONSTRUCTs the "then create this data" output as a subject typed dqs:RuleEngineResult
Minimum useful fields in the result:
- dqs:focusNode (what instance the rule fired on)
- dqs:ruleId (rule identifier)
- dqs:resultType (usually "Inference")
Optional fields:
- dqs:resultValue (scalar label)
- dqs:resultPayload (JSON string)
- dqs:causedBy (links to upstream RuleEngineResult subjects)
For full SHACL snippets, see Rule sources.
Output record shape (RuleEngineResult)
Main properties written per inference:
- ruleSetId
- ruleSetVersion
- ruleId
- runId
- resultType
- focusNode
- focusNodeInstance
- resultValue (optional)
- resultPayload (optional)
- causedBy (optional lineage)
- dataDomainExternalId (optional)
- producedAt
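A downstream consumer may want to check this shape before using a record. The Python sketch below treats every property not marked optional above as required and decodes resultPayload from its JSON string form. The `normalize_record` helper, the dict-based representation, and the sample values are assumptions made for illustration; see Records output for the authoritative schema.

```python
import json

# Properties listed above without an "(optional)" marker.
REQUIRED = (
    "ruleSetId", "ruleSetVersion", "ruleId", "runId",
    "resultType", "focusNode", "focusNodeInstance", "producedAt",
)
OPTIONAL = ("resultValue", "resultPayload", "causedBy", "dataDomainExternalId")


def normalize_record(raw):
    """Validate required properties and decode the JSON payload if present."""
    missing = [key for key in REQUIRED if key not in raw]
    if missing:
        raise ValueError(f"missing required properties: {missing}")
    record = {key: raw[key] for key in REQUIRED}
    for key in OPTIONAL:
        record[key] = raw.get(key)
    if record["resultPayload"] is not None:
        # resultPayload is documented as a JSON string.
        record["resultPayload"] = json.loads(record["resultPayload"])
    return record


# Hypothetical sample record.
sample = {
    "ruleSetId": "maintenance", "ruleSetVersion": "1", "runId": "run-001",
    "ruleId": "workorder_timeliness_v1", "resultType": "Inference",
    "focusNode": "https://example.com/ns#wo-1", "focusNodeInstance": None,
    "producedAt": "2024-01-02T08:00:00+00:00",
    "resultValue": "Stale", "resultPayload": '{"status":"Stale"}',
}
normalized = normalize_record(sample)
print(normalized["resultPayload"]["status"])
```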
See Records output for schema and query examples.
Runtime behavior
- SHACL validation and SHACL-AF inference run together.
- One RuleEngineResult record is written per inferred result subject.
- Records are posted through the same Records API pipeline as normal validation records.
- runId is shared with the validation run for correlation.
- Rules can be scheduled like normal validation workflows, so users get repeated status updates over time.
For cross-run chaining (consume new upstream RuleEngineResult records and trigger downstream stages), see Chained Conditional Logic.
In deployments where filter queries have strict maximum interval limits, configure chaining listeners with a bounded first-run watermark bootstrap, so the first query does not span more than the allowed interval (see the Chained Conditional Logic guide).
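The idea of a bounded first-run watermark can be sketched as follows. This is only an illustration of the concept under stated assumptions: the function name, its parameters, and the rule "start no earlier than now minus the maximum interval" are hypothetical, and the actual listener configuration is described in the chained guide.

```python
from datetime import datetime, timedelta, timezone


def first_run_watermark(stored, now, max_interval):
    """Choose the lower time bound for a listener's next query.

    On the first run there is no stored watermark, so the query window is
    bounded to max_interval instead of reaching back indefinitely.
    """
    if stored is None:
        return now - max_interval
    return stored


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
start = first_run_watermark(None, now, timedelta(days=7))
print(start.isoformat())
```

On later runs the stored watermark is returned unchanged, so only the very first query is affected by the bound.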
Minimal happy path
- Define one sh:NodeShape targeting the relevant class.
- Add one sh:rule with sh:SPARQLRule that constructs dqs:RuleEngineResult.
- Include dqs:focusNode, dqs:ruleId, and dqs:resultType.
- Deploy and verify records in RuleEngineResult.
Concrete SHACL-AF examples
Example 1: Timeliness rule (OnTime vs Stale)
This example emits a RuleEngineResult for each work order and classifies it by due date.
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dqs: <http://purl.org/cognite/dqs#> .
@prefix ex: <https://example.com/ns#> .

ex:WorkOrderTimelinessShape
    a sh:NodeShape ;
    sh:targetClass ex:WorkOrder ;
    sh:rule [
        a sh:SPARQLRule ;
        sh:construct """
            CONSTRUCT {
                ?result a dqs:RuleEngineResult ;
                    dqs:focusNode ?wo ;
                    dqs:ruleId "workorder_timeliness_v1" ;
                    dqs:resultType "Inference" ;
                    dqs:resultValue ?status ;
                    dqs:resultPayload ?payload .
            }
            WHERE {
                ?wo a ex:WorkOrder ;
                    ex:dueDate ?dueDate .
                BIND(NOW() AS ?now)
                BIND(IF(?now <= ?dueDate, "OnTime", "Stale") AS ?status)
                BIND(IRI(CONCAT("urn:rule-result:workorder-timeliness:", ENCODE_FOR_URI(STR(?wo)))) AS ?result)
                BIND(CONCAT("{\\"status\\":\\"", ?status, "\\"}") AS ?payload)
            }
        """ ;
    ] .
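When a downstream application needs to look up the result for a known work order, it can rebuild the same deterministic IRI and payload that the rule above constructs. The Python sketch below approximates the SPARQL expressions: `urllib.parse.quote` with `safe=""` is used as a stand-in for ENCODE_FOR_URI, and the helper names and sample work order IRI are hypothetical.

```python
import json
from urllib.parse import quote


def result_iri(rule_slug, focus_node):
    """Mirror IRI(CONCAT(prefix, ENCODE_FOR_URI(STR(?wo)))) from the rule."""
    # safe="" also percent-encodes "/" and ":", as ENCODE_FOR_URI does.
    return f"urn:rule-result:{rule_slug}:{quote(focus_node, safe='')}"


def timeliness_payload(status):
    """Build the same compact JSON string the rule binds to ?payload."""
    return json.dumps({"status": status}, separators=(",", ":"))


iri = result_iri("workorder-timeliness", "https://example.com/ns#wo-1")
payload = timeliness_payload("Stale")
print(iri)
print(payload)
```

Because the IRI depends only on the rule and the focus node, repeated runs refer to the same result subject for a given work order.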
Example 2: Chained rule with lineage (Critical if stale and high priority)
This example consumes a previous RuleEngineResult and emits a new one, linking lineage via dqs:causedBy.
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix dqs: <http://purl.org/cognite/dqs#> .
@prefix ex: <https://example.com/ns#> .

ex:CriticalBacklogShape
    a sh:NodeShape ;
    sh:targetClass ex:WorkOrder ;
    sh:rule [
        a sh:SPARQLRule ;
        sh:order 20 ;
        dqs:ruleId "critical_backlog_v1" ;
        dqs:dependsOn "workorder_timeliness_v1" ;
        sh:construct """
            CONSTRUCT {
                ?result a dqs:RuleEngineResult ;
                    dqs:focusNode ?wo ;
                    dqs:ruleId "critical_backlog_v1" ;
                    dqs:resultType "Inference" ;
                    dqs:resultValue "Critical" ;
                    dqs:causedBy ?upstreamResult .
            }
            WHERE {
                ?upstreamResult a dqs:RuleEngineResult ;
                    dqs:focusNode ?wo ;
                    dqs:ruleId "workorder_timeliness_v1" ;
                    dqs:resultValue "Stale" .
                ?wo ex:priority "High" .
                BIND(IRI(CONCAT("urn:rule-result:critical-backlog:", ENCODE_FOR_URI(STR(?wo)))) AS ?result)
            }
        """ ;
    ] .
Use these as templates: keep dqs:ruleId stable, keep result IRIs deterministic, and populate dqs:causedBy when chaining.
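Once dqs:causedBy is populated, a consumer can follow the links to explain a derived status. The following Python sketch walks lineage from one result back to its upstream results. It assumes results are held in a dict keyed by result IRI and that causedBy is a list of upstream IRIs; both the data layout and the `trace_lineage` helper are hypothetical.

```python
def trace_lineage(results, start):
    """Walk causedBy links depth-first; return visited result IRIs, start first."""
    seen, order, stack = set(), [], [start]
    while stack:
        iri = stack.pop()
        # Skip repeats (guards against cycles) and IRIs we have no record for.
        if iri in seen or iri not in results:
            continue
        seen.add(iri)
        order.append(iri)
        stack.extend(results[iri].get("causedBy") or [])
    return order


# Hypothetical results matching the two examples above.
results = {
    "urn:rule-result:critical-backlog:wo-1": {
        "ruleId": "critical_backlog_v1",
        "causedBy": ["urn:rule-result:workorder-timeliness:wo-1"],
    },
    "urn:rule-result:workorder-timeliness:wo-1": {
        "ruleId": "workorder_timeliness_v1",
    },
}

chain = trace_lineage(results, "urn:rule-result:critical-backlog:wo-1")
print([results[iri]["ruleId"] for iri in chain])
```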
Best practices
- Keep ruleId stable and meaningful for analytics.
- Treat resultPayload as machine-readable output for downstream apps.
- Use dqs:causedBy for explainability between chained rules.
- Use ordered rules (sh:order) and valid dependencies (dqs:dependsOn) for deterministic chaining.
- Keep quality constraints and inference rules in the same ruleset only when both belong to the same domain workflow.
Troubleshooting
- No Rule Engine records written:
  - Ensure the SHACL contains sh:rule blocks.
  - Ensure rules actually construct a subject typed dqs:RuleEngineResult (rdf:type dqs:RuleEngineResult, or the shorthand a dqs:RuleEngineResult).
  - Ensure each result includes dqs:focusNode and dqs:ruleId.
- Dependency errors:
  - Validate dqs:dependsOn and sh:order consistency in the ruleset.
- Records not visible in dashboard:
  - Query the RuleEngineResult container directly in the Records API to verify ingestion.
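For dependency errors, a quick offline check of dqs:dependsOn against sh:order can help before deploying. The Python sketch below is a hypothetical pre-flight check, not the package's own validation. It assumes rules have been extracted into a dict keyed by ruleId, that a missing sh:order means 0 (the SHACL-AF default), and that a dependency should have a strictly lower sh:order than the rule that depends on it.

```python
def check_dependencies(rules):
    """Return human-readable problems found in dependsOn / order declarations."""
    problems = []
    for rule_id, spec in rules.items():
        order = spec.get("order", 0)
        for dep in spec.get("dependsOn", []):
            if dep not in rules:
                problems.append(f"{rule_id}: unknown dependency {dep}")
            elif rules[dep].get("order", 0) >= order:
                # Assumed convention: upstream rules must run strictly earlier.
                problems.append(f"{rule_id}: dependency {dep} does not run earlier")
    return problems


# The two example rules from this guide: timeliness has no sh:order, backlog uses 20.
rules = {
    "workorder_timeliness_v1": {},
    "critical_backlog_v1": {"order": 20, "dependsOn": ["workorder_timeliness_v1"]},
}
print(check_dependencies(rules))
```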