Cognite Data Quality

DQS System Decomposition Overview

What this is

This page summarizes the Data Quality Service (DQS) system decomposition and maps it to how users work with the product.

Why this decomposition exists

DQS is intentionally split into three parts so that teams can evolve each part independently without breaking the product contract:

rule logic,
runtime and deployment behavior,
output and consumer workflows.

The three parts

1) Rule Definition Layer

This layer defines what quality means and how checks are expressed.

SHACL Core for structural constraints
SHACL-AF and sh:sparql for advanced/contextual rules
Namespace functions (cdf_sdk, cdf_indsl) for platform and time-series logic
Conditional and chained conditional logic for context-aware checks

User impact:

domain experts and data engineers can encode business semantics explicitly
rules stay declarative and versionable
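
To make the idea of a declarative structural constraint concrete, here is a minimal sketch of a SHACL Core shape checked from Python. It is not taken from DQS: the ex: namespace, the Pump class, and the name property are invented for illustration, and the open-source rdflib and pyshacl packages are assumed to be installed. How DQS itself loads and evaluates shapes is not shown here.

```python
# Hypothetical shape: every ex:Pump must have exactly one string-valued ex:name.
SHAPES_TTL = """
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .

ex:PumpShape a sh:NodeShape ;
    sh:targetClass ex:Pump ;
    sh:property [
        sh:path ex:name ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1
    ] .
"""

# Hypothetical data: pump1 satisfies the shape, pump2 has no ex:name.
DATA_TTL = """
@prefix ex: <http://example.org/> .

ex:pump1 a ex:Pump ; ex:name "P-101" .
ex:pump2 a ex:Pump .
"""


def check_conformance() -> bool:
    """Validate DATA_TTL against SHAPES_TTL and return the conformance flag."""
    # Imported lazily so the shape and data strings can be read without the
    # third-party packages being present.
    from rdflib import Graph
    from pyshacl import validate

    data_graph = Graph().parse(data=DATA_TTL, format="turtle")
    shapes_graph = Graph().parse(data=SHAPES_TTL, format="turtle")
    conforms, _report_graph, report_text = validate(
        data_graph, shacl_graph=shapes_graph
    )
    print(report_text)  # human-readable validation report
    return conforms
```

Because the rule is plain Turtle text, it can be reviewed and versioned like any other source file, which is the property the list above calls declarative and versionable.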

2) Infrastructure and Orchestration Layer

This layer defines how rules are published, deployed, triggered, and executed.

Workflows, Functions, and triggers for scheduled and event-driven execution
DataProduct + RuleSet APIs for versioned lifecycle management
YAML compatibility path for transitional and fallback operation

Runtime guarantees:

idempotent publish/deploy behavior
payload-aware version reuse
semver-stable latest selection
preflight risk checks for version-cap scenarios
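
The guarantees above can be illustrated with a small in-memory sketch. It is an assumption-laden model, not the DataProduct or RuleSet API: the RuleSetRegistry class, its publish and latest methods, the SHA-256 payload digest, and the patch-bump policy are all hypothetical. It shows one plausible reading of the list: publishing an unchanged payload returns the existing version (idempotent, payload-aware reuse), and "latest" is chosen by numeric semver order rather than string order.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers so 1.10.0 sorts after 1.9.0."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def payload_digest(payload: dict) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class RuleSetRegistry:
    """Hypothetical registry mapping version strings to payload digests."""

    versions: Dict[str, str] = field(default_factory=dict)

    def latest(self) -> Optional[str]:
        """Return the highest version by numeric semver comparison."""
        if not self.versions:
            return None
        return max(self.versions, key=parse_semver)

    def publish(self, payload: dict) -> str:
        """Reuse the version of an identical payload, else bump the patch."""
        digest = payload_digest(payload)
        for version, existing in self.versions.items():
            if existing == digest:
                return version  # same payload: no new version is created
        current = self.latest()
        if current is None:
            new_version = "1.0.0"
        else:
            major, minor, patch = parse_semver(current)
            new_version = f"{major}.{minor}.{patch + 1}"
        self.versions[new_version] = digest
        return new_version
```

Publishing the same payload twice yields the same version string, so a retried deploy does not consume a version slot. That matters when the number of versions is capped, which is presumably what the preflight checks guard against.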

3) Output and Consumption Layer

This layer defines what users and systems consume after execution.

DataQualityValidationRecord for validation outcomes
RuleEngineResult for conditional/inference outputs
deterministic typing for grouped high-volume outcomes (group_violation)

User impact:

auditable, queryable records for triage and analytics
stable contracts for dashboards, automations, and agent workflows
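
As a rough illustration of what "queryable records for triage" can mean in practice, the sketch below summarizes validation outcomes by rule and severity. The ValidationRecord class and its fields are hypothetical stand-ins. They are not the actual DataQualityValidationRecord schema, which this page does not specify.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ValidationRecord:
    """Hypothetical, simplified validation outcome."""

    rule_id: str
    instance_id: str
    severity: str  # for example "Violation" or "Warning"


def triage_summary(records: List[ValidationRecord]) -> List[Tuple[str, str, int]]:
    """Count records per (rule, severity), most frequent first.

    Ties are broken by rule id and severity so the order is deterministic.
    """
    counts = Counter((r.rule_id, r.severity) for r in records)
    rows = [(rule, severity, n) for (rule, severity), n in counts.items()]
    return sorted(rows, key=lambda row: (-row[2], row[0], row[1]))
```

A dashboard or automation could call triage_summary on the latest run to decide which rule to investigate first. The deterministic ordering keeps repeated reports comparable.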

End-to-end flow

In practice, DQS follows four steps in order:

define rules,
deploy and run,
persist outputs,
triage and remediate.

This sequence keeps quality governance, runtime operations, and consumer UX aligned.
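
The four steps can be sketched end to end in a few lines. This is a toy model under stated assumptions: rules are plain Python predicates instead of SHACL, the store is an in-memory list, and every name (RULES, run_rules, persist, open_issues) is hypothetical. It only shows how the stages hand data to each other.

```python
from typing import Callable, Dict, List

Rule = Callable[[dict], bool]

# Step 1: define rules (hypothetical predicates standing in for SHACL shapes).
RULES: Dict[str, Rule] = {
    "has_name": lambda item: bool(item.get("name")),
    "pressure_non_negative": lambda item: item.get("pressure", 0) >= 0,
}


def run_rules(items: List[dict], rules: Dict[str, Rule]) -> List[dict]:
    """Step 2: evaluate every rule against every item."""
    outcomes = []
    for item in items:
        for rule_id, rule in rules.items():
            outcomes.append(
                {"item": item["id"], "rule": rule_id, "passed": rule(item)}
            )
    return outcomes


def persist(outcomes: List[dict], store: List[dict]) -> None:
    """Step 3: append outcomes to a store (an in-memory list here)."""
    store.extend(outcomes)


def open_issues(store: List[dict]) -> List[dict]:
    """Step 4: select failed outcomes for triage and remediation."""
    return [outcome for outcome in store if not outcome["passed"]]
```

Keeping each step behind its own function mirrors the decomposition: the rule set can change without touching how outcomes are stored or queried.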

How this maps to personas

Data Product Owner / Steward: governs contract and release quality
Data Engineer / Domain Owner: authors rules and operates runtime pipelines
Data Consumer / Operator: consumes records, triages issues, tracks health
