
DQS System Decomposition Overview

What this is

This page summarizes the Data Quality Service (DQS) system decomposition and maps it to how users work with the product.

Why this decomposition exists

DQS is intentionally split into three parts so that teams can evolve each of the following independently, without breaking the product contract:

  • rule logic,
  • runtime and deployment behavior,
  • output and consumer workflows.

The three parts

1) Rule Definition Layer

This layer defines what quality means and how checks are expressed.

  • SHACL Core for structural constraints
  • SHACL-AF and sh:sparql for advanced/contextual rules
  • Namespace functions (cdf_sdk, cdf_indsl) for platform and time-series logic
  • Conditional and chained conditional logic for context-aware checks

User impact:

  • domain experts and data engineers can encode business semantics explicitly
  • rules stay declarative and versionable
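
To illustrate what "conditional and chained conditional logic" means semantically, here is a minimal Python sketch. It is not SHACL and not the DQS rule syntax: the `ConditionalRule` class, the record fields (`type`, `status`, `pressure`), and the example rule are all hypothetical. The idea shown is only that a check applies when every condition in the chain holds, and is otherwise skipped rather than failed.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Record = dict  # hypothetical: one data item represented as a plain dict


@dataclass(frozen=True)
class ConditionalRule:
    """Hypothetical rule: the check applies only if all conditions hold."""
    name: str
    conditions: Tuple[Callable[[Record], bool], ...]
    check: Callable[[Record], bool]

    def evaluate(self, record: Record) -> Optional[bool]:
        # Walk the condition chain in order; the first unmet condition
        # means the rule is out of scope for this record.
        for condition in self.conditions:
            if not condition(record):
                return None  # not applicable, neither pass nor fail
        return self.check(record)


# Hypothetical chained rule: only running pumps must report a pressure.
pump_pressure_rule = ConditionalRule(
    name="running-pump-has-pressure",
    conditions=(
        lambda r: r.get("type") == "pump",
        lambda r: r.get("status") == "running",
    ),
    check=lambda r: r.get("pressure") is not None,
)
```

Separating "not applicable" (`None`) from "failed" (`False`) is the design choice that keeps context-aware checks from reporting violations on records they were never meant to cover.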

2) Infrastructure and Orchestration Layer

This layer defines how rules are published, deployed, triggered, and executed.

  • Workflows, Functions, and triggers for scheduled and event-driven execution
  • DataProduct + RuleSet APIs for versioned lifecycle management
  • YAML compatibility path for transitional and fallback operation

Runtime guarantees:

  • idempotent publish/deploy behavior
  • payload-aware version reuse
  • semver-stable latest selection
  • preflight risk checks for version-cap scenarios
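
The runtime guarantees above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the DQS implementation: `RuleSetRegistry`, its methods, the patch-bump policy, the `1.0.0` starting version, and the `max_versions` cap are all hypothetical. It assumes plain `MAJOR.MINOR.PATCH` versions and treats two payloads as identical when their canonical JSON matches.

```python
import hashlib
import json
import re

_SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def semver_key(version):
    """Parse MAJOR.MINOR.PATCH into a tuple of ints for numeric ordering."""
    match = _SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a plain MAJOR.MINOR.PATCH version: {version!r}")
    return tuple(int(part) for part in match.groups())


def latest_version(versions):
    """Pick the latest version numerically, so 1.10.0 ranks above 1.9.0."""
    return max(versions, key=semver_key)


def payload_digest(payload):
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RuleSetRegistry:
    """Hypothetical in-memory stand-in for a versioned RuleSet store."""

    def __init__(self, max_versions=None):
        self._digest_by_version = {}
        self._max_versions = max_versions

    def preflight(self):
        """Return True if one more version fits under the (assumed) cap."""
        if self._max_versions is None:
            return True
        return len(self._digest_by_version) < self._max_versions

    def publish(self, payload):
        """Idempotent publish: identical payloads reuse their version."""
        digest = payload_digest(payload)
        for version, known in self._digest_by_version.items():
            if known == digest:
                return version  # payload-aware version reuse
        if not self.preflight():
            raise RuntimeError("version cap reached; publish refused")
        if self._digest_by_version:
            major, minor, patch = semver_key(latest_version(self._digest_by_version))
            version = f"{major}.{minor}.{patch + 1}"  # assumed bump policy
        else:
            version = "1.0.0"  # assumed starting version
        self._digest_by_version[version] = digest
        return version
```

Publishing the same payload twice returns the same version, and a plain string sort is avoided because it would rank `1.9.0` above `1.10.0`.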

3) Output and Consumption Layer

This layer defines what users and systems consume after execution.

  • DataQualityValidationRecord for validation outcomes
  • RuleEngineResult for conditional/inference outputs
  • deterministic typing for grouped high-volume outcomes (group_violation)

User impact:

  • auditable, queryable records for triage and analytics
  • stable contracts for dashboards, automations, and agent workflows
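
One way to read "deterministic typing for grouped high-volume outcomes" is sketched below. The `ValidationRecord` fields, the `violation` type name, the grouping threshold, and the identifier scheme are assumptions made for illustration; only the `group_violation` type name comes from this page. The sketch shows that the same set of violations always yields the same record types and identifiers, regardless of input order.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ValidationRecord:
    """Hypothetical, simplified stand-in for a persisted validation record."""
    rule_id: str
    record_type: str          # "violation" or "group_violation"
    focus_nodes: Tuple[str, ...]
    external_id: str


def _record(rule_id, record_type, nodes):
    # Derive the identifier from the content so reruns produce the same id.
    raw = "|".join((rule_id, record_type, *nodes))
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]
    return ValidationRecord(rule_id, record_type, nodes, f"{record_type}:{digest}")


def to_records(violations, group_threshold=3):
    """Turn (rule_id, node) pairs into records, grouping high-volume rules."""
    nodes_by_rule = defaultdict(set)
    for rule_id, node in violations:
        nodes_by_rule[rule_id].add(node)

    records = []
    for rule_id in sorted(nodes_by_rule):          # stable rule order
        nodes = tuple(sorted(nodes_by_rule[rule_id]))  # stable node order
        if len(nodes) >= group_threshold:
            # Many hits for one rule collapse into a single grouped record.
            records.append(_record(rule_id, "group_violation", nodes))
        else:
            records.extend(_record(rule_id, "violation", (n,)) for n in nodes)
    return records
```

Because types and identifiers depend only on content, downstream dashboards and automations can rely on reruns producing matching records rather than duplicates.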

End-to-end flow

In practice, work in DQS follows four steps:

  1. define rules,
  2. deploy and run,
  3. persist outputs,
  4. triage and remediate.

This keeps quality governance, runtime operations, and consumer UX aligned.

How this maps to personas

  • Data Product Owner / Steward: governs contract and release quality
  • Data Engineer / Domain Owner: authors rules and operates runtime pipelines
  • Data Consumer / Operator: consumes records, triages issues, tracks health
