Data Quality Use Case and Dimensions

What this is

This page explains the DQS use case in practical terms and how the seven data quality dimensions are applied in validation workflows.

Core use case

DQS helps teams move from "data exists" to "data is trusted enough for operations and AI."

Typical goals:

  • detect quality issues early,
  • prioritize remediation with clear evidence,
  • and track quality trends over time.

User journey

The operational loop is:

  1. Define rules for expected quality behavior
  2. Assess data through scheduled/event-driven validation
  3. Monitor outputs in records and frontend workflows
  4. Act by fixing source data or updating rules

Validation narrative used in this project

Across DQS usage and deployment flows, the industrial validation narrative covers six validation types, in this order:

  1. Single-instance data quality validation
  2. Graph consistency
  3. Uniqueness
  4. Time series
  5. Conditional logic
  6. Chained conditional logic

Industrial examples

| Validation type | Industrial example from configs |
|---|---|
| Single instance | Work orders require order, description, type, and start/end timestamps for completed jobs. |
| Graph consistency | Operation tags must exist on linked equipment, and critical relations must point to existing nodes. |
| Uniqueness | Work order numbers and operation names are enforced as globally unique identifiers. |
| Time series | Sensor rules validate datapoint count, recency, outliers, and expected value ranges. |
| Conditional logic | If status is Completed, infer a finding when endTime is missing. |
| Chained conditional logic | Stage-2 rules consume stage-1 inferred results using dqs:dependsOn and keep lineage via dqs:causedBy. |
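
The conditional and chained conditional examples can be sketched in plain Python. This is only an illustration under stated assumptions: work orders are modeled as dicts, and the function names, rule names, and the `depends_on` / `caused_by` keys are hypothetical stand-ins for dqs:dependsOn and dqs:causedBy, not the DQS rule syntax.

```python
from typing import Optional


def stage1_missing_end_time(order: dict) -> Optional[dict]:
    """Stage 1: a Completed work order without endTime yields a finding."""
    if order.get("status") == "Completed" and not order.get("endTime"):
        return {"rule": "missing-end-time", "subject": order["id"]}
    return None


def stage2_escalate(finding: Optional[dict], order: dict) -> Optional[dict]:
    """Stage 2: runs only on a stage-1 finding and keeps lineage back to it."""
    if finding is None:
        return None
    return {
        "rule": "escalate-incomplete-closure",  # hypothetical rule name
        "subject": order["id"],
        "depends_on": finding["rule"],  # stands in for dqs:dependsOn
        "caused_by": finding,           # stands in for dqs:causedBy
    }


order = {"id": "WO-1", "status": "Completed", "endTime": None}
first = stage1_missing_end_time(order)
second = stage2_escalate(first, order)
```

Here `second` exists only because `first` does, and it carries a reference to the finding that triggered it, which is the lineage idea the last table row describes.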

The seven quality dimensions

1) Accuracy

Checks whether values reflect expected reality.

Examples:

  • datatype and range constraints in SHACL
  • cross-reference checks against trusted signals

2) Completeness

Checks that required data is present.

Examples:

  • required properties (sh:minCount)
  • datapoint presence in expected windows
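
As a rough illustration of the required-property check, the sketch below reports which required keys are missing or empty on a record, which is approximately what a sh:minCount of 1 expresses in SHACL. The record shape and the `REQUIRED` key names are assumptions borrowed from the work order example, not the project's actual schema.

```python
# Hypothetical required keys, loosely following the work order example.
REQUIRED = ("order", "description", "type")


def missing_properties(record: dict, required=REQUIRED) -> list:
    """Return the required keys that are absent or empty on the record."""
    return [name for name in required if record.get(name) in (None, "", [])]


incomplete = missing_properties({"order": "WO-1", "description": "", "type": "PM"})
```

An empty result means the record passes; any returned names are the evidence to attach to a completeness finding.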

3) Consistency

Checks that data is coherent across systems and structures.

Examples:

  • naming/pattern consistency
  • relationship consistency across linked nodes
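
Relationship consistency across linked nodes can be pictured as a dangling-reference check. The sketch below assumes a graph held as a dict of node id to properties, with relations stored as lists of target ids; the `hasEquipment` relation name and the data layout are illustrative assumptions only.

```python
def dangling_references(nodes: dict, relation: str) -> list:
    """Return (source, target) pairs whose target is not a known node."""
    return [
        (source, target)
        for source, props in nodes.items()
        for target in props.get(relation, [])
        if target not in nodes
    ]


graph = {
    "op-1": {"hasEquipment": ["pump-1"]},
    "op-2": {"hasEquipment": ["pump-9"]},  # pump-9 does not exist
    "pump-1": {},
}
broken = dangling_references(graph, "hasEquipment")
```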

4) Timeliness

Checks that data is sufficiently fresh for its operational use.

Examples:

  • stale-data checks via timestamps
  • window-based datapoint recency checks
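
Both checks reduce to simple timestamp arithmetic. The following is a minimal sketch with hypothetical function names and thresholds; it assumes timezone-aware datetimes and is not taken from the DQS rule set.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_seen: datetime, now: datetime, max_age: timedelta) -> bool:
    """True when the newest value is older than the allowed age."""
    return now - last_seen > max_age


def points_in_window(timestamps, now: datetime, window: timedelta) -> int:
    """Count datapoints whose timestamp falls within the trailing window."""
    return sum(1 for ts in timestamps if now - window <= ts <= now)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
readings = [now - timedelta(minutes=m) for m in (5, 20, 90)]
recent = points_in_window(readings, now, timedelta(hours=1))
```

Passing `now` explicitly rather than reading the clock inside the function keeps the check reproducible across scheduled and event-driven runs.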

5) Uniqueness

Checks that key identifiers are not duplicated.

Examples:

  • dqs:uniquenessConstraint on indexed properties
  • grouped overflow signaling for high-cardinality duplicates
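
One way to read grouped overflow signaling is: report each duplicate individually while the group is small, and collapse it into a single summary once it grows past a threshold. The sketch below follows that interpretation. The threshold, the record shape, and the finding fields are assumptions; only the `group_violation` label comes from this page.

```python
from collections import defaultdict


def uniqueness_findings(records, key, max_per_group=3):
    """Emit per-record violations, or one summary for oversized groups."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["id"])

    findings = []
    for value, ids in groups.items():
        if len(ids) < 2:
            continue  # value is unique, nothing to report
        if len(ids) > max_per_group:
            findings.append(
                {"kind": "group_violation", "value": value, "count": len(ids)}
            )
        else:
            findings.extend(
                {"kind": "violation", "value": value, "id": i} for i in ids
            )
    return findings


orders = (
    [{"id": f"a{i}", "number": "WO-100"} for i in range(2)]
    + [{"id": f"b{i}", "number": "WO-200"} for i in range(4)]
    + [{"id": "c0", "number": "WO-300"}]
)
results = uniqueness_findings(orders, "number")
```

With these inputs, WO-100 produces two individual violations and WO-200 produces one summary, so the output size stays bounded even when a single value is duplicated many times.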

6) Validity

Checks conformance to business and schema rules.

Examples:

  • class/value constraints
  • domain-specific SPARQL checks

7) Plausibility

Checks whether values are physically or statistically believable.

Examples:

  • outlier and flatline checks
  • unexpected decreases in cumulative signals
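
A flatline check and a cumulative-decrease check can each be written in a few lines. This sketch assumes a signal held as a plain list of numbers in time order; the function names and the `min_run` default are illustrative choices, not DQS parameters.

```python
def is_flatline(values, min_run=5):
    """True when the last min_run values are all identical."""
    tail = values[-min_run:]
    return len(tail) == min_run and len(set(tail)) == 1


def cumulative_decreases(values):
    """Indices where a cumulative signal drops below its previous value."""
    return [i for i in range(1, len(values)) if values[i] < values[i - 1]]


stuck = is_flatline([1.0, 2.0, 2.0, 2.0, 2.0, 2.0])
drops = cumulative_decreases([10, 12, 11, 15])
```

A totalizer or energy counter should never go down, so any index returned by `cumulative_decreases` points to a reset, a rollover, or a bad reading that deserves a closer look.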

How dimensions map to DQS outputs

  • Validation outcomes are stored as DataQualityValidationRecord
  • Conditional/inference outputs are stored as RuleEngineResult
  • High-volume uniqueness scenarios can emit group_violation summaries

This enables both human triage and machine-driven automation on top of typed records.

Practical adoption pattern

  • Start with completeness and validity for fast early value
  • Add timeliness and uniqueness for operational reliability
  • Add consistency, plausibility, and conditional logic for deeper trust
