Data Quality Use Case and Dimensions
What this is
This page explains the DQS use case in practical terms and how the seven data quality dimensions are applied in validation workflows.
Core use case
DQS helps teams move from "data exists" to "data is trusted enough for operations and AI."
Typical goals:
- detect quality issues early
- prioritize remediation with clear evidence
- track quality trends over time
User journey
The operational loop is:
- Define rules for expected quality behavior
- Assess data through scheduled/event-driven validation
- Monitor outputs in records and frontend workflows
- Act by fixing source data or updating rules
Validation narrative used in this project
Across DQS usage and deployment flows, the industrial validation narrative covers six validation types:
- Single instance data quality validation
- Graph consistency
- Uniqueness
- Time Series
- Conditional logic
- Chained conditional logic
Industrial examples
| Validation type | Industrial example from configs |
|---|---|
| Single instance | Work orders require order, description, type, and start/end timestamps for completed jobs. |
| Graph consistency | Operation tags must exist on linked equipment, and critical relations must point to existing nodes. |
| Uniqueness | Work order numbers and operation names are enforced as globally unique identifiers. |
| Time Series | Sensor rules validate datapoint count, recency, outliers, and expected value ranges. |
| Conditional logic | If status is Completed, infer a finding when endTime is missing. |
| Chained conditional logic | Stage-2 rules consume stage-1 inferred results using dqs:dependsOn and keep lineage via dqs:causedBy. |
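The last two rows of the table can be sketched in plain Python. This is a minimal illustration and not the DQS rule engine: the `Finding` class, the rule names, and the `critical` flag are hypothetical, and the `caused_by` field and the stage-2 argument only stand in for dqs:causedBy and dqs:dependsOn.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    rule: str                               # hypothetical rule name
    subject: str                            # id of the record the finding is about
    caused_by: Optional["Finding"] = None   # lineage link, stand-in for dqs:causedBy


def stage1_missing_end_time(work_order: dict) -> Optional[Finding]:
    """Conditional rule: a Completed work order must carry an endTime."""
    if work_order.get("status") == "Completed" and not work_order.get("endTime"):
        return Finding(rule="missing-end-time", subject=work_order["id"])
    return None


def stage2_escalate(stage1: Optional[Finding], work_order: dict) -> Optional[Finding]:
    """Chained rule: evaluates only a stage-1 result (stand-in for dqs:dependsOn)."""
    if stage1 is not None and work_order.get("critical", False):
        return Finding(
            rule="critical-order-incomplete",
            subject=stage1.subject,
            caused_by=stage1,  # keep the lineage back to the stage-1 finding
        )
    return None


# Hypothetical work order: Completed, but without an end timestamp.
order = {"id": "WO-1", "status": "Completed", "endTime": None, "critical": True}
first = stage1_missing_end_time(order)
second = stage2_escalate(first, order)
```

Here `second.caused_by` is the stage-1 finding, so a reviewer can trace the escalation back to the missing endTime.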
The seven quality dimensions
1) Accuracy
Checks whether values reflect expected reality.
Examples:
- datatype and range constraints in SHACL
- cross-reference checks against trusted signals
2) Completeness
Checks that required data is present.
Examples:
- required properties (sh:minCount)
- datapoint presence in expected windows
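A required-property check can be sketched in a few lines of Python. This mimics the effect of a minimum count of 1 on each property and does not use SHACL itself. The property names follow the work order example above, with `startTime` assumed by analogy with endTime, and `missing_required` is a hypothetical helper.

```python
def missing_required(record: dict, required: list[str]) -> list[str]:
    """Return the required property names that are absent or empty in the record."""
    return [name for name in required if record.get(name) in (None, "", [])]


# Assumed property names for a completed work order.
REQUIRED = ["order", "description", "type", "startTime", "endTime"]

work_order = {
    "order": "WO-1001",
    "description": "Replace seal",
    "type": "corrective",
    "startTime": "2024-05-01T08:00:00Z",
}

print(missing_required(work_order, REQUIRED))  # ['endTime']
```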
3) Consistency
Checks that data is coherent across systems and structures.
Examples:
- naming/pattern consistency
- relationship consistency across linked nodes
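Relationship consistency can be illustrated with a dangling-reference check: every relation must start and end at a node that exists. The sketch below assumes the graph is available as a set of node ids and a list of (subject, predicate, object) tuples. The ids and the `hasEquipment` predicate are hypothetical.

```python
def dangling_relations(
    nodes: set[str], edges: list[tuple[str, str, str]]
) -> list[tuple[str, str, str]]:
    """Return the edges whose subject or object is not a known node."""
    return [(s, p, o) for (s, p, o) in edges if s not in nodes or o not in nodes]


nodes = {"op-1", "op-2", "pump-7"}
edges = [
    ("op-1", "hasEquipment", "pump-7"),   # both ends exist
    ("op-2", "hasEquipment", "pump-9"),   # pump-9 is not a known node
]

print(dangling_relations(nodes, edges))  # [('op-2', 'hasEquipment', 'pump-9')]
```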
4) Timeliness
Checks that data is sufficiently fresh for its operational use.
Examples:
- stale-data checks via timestamps
- window-based datapoint recency checks
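Both timeliness examples reduce to timestamp arithmetic. The following sketch uses only the Python standard library. The function names, the 15-minute age limit, and the one-hour window are illustrative assumptions, not DQS defaults.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_timestamp: datetime, max_age: timedelta, now: datetime) -> bool:
    """True when the newest datapoint is older than the allowed age."""
    return now - last_timestamp > max_age


def count_in_window(timestamps: list[datetime], window: timedelta, now: datetime) -> int:
    """Number of datapoints whose timestamp falls inside the trailing window."""
    start = now - window
    return sum(1 for ts in timestamps if start <= ts <= now)


# Fixed reference time so the example is reproducible.
now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
points = [now - timedelta(minutes=m) for m in (90, 45, 5)]

stale = is_stale(max(points), timedelta(minutes=15), now)   # newest point is 5 minutes old
recent = count_in_window(points, timedelta(hours=1), now)   # points at 45 and 5 minutes
```

Passing `now` explicitly, instead of calling `datetime.now()` inside the functions, keeps the checks deterministic and easy to test.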
5) Uniqueness
Checks that key identifiers are not duplicated.
Examples:
- dqs:uniquenessConstraint on indexed properties
- grouped overflow signaling for high-cardinality duplicates
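One way to picture duplicate detection with grouped overflow is shown below. This page does not specify how DQS signals overflow, so this is an interpretation: each duplicated value yields one grouped report that lists a bounded sample of record ids and counts the rest. The function name, the report fields, and the `max_listed` limit are hypothetical.

```python
from collections import defaultdict


def find_duplicates(records, key, max_listed=3):
    """Group records by key value and report values that occur more than once.

    Each report lists at most max_listed record ids; the remainder is only
    counted, which bounds the output when one value is duplicated many times.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["id"])

    reports = []
    for value, ids in groups.items():
        if len(ids) > 1:
            reports.append({
                "value": value,
                "count": len(ids),
                "sample_ids": ids[:max_listed],
                "overflow": max(0, len(ids) - max_listed),
            })
    return reports


# Five work orders share one number; a sixth is unique.
records = [{"id": i, "number": "WO-1"} for i in range(5)]
records.append({"id": 9, "number": "WO-2"})

reports = find_duplicates(records, "number")
```

For this input there is a single report for WO-1 with a count of 5, three sample ids, and an overflow of 2.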
6) Validity
Checks conformance to business and schema rules.
Examples:
- class/value constraints
- domain-specific SPARQL checks
7) Plausibility
Checks whether values are physically or statistically believable.
Examples:
- outlier and flatline checks
- unexpected decreases in cumulative signals
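Flatline and cumulative-decrease checks are simple enough to sketch directly. The helpers below are illustrative only: the names and the `min_run` threshold are assumptions, and a production check would likely also need a tolerance for floating-point noise and handling for legitimate counter resets.

```python
def is_flatline(values, min_run=5):
    """True if the series contains min_run or more consecutive identical values."""
    run = 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if cur == prev else 1
        if run >= min_run:
            return True
    return False


def cumulative_decreases(values):
    """Indices at which a cumulative signal drops below its previous value."""
    return [i for i in range(1, len(values)) if values[i] < values[i - 1]]


print(is_flatline([1, 2, 2, 2, 2, 2, 3]))          # True: five consecutive 2s
print(cumulative_decreases([10, 12, 15, 14, 20]))  # [3]: 14 follows 15
```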
How dimensions map to DQS outputs
- Validation outcomes are stored as DataQualityValidationRecord
- Conditional/inference outputs are stored as RuleEngineResult
- High-volume uniqueness scenarios can emit group_violation summaries
This enables both human triage and machine-driven automation on top of typed records.
Practical adoption pattern
- Start with completeness + validity for fast early value
- Add timeliness + uniqueness for operational reliability
- Add consistency + plausibility + conditional logic for deeper trust