Time Series Datapoint Rule Examples
What this is
This guide documents practical datapoint checks for YourOrgTimeSeries, including scheduling guidance and end-to-end test scripts that use the same handler code deployed to Cognite Functions.
When to use it
Use this guide when authoring and validating datapoint-oriented SHACL checks for time series quality.
Use it with deployment docs when moving from test scripts to scheduled production workflows.
User mental model
- Input: SHACL rules + selected time windows
- Execution: time-window datapoint functions (cdf_sdk, cdf_indsl) evaluate each series
- Output: DataQualityValidationRecord pass/fail entries
Runtime behavior
Scheduling requirement
Time series datapoint validation should run on a schedule. The deployed workflow executes the time series handler on a cron cadence and evaluates SHACL window expressions (for example "1h-ago" to "now").
Recommended pattern:
- Run on an hourly cron (for example 2 * * * *).
- Use time-window expressions in SHACL that match your cadence.
- Keep cadence and window aligned so records are predictable over time.
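One way to sanity-check cadence and window alignment before deploying is to compare the window length with the schedule interval. The sketch below is illustrative only: window_length and compare_window_to_cadence are hypothetical helpers, not part of the deployed handler, and the parser assumes window expressions of the form "<N><unit>-ago" with units s, m, h, d, or w.

```python
import re
from datetime import timedelta

# Assumed unit suffixes for "<N><unit>-ago" expressions such as "1h-ago".
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def window_length(expr: str) -> timedelta:
    """Hypothetical helper: convert "1h-ago" style expressions to a timedelta."""
    match = re.fullmatch(r"(\d+)([smhdw])-ago", expr)
    if match is None:
        raise ValueError(f"unsupported window expression: {expr!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def compare_window_to_cadence(window_start: str, cadence: timedelta) -> str:
    """Classify a window that runs from window_start to "now" against the cadence."""
    window = window_length(window_start)
    if window < cadence:
        return "gap"      # some datapoints are never evaluated between runs
    if window > cadence:
        return "overlap"  # consecutive runs re-evaluate the same datapoints
    return "aligned"


hourly = timedelta(hours=1)
print(compare_window_to_cadence("1h-ago", hourly))   # aligned
print(compare_window_to_cadence("30m-ago", hourly))  # gap
print(compare_window_to_cadence("24h-ago", hourly))  # overlap
```

An overlapping window is not necessarily wrong (the snippets in this guide use "24h-ago"), but it means consecutive runs can report the same finding more than once.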
For deployment mechanics, see docs/usage/deploy.md (time series workflows are deployed when timeseries_dir is provided).
Included example rule types
The test_and_deploy examples now cover:
- minimum datapoint count (explicit pass + fail)
- stale latest datapoint (modeled as no datapoints in the active window)
- value range bounds (datapoints_min / datapoints_max)
- flatline detection (max - min span threshold)
- cross-signal consistency (average delta between two series)
Minimal happy path
Scripts
1) Run validation with fake rule id and post to Records
uv run python test_and_deploy/test_yourorg_timeseries_example_rules.py \
--rule-set-id FAKE_YourOrgTimeSeriesExampleRules \
--job-run-id job_manual_ts_examples_001
What this script does:
- selects relevant YourOrgTimeSeries examples from live data
- generates SHACL for all rule types above
- uploads SHACL to CDF Files
- calls handle_timeseries_validation from _function_code (same deployed handler path)
- posts validation results to DataQualityValidationRecord
2) Verify what was written to Records
uv run python test_and_deploy/verify_yourorg_timeseries_example_records.py \
--rule-set-id FAKE_YourOrgTimeSeriesExampleRules \
--job-run-id job_manual_ts_examples_001 \
--lookback-hours 24
This script filters Records by ruleSetId + jobRunId and reports matched / passed / failed counts.
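The matched / passed / failed tally can be pictured with a small sketch. It is not the verification script itself: summarize is a hypothetical function, the records are plain dictionaries invented for illustration, and the boolean "passed" field is an assumption about how a pass/fail outcome might be stored.

```python
def summarize(records, rule_set_id, job_run_id):
    """Count records that match both identifiers and split them by outcome."""
    matched = [
        r for r in records
        if r.get("ruleSetId") == rule_set_id and r.get("jobRunId") == job_run_id
    ]
    # "passed" is an assumed field name; adapt it to the real record schema.
    passed = sum(1 for r in matched if r.get("passed"))
    return {"matched": len(matched), "passed": passed, "failed": len(matched) - passed}


# Illustrative records only; real Records entries carry more fields.
sample_records = [
    {"ruleSetId": "FAKE_YourOrgTimeSeriesExampleRules",
     "jobRunId": "job_manual_ts_examples_001", "passed": True},
    {"ruleSetId": "FAKE_YourOrgTimeSeriesExampleRules",
     "jobRunId": "job_manual_ts_examples_001", "passed": False},
    {"ruleSetId": "FAKE_YourOrgTimeSeriesExampleRules",
     "jobRunId": "some_other_run", "passed": True},
]

print(summarize(sample_records,
                "FAKE_YourOrgTimeSeriesExampleRules",
                "job_manual_ts_examples_001"))
# {'matched': 2, 'passed': 1, 'failed': 1}
```

If the matched count is zero, first check that the --job-run-id passed to the verification script is identical to the one used in the validation run.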
Real examples
Example snippets
A) Minimum datapoint count (pass/fail pattern)
PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>
SELECT $this ?count
WHERE {
BIND(cdf_sdk:datapoints_count($this, "24h-ago", "now") AS ?count)
FILTER(?count < 1)
}
B) Stale latest datapoint
PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>
SELECT $this ?count
WHERE {
BIND(cdf_sdk:datapoints_count($this, "24h-ago", "now") AS ?count)
FILTER(?count = 0)
}
C) Value range bounds
PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>
SELECT $this ?minVal ?maxVal
WHERE {
BIND(cdf_sdk:datapoints_min($this, "24h-ago", "now") AS ?minVal)
BIND(cdf_sdk:datapoints_max($this, "24h-ago", "now") AS ?maxVal)
FILTER(?minVal < -100 || ?maxVal > 1000)
}
D) Flatline detection
PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>
SELECT $this ?span ?count
WHERE {
BIND(cdf_sdk:datapoints_min($this, "24h-ago", "now") AS ?minVal)
BIND(cdf_sdk:datapoints_max($this, "24h-ago", "now") AS ?maxVal)
BIND(cdf_sdk:datapoints_count($this, "24h-ago", "now") AS ?count)
BIND((?maxVal - ?minVal) AS ?span)
FILTER(?count >= 2 && ?span <= 0.001)
}
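To tune the span threshold offline, the same flatline condition can be reproduced on a plain list of values. This is a minimal sketch that mirrors the FILTER above: is_flatline is a hypothetical helper, and its default thresholds are copied from the snippet, not from any deployed configuration.

```python
def is_flatline(values, span_threshold=0.001, min_count=2):
    """Mirror of the SHACL filter: enough datapoints and a max - min span within threshold."""
    if len(values) < min_count:
        # Too few datapoints to judge; the count and stale rules cover this case.
        return False
    span = max(values) - min(values)
    return span <= span_threshold


print(is_flatline([5.0, 5.0, 5.0005]))  # True: span is about 0.0005
print(is_flatline([5.0, 5.1, 4.9]))     # False: span is about 0.2
print(is_flatline([5.0]))               # False: only one datapoint
```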
E) Cross-signal consistency
PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>
SELECT $this ?avgA ?avgB ?delta
WHERE {
BIND(cdf_sdk:datapoints_average($this, "24h-ago", "now") AS ?avgA)
BIND(cdf_sdk:datapoints_average(<http://purl.org/cognite/springfield_instances/pi:160554>, "24h-ago", "now") AS ?avgB)
BIND(ABS(?avgA - ?avgB) AS ?delta)
FILTER(?delta > 1.0)
}
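The cross-signal condition can be prototyped the same way on two local value lists. The sketch assumes both series cover the same window; average_delta and is_inconsistent are hypothetical names, and the 1.0 limit is taken from the FILTER above.

```python
from statistics import fmean


def average_delta(series_a, series_b):
    """Absolute difference between the mean values of two series."""
    return abs(fmean(series_a) - fmean(series_b))


def is_inconsistent(series_a, series_b, max_delta=1.0):
    """Flag the pair when the average delta exceeds max_delta."""
    if not series_a or not series_b:
        # No average can be computed; leave empty windows to the stale rule.
        return False
    return average_delta(series_a, series_b) > max_delta


print(is_inconsistent([10.0, 11.0, 12.0], [10.5, 11.5, 12.5]))  # False: delta is 0.5
print(is_inconsistent([10.0, 11.0, 12.0], [13.0, 14.0, 15.0]))  # True: delta is 3.0
```

Comparing averages hides short divergences inside the window, so a shorter window may be appropriate when brief mismatches matter.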
Best practices
- Keep strict operational rules as sh:Violation, and advisory checks as sh:Warning.
- Use a longer window (for example 24h) when data density is low.
- For flatline/range/cross checks, only apply to numeric series.
- Always validate new rules with one known pass and one known fail series before scheduling broadly.
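The known-pass / known-fail practice can be turned into a small check that runs before a rule is scheduled. The sketch below is an assumption-laden illustration: violates_range restates the range-bounds example in Python, check_rule_with_fixtures is a hypothetical helper, and the fixture values are invented.

```python
def violates_range(values, lower=-100.0, upper=1000.0):
    """Python restatement of the range-bounds rule: any value outside [lower, upper]."""
    if not values:
        return False
    return min(values) < lower or max(values) > upper


def check_rule_with_fixtures(rule, known_pass, known_fail):
    """Return a list of problems; an empty list means the rule behaves as expected."""
    problems = []
    if rule(known_pass):
        problems.append("rule flagged the known-pass series")
    if not rule(known_fail):
        problems.append("rule did not flag the known-fail series")
    return problems


# Invented fixture values: one series inside the bounds, one outside.
known_pass_series = [12.0, 48.5, 230.0]
known_fail_series = [12.0, 48.5, 2000.0]

print(check_rule_with_fixtures(violates_range, known_pass_series, known_fail_series))
# []
```

A rule that never fires would return ["rule did not flag the known-fail series"], which is the case a pass-only test would miss.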
Troubleshooting
- No records in output: confirm schedule is enabled and rule set is bound to the correct data model/series.
- Unexpected stale findings: verify cadence and window alignment (cron vs "xh-ago" -> "now").
- Noisy flatline/range alerts: tune thresholds and window size before broad rollout.