Time Series Datapoint Rule Examples

What this is

This guide documents practical datapoint checks for YourOrgTimeSeries, including scheduling guidance and end-to-end test scripts that use the same handler code deployed to Cognite Functions.

When to use it

Use this guide when authoring and validating datapoint-oriented SHACL checks for time series quality.

Use it together with the deployment docs when moving from test scripts to scheduled production workflows.

User mental model

Input: SHACL rules + selected time windows
Execution: time-window datapoint functions (cdf_sdk, cdf_indsl) evaluate each series
Output: DataQualityValidationRecord pass/fail entries

Runtime behavior

Scheduling requirement

Time series datapoint validation should run on a schedule. The deployed workflow executes the time series handler on a cron cadence and evaluates SHACL window expressions (for example "1h-ago" to "now").

Recommended pattern:

Run on an hourly cron (for example 2 * * * *).
Use time-window expressions in SHACL that match your cadence.
Keep cadence and window aligned so records are predictable over time.
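
The alignment rule above can be checked mechanically before a rule set is scheduled. The following is a minimal sketch, not part of the deployed handler: parse_ago and window_covers_cadence are hypothetical helpers, and the accepted grammar (an integer, a unit of m, h, or d, then "-ago") is an assumption based on the "1h-ago" and "24h-ago" strings used in this guide.

```python
import re
from datetime import timedelta

# Assumption: window expressions look like "<int><unit>-ago" with unit m, h, or d.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_ago(expr: str) -> timedelta:
    """Convert a string such as "1h-ago" into a timedelta (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([mhd])-ago", expr)
    if match is None:
        raise ValueError(f"unsupported window expression: {expr!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def window_covers_cadence(window_start: str, cadence: timedelta) -> bool:
    """Return True when the window reaches back at least one full cadence interval."""
    return parse_ago(window_start) >= cadence


hourly = timedelta(hours=1)
print(window_covers_cadence("1h-ago", hourly))   # True: no gap between runs
print(window_covers_cadence("30m-ago", hourly))  # False: half of each hour is never checked
```

A window shorter than the cadence leaves unvalidated gaps, while a much longer window re-evaluates the same datapoints on every run; both make the resulting records harder to interpret.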

For deployment mechanics, see docs/usage/deploy.md (time series workflows are deployed when timeseries_dir is provided).

Included example rule types

The examples in test_and_deploy cover:

minimum datapoint count (explicit pass + fail)
stale latest datapoint (modeled as no datapoints in the active window)
value range bounds (datapoints_min / datapoints_max)
flatline detection (max - min span threshold)
cross-signal consistency (average delta between two series)

Minimal happy path

Scripts

1) Run validation with a fake rule set ID and post results to Records

uv run python test_and_deploy/test_yourorg_timeseries_example_rules.py \
  --rule-set-id FAKE_YourOrgTimeSeriesExampleRules \
  --job-run-id job_manual_ts_examples_001

What this script does:

selects relevant YourOrgTimeSeries examples from live data
generates SHACL for all rule types above
uploads SHACL to CDF Files
calls handle_timeseries_validation from _function_code (same deployed handler path)
posts validation results to DataQualityValidationRecord

2) Verify what was written to Records

uv run python test_and_deploy/verify_yourorg_timeseries_example_records.py \
  --rule-set-id FAKE_YourOrgTimeSeriesExampleRules \
  --job-run-id job_manual_ts_examples_001 \
  --lookback-hours 24

This script filters Records by ruleSetId + jobRunId and reports matched / passed / failed counts.
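
The filter-and-count step can be pictured with the sketch below. It is an illustration only and does not reproduce the verify script: it assumes records are available as plain dictionaries, and the "passed" field name is a hypothetical stand-in for however DataQualityValidationRecord stores the outcome. The sample records are made up for the demonstration.

```python
from typing import Iterable, Mapping


def summarize(records: Iterable[Mapping], rule_set_id: str, job_run_id: str) -> dict:
    """Count matched / passed / failed records for one rule set and job run."""
    matched = [
        r for r in records
        if r.get("ruleSetId") == rule_set_id and r.get("jobRunId") == job_run_id
    ]
    # Assumption: each record carries a boolean "passed" flag.
    passed = sum(1 for r in matched if r.get("passed"))
    return {"matched": len(matched), "passed": passed, "failed": len(matched) - passed}


# Illustrative in-memory records, not real output.
sample = [
    {"ruleSetId": "FAKE_YourOrgTimeSeriesExampleRules",
     "jobRunId": "job_manual_ts_examples_001", "passed": True},
    {"ruleSetId": "FAKE_YourOrgTimeSeriesExampleRules",
     "jobRunId": "job_manual_ts_examples_001", "passed": False},
    {"ruleSetId": "FAKE_YourOrgTimeSeriesExampleRules",
     "jobRunId": "some_other_run", "passed": True},
]

print(summarize(sample, "FAKE_YourOrgTimeSeriesExampleRules", "job_manual_ts_examples_001"))
# {'matched': 2, 'passed': 1, 'failed': 1}
```

If the matched count is zero, check first that the --job-run-id value is exactly the one passed to the validation script.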

Real examples

Example snippets

A) Minimum datapoint count (pass/fail pattern)

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?count
WHERE {
  BIND(cdf_sdk:datapoints_count($this, "24h-ago", "now") AS ?count)
  FILTER(?count < 1)
}

B) Stale latest datapoint

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?count
WHERE {
  BIND(cdf_sdk:datapoints_count($this, "24h-ago", "now") AS ?count)
  FILTER(?count = 0)
}

C) Value range bounds

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?minVal ?maxVal
WHERE {
  BIND(cdf_sdk:datapoints_min($this, "24h-ago", "now") AS ?minVal)
  BIND(cdf_sdk:datapoints_max($this, "24h-ago", "now") AS ?maxVal)
  FILTER(?minVal < -100 || ?maxVal > 1000)
}

D) Flatline detection

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?span ?count
WHERE {
  BIND(cdf_sdk:datapoints_min($this, "24h-ago", "now") AS ?minVal)
  BIND(cdf_sdk:datapoints_max($this, "24h-ago", "now") AS ?maxVal)
  BIND(cdf_sdk:datapoints_count($this, "24h-ago", "now") AS ?count)
  BIND((?maxVal - ?minVal) AS ?span)
  FILTER(?count >= 2 && ?span <= 0.001)
}

E) Cross-signal consistency

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?avgA ?avgB ?delta
WHERE {
  BIND(cdf_sdk:datapoints_average($this, "24h-ago", "now") AS ?avgA)
  BIND(cdf_sdk:datapoints_average(<http://purl.org/cognite/springfield_instances/pi:160554>, "24h-ago", "now") AS ?avgB)
  BIND(ABS(?avgA - ?avgB) AS ?delta)
  FILTER(?delta > 1.0)
}

Best practices

Keep strict operational rules as sh:Violation, and advisory checks as sh:Warning.
Use a longer window (for example 24h) when data density is low.
Apply flatline, range, and cross-signal checks only to numeric series.
Always validate new rules with one known pass and one known fail series before scheduling broadly.
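
One way to pick a known pass and a known fail series is to replay the rule logic on a handful of values before generating SHACL. The sketch below mirrors the FILTER conditions of examples C, D, and E in plain Python. The function names are hypothetical, the thresholds are copied from the snippets above, and the sample values are invented for illustration; it does not call cdf_sdk or the deployed handler.

```python
from typing import Sequence


def violates_range(values: Sequence[float], low: float = -100.0, high: float = 1000.0) -> bool:
    """Example C: flag when the minimum is below low or the maximum is above high."""
    if not values:
        return False  # assumption: an empty window is left to the stale-data rule
    return min(values) < low or max(values) > high


def violates_flatline(values: Sequence[float], max_span: float = 0.001) -> bool:
    """Example D: flag when at least two datapoints span no more than max_span."""
    return len(values) >= 2 and (max(values) - min(values)) <= max_span


def violates_cross_signal(a: Sequence[float], b: Sequence[float], max_delta: float = 1.0) -> bool:
    """Example E: flag when the window averages of two series differ by more than max_delta."""
    if not a or not b:
        return False
    return abs(sum(a) / len(a) - sum(b) / len(b)) > max_delta


healthy = [20.0, 21.5, 19.8]   # expected to pass all three checks
stuck = [5.0, 5.0, 5.0]        # expected to fail the flatline check
spiking = [20.0, 1500.0]       # expected to fail the range check

print(violates_flatline(healthy), violates_flatline(stuck))   # False True
print(violates_range(healthy), violates_range(spiking))       # False True
print(violates_cross_signal(healthy, stuck))                  # True
```

Running the candidate thresholds against a few representative series this way makes it easier to see whether a rule would be noisy before it is scheduled broadly.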

Troubleshooting

No records in output: confirm schedule is enabled and rule set is bound to the correct data model/series.
Unexpected stale findings: verify cadence and window alignment (cron vs "xh-ago" -> "now").
Noisy flatline/range alerts: tune thresholds and window size before broad rollout.
