
Time Series Datapoint Rule Examples

What this is

This guide documents practical datapoint checks for YourOrgTimeSeries, including scheduling guidance and end-to-end test scripts that use the same handler code deployed to Cognite Functions.

When to use it

Use this guide when authoring and validating datapoint-oriented SHACL checks for time series quality.

Use it with deployment docs when moving from test scripts to scheduled production workflows.

User mental model

  • Input: SHACL rules + selected time windows
  • Execution: time-window datapoint functions (cdf_sdk, cdf_indsl) evaluate each series
  • Output: DataQualityValidationRecord pass/fail entries

Runtime behavior

Scheduling requirement

Time series datapoint validation should run on a schedule. The deployed workflow executes the time series handler on a cron cadence and evaluates SHACL window expressions (for example "1h-ago" to "now").

Recommended pattern:

  • Run on an hourly cron (for example 2 * * * *, which fires at minute 2 of every hour).
  • Use time-window expressions in SHACL that match your cadence.
  • Keep cadence and window aligned so records are predictable over time.
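The alignment between cadence and window can be checked mechanically before a rule is scheduled. The following is a minimal sketch, assuming window expressions follow the "<N>h-ago" / "now" form used in this guide. The helper names parse_window and window_covers_cadence are hypothetical, and support for the m and d units is an assumption rather than something this guide confirms.

```python
import re
from datetime import timedelta

# Assumption: only "h" appears in this guide; "m" and "d" are added for illustration.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_window(expr: str) -> timedelta:
    """Convert a relative expression such as "1h-ago" into a timedelta before now."""
    if expr == "now":
        return timedelta(0)
    match = re.fullmatch(r"(\d+)([mhd])-ago", expr)
    if match is None:
        raise ValueError(f"unsupported window expression: {expr!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def window_covers_cadence(start: str, end: str, cadence: timedelta) -> bool:
    """Return True when the window is at least as long as the gap between runs."""
    return parse_window(start) - parse_window(end) >= cadence


hourly = timedelta(hours=1)
# A one-hour window on an hourly schedule leaves no uncovered gap.
print(window_covers_cadence("1h-ago", "now", hourly))
# A 30-minute window on an hourly schedule skips half of every hour.
print(window_covers_cadence("30m-ago", "now", hourly))
```

A window longer than the cadence is also accepted by this check; it means consecutive runs evaluate overlapping data, which may or may not be desirable for a given rule.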

For deployment mechanics, see docs/usage/deploy.md (time series workflows are deployed when timeseries_dir is provided).

Included example rule types

The test_and_deploy examples cover:

  • minimum datapoint count (explicit pass + fail)
  • stale latest datapoint (modeled as no datapoints in the active window)
  • value range bounds (datapoints_min / datapoints_max)
  • flatline detection (max - min span threshold)
  • cross-signal consistency (average delta between two series)

Minimal happy path

Scripts

1) Run validation with fake rule id and post to Records

uv run python test_and_deploy/test_yourorg_timeseries_example_rules.py \
  --rule-set-id FAKE_YourOrgTimeSeriesExampleRules \
  --job-run-id job_manual_ts_examples_001

What this script does:

  • selects relevant YourOrgTimeSeries examples from live data
  • generates SHACL for all rule types above
  • uploads SHACL to CDF Files
  • calls handle_timeseries_validation from _function_code (same deployed handler path)
  • posts validation results to DataQualityValidationRecord

2) Verify what was written to Records

uv run python test_and_deploy/verify_yourorg_timeseries_example_records.py \
  --rule-set-id FAKE_YourOrgTimeSeriesExampleRules \
  --job-run-id job_manual_ts_examples_001 \
  --lookback-hours 24

This script filters Records by ruleSetId + jobRunId and reports matched / passed / failed counts.
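The filtering and counting step can be pictured as follows. This is a minimal sketch over plain dictionaries, not the verification script itself: the ruleSetId and jobRunId keys come from the description above, while the passed key and the summarize_records helper are hypothetical stand-ins for however the real records encode their outcome.

```python
def summarize_records(records, rule_set_id, job_run_id):
    """Count matched, passed, and failed records for one rule set and job run."""
    matched = [
        record
        for record in records
        if record.get("ruleSetId") == rule_set_id
        and record.get("jobRunId") == job_run_id
    ]
    # Assumption: each record carries a boolean "passed" flag.
    passed = sum(1 for record in matched if record.get("passed"))
    return {"matched": len(matched), "passed": passed, "failed": len(matched) - passed}


sample = [
    {"ruleSetId": "FAKE_YourOrgTimeSeriesExampleRules",
     "jobRunId": "job_manual_ts_examples_001", "passed": True},
    {"ruleSetId": "FAKE_YourOrgTimeSeriesExampleRules",
     "jobRunId": "job_manual_ts_examples_001", "passed": False},
    {"ruleSetId": "FAKE_YourOrgTimeSeriesExampleRules",
     "jobRunId": "some_other_job", "passed": True},
]
print(summarize_records(sample, "FAKE_YourOrgTimeSeriesExampleRules",
                        "job_manual_ts_examples_001"))
```

Because both identifiers must match exactly, a job run id that differs between the validation run and the verification run yields zero matched records.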

Real examples

Example snippets

A) Minimum datapoint count (pass/fail pattern)

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?count
WHERE {
  BIND(cdf_sdk:datapoints_count($this, "24h-ago", "now") AS ?count)
  FILTER(?count < 1)
}

B) Stale latest datapoint

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?count
WHERE {
  BIND(cdf_sdk:datapoints_count($this, "24h-ago", "now") AS ?count)
  FILTER(?count = 0)
}

C) Value range bounds

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?minVal ?maxVal
WHERE {
  BIND(cdf_sdk:datapoints_min($this, "24h-ago", "now") AS ?minVal)
  BIND(cdf_sdk:datapoints_max($this, "24h-ago", "now") AS ?maxVal)
  FILTER(?minVal < -100 || ?maxVal > 1000)
}
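To reason about the bounds offline before editing the rule, the same condition can be sketched in Python over a list of sampled values. The out_of_range helper is hypothetical, and treating an empty window as "no violation" is an assumption about how the rule behaves when no datapoints exist.

```python
def out_of_range(values, lower=-100.0, upper=1000.0):
    """Return True when any value falls below lower or above upper."""
    if not values:
        # Assumption: an empty window produces no range finding.
        return False
    return min(values) < lower or max(values) > upper


print(out_of_range([12.5, 48.0, 999.9]))    # all values inside the bounds
print(out_of_range([12.5, 48.0, 1000.1]))   # maximum exceeds the upper bound
print(out_of_range([-100.5, 0.0]))          # minimum falls below the lower bound
```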

D) Flatline detection

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?span ?count
WHERE {
  BIND(cdf_sdk:datapoints_min($this, "24h-ago", "now") AS ?minVal)
  BIND(cdf_sdk:datapoints_max($this, "24h-ago", "now") AS ?maxVal)
  BIND(cdf_sdk:datapoints_count($this, "24h-ago", "now") AS ?count)
  BIND((?maxVal - ?minVal) AS ?span)
  FILTER(?count >= 2 && ?span <= 0.001)
}
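The flatline condition combines a minimum sample count with a span threshold. A minimal Python sketch of that logic, useful for tuning the threshold against exported values, could look like this; is_flatline and its parameter names are hypothetical.

```python
def is_flatline(values, tolerance=0.001, min_points=2):
    """Return True when enough points exist and max - min stays within tolerance."""
    # The length check runs first, so max() and min() never see an empty list.
    return len(values) >= min_points and (max(values) - min(values)) <= tolerance


print(is_flatline([5.0, 5.0, 5.0]))   # span is zero with three points
print(is_flatline([5.0, 5.2, 4.9]))   # span is well above the tolerance
print(is_flatline([5.0]))             # a single point is not enough to judge
```

Requiring at least two points keeps a sparsely sampled series from being reported as flat merely because its only value trivially has zero span.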

E) Cross-signal consistency

PREFIX cdf_sdk: <https://cognite.com/cdf/sdk/>

SELECT $this ?avgA ?avgB ?delta
WHERE {
  BIND(cdf_sdk:datapoints_average($this, "24h-ago", "now") AS ?avgA)
  BIND(cdf_sdk:datapoints_average(<http://purl.org/cognite/springfield_instances/pi:160554>, "24h-ago", "now") AS ?avgB)
  BIND(ABS(?avgA - ?avgB) AS ?delta)
  FILTER(?delta > 1.0)
}
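The cross-signal rule compares the window averages of two series and flags them when the absolute difference exceeds a threshold. A minimal Python sketch of the same comparison follows; averages_diverge is a hypothetical helper, and skipping the check when either series is empty is an assumption.

```python
from statistics import fmean


def averages_diverge(series_a, series_b, max_delta=1.0):
    """Return True when the absolute difference of the two means exceeds max_delta."""
    if not series_a or not series_b:
        # Assumption: without data on both sides there is nothing to compare.
        return False
    return abs(fmean(series_a) - fmean(series_b)) > max_delta


print(averages_diverge([10.0, 11.0, 12.0], [10.5, 11.5, 12.5]))  # means differ by 0.5
print(averages_diverge([10.0, 11.0, 12.0], [13.0, 14.0, 15.0]))  # means differ by 3.0
```

Comparing averages hides short excursions, so two series can pass this check while still disagreeing at individual timestamps.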

Best practices

  • Keep strict operational rules as sh:Violation, and advisory checks as sh:Warning.
  • Use a longer window (for example 24h) when data density is low.
  • Apply flatline, range, and cross-signal checks only to numeric series.
  • Always validate new rules with one known pass and one known fail series before scheduling broadly.

Troubleshooting

  • No records in output: confirm schedule is enabled and rule set is bound to the correct data model/series.
  • Unexpected stale findings: verify cadence and window alignment (cron vs "xh-ago" -> "now").
  • Noisy flatline/range alerts: tune thresholds and window size before broad rollout.
