Data validation lets you define rules that detect anomalies or unexpected results after a workflow run. Use it to catch missing values, outliers, and schema issues before the data is used downstream.

Rule States

Rules are organized by state throughout their lifecycle:
State     Description
PREVIEW   Suggested rule awaiting review and approval
ENABLED   Active rule generating validation issues
DISABLED  Inactive rule (manually disabled or auto-disabled due to schema changes)
Rules are automatically disabled by Kadoa when schema changes would cause them to error.

Issue Status Indicators

When viewing validation results, issues are marked with status indicators:
Status    Description
NEW       First time the issue appears
RESOLVED  Issue no longer present
Summary chips show the change since the previous run: +n new issues, -n resolved issues.

Rule Structure

Validation rules are expressed as SQL WHERE clauses that identify problematic rows:
-- Check email formats are valid
WHERE email NOT REGEXP '^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$'

-- All prices should be positive
WHERE price <= 0 OR price IS NULL

-- Product URLs should contain the domain
WHERE url NOT LIKE '%example.com%'

-- Check that publication dates are not in the future
WHERE publication_date > CURRENT_DATE()
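
To see how a WHERE clause singles out problematic rows, the sketch below runs two of the example rules against a small in-memory table. It is only an illustration: SQLite stands in for whatever engine and SQL dialect Kadoa actually uses, and the table name `data`, the sample rows, and the helper `flagged_ids` are all assumptions made for this example.

```python
import sqlite3

# Hypothetical sample data; column names mirror the example rules above.
rows = [
    (1, "Widget", 9.99, "https://example.com/p/1"),
    (2, "Gadget", 0.0, "https://example.com/p/2"),
    (3, "Gizmo", None, "https://other.test/p/3"),
]

# Rules as WHERE clauses, copied from the examples above.
rules = {
    "positive_price": "WHERE price <= 0 OR price IS NULL",
    "url_has_domain": "WHERE url NOT LIKE '%example.com%'",
}

# SQLite is an assumed stand-in for the real validation engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, name TEXT, price REAL, url TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)


def flagged_ids(where_clause):
    """Return the ids of rows matched by a rule, i.e. the problematic rows."""
    cursor = conn.execute(f"SELECT id FROM data {where_clause} ORDER BY id")
    return [row[0] for row in cursor]


results = {name: flagged_ids(clause) for name, clause in rules.items()}
print(results)
```

Each rule selects the rows that violate the expectation, so an empty result means the data passed that rule.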

Key Fields

Defining key fields lets Kadoa track rows across runs for richer insights.

Requirements

  • Values should be present for most rows
  • Values should be unique per row (no duplicates)
  • Prefer stable identifiers (e.g., product ID, URL, SKU)

How to Pick Key Fields

  • If a row cannot be matched via the key, it is treated as a new row
  • Common key fields: id, url, link, sku, product_id
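
Before settling on a key field, it can help to check a sample of extracted rows against the requirements above. The following is a minimal sketch, assuming rows are available as plain dictionaries; `assess_key_field` is a hypothetical helper, not part of Kadoa.

```python
from collections import Counter


def assess_key_field(rows, field):
    """Report how well `field` meets the key requirements listed above."""
    values = [row.get(field) for row in rows]
    # Treat None and empty strings as missing values.
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    coverage = len(present) / len(rows) if rows else 0.0
    return {"coverage": coverage, "duplicates": duplicates}


# Hypothetical sample rows.
rows = [
    {"sku": "A-1", "name": "Widget"},
    {"sku": "A-2", "name": "Widget"},
    {"sku": "A-3", "name": None},
    {"sku": "A-4", "name": "Gadget"},
]

sku_report = assess_key_field(rows, "sku")
name_report = assess_key_field(rows, "name")
print(sku_report)
print(name_report)
```

In this sample, `sku` is present on every row with no duplicates, while `name` is missing on one row and repeats a value, which makes `sku` the better key.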

Key-Based Insights

When key fields are set, the validation report shows change indicators between runs:
  • +n: new issues discovered since the previous run
  • -n: issues resolved since the previous run
Individual issues are labeled as “new” or “resolved” when applicable.
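
The +n and -n indicators can be thought of as a set difference over key values between two runs. The sketch below illustrates that idea for a single rule; it is an assumption about the general approach, not Kadoa's implementation, and `diff_issues` and the sample keys are hypothetical.

```python
def diff_issues(previous_keys, current_keys):
    """Compare the keys of rows flagged by one rule in two consecutive runs."""
    previous, current = set(previous_keys), set(current_keys)
    new = sorted(current - previous)        # flagged now, not flagged before
    resolved = sorted(previous - current)   # flagged before, no longer flagged
    chips = f"+{len(new)} new, -{len(resolved)} resolved"
    return {"new": new, "resolved": resolved, "chips": chips}


# Hypothetical key values of rows flagged in each run.
previous_run = ["sku-1", "sku-2", "sku-3"]
current_run = ["sku-2", "sku-4"]

report = diff_issues(previous_run, current_run)
print(report["chips"])
```

A row whose key has no match in the previous run lands in the "new" set, which matches the behavior described for unmatched rows.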

Validation Results Structure

The validation results include:
{
  "workflowId": "workflow-123",
  "runId": "run-456",
  "issues": [
    {
      "ruleId": "rule-789",
      "ruleName": "Valid email format",
      "status": "NEW",
      "affectedRows": 5,
      "rows": [...]
    }
  ],
  "summary": {
    "totalIssues": 12,
    "newIssues": 5,
    "resolvedIssues": 2
  }
}
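
A consumer of this structure might filter for new issues and print the summary. The sketch below assumes the payload has exactly the fields shown above and nothing else. The row payloads are elided in the sample, so an empty list is used as a placeholder.

```python
import json

# Sample payload from above; "rows" is left empty because its shape is not shown.
payload = """
{
  "workflowId": "workflow-123",
  "runId": "run-456",
  "issues": [
    {"ruleId": "rule-789", "ruleName": "Valid email format",
     "status": "NEW", "affectedRows": 5, "rows": []}
  ],
  "summary": {"totalIssues": 12, "newIssues": 5, "resolvedIssues": 2}
}
"""

results = json.loads(payload)

# Collect the names of rules that produced issues for the first time.
new_issue_names = [
    issue["ruleName"] for issue in results["issues"] if issue["status"] == "NEW"
]

summary = results["summary"]
summary_line = (
    f"{summary['totalIssues']} total, "
    f"+{summary['newIssues']} new, -{summary['resolvedIssues']} resolved"
)

print(new_issue_names)
print(summary_line)
```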

Rule Approval

Rules are created in PREVIEW status when:
  • Auto-suggested after a workflow run
  • Generated on-demand via the “Suggest Rules” feature
  • Created via the SDK or API with preview status
Preview rules must be approved before they detect validation issues. Approval transitions rules from PREVIEW to ENABLED status. You can approve rules:
  • In the UI: Select rules and click “Approve” (see UI guide)
  • Via SDK: Use bulkApproveRules() method (see SDK guide)
  • Via API: Call the bulk approve endpoint (see API reference)
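
The approval transition can be modeled locally as a small state change. The sketch below is a hypothetical in-memory model meant only to illustrate the PREVIEW to ENABLED transition; it does not call the SDK's bulkApproveRules() or the API, and the `Rule` class and `approve_rules` function are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    rule_id: str
    state: str  # "PREVIEW", "ENABLED", or "DISABLED"


def approve_rules(rules, rule_ids):
    """Move the selected PREVIEW rules to ENABLED; leave all others untouched."""
    wanted = set(rule_ids)
    approved = []
    for rule in rules:
        if rule.rule_id in wanted and rule.state == "PREVIEW":
            rule.state = "ENABLED"
            approved.append(rule.rule_id)
    return approved


rules = [
    Rule("rule-1", "PREVIEW"),
    Rule("rule-2", "PREVIEW"),
    Rule("rule-3", "DISABLED"),
]

# rule-3 is selected but is not in PREVIEW, so this model skips it.
approved = approve_rules(rules, ["rule-1", "rule-3"])
states = {rule.rule_id: rule.state for rule in rules}
print(approved, states)
```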

Rule Deletion

Rules can be permanently deleted, individually or in bulk, when they are no longer needed. You can delete rules:
  • In the UI: Select rules and click “Delete” (see UI guide)
  • Via SDK: Use bulkDeleteRules() method (see SDK guide)
  • Via API: Call the bulk delete endpoint (see API reference)
Deleting rules is permanent. Consider disabling rules instead if you may need them later.

Rule Execution

  • Validation is executed at the end of each workflow run
  • Preview rules require approval before detecting validation issues
  • Changes take effect on the next run
  • Invalid rules auto-disable when schema changes break them
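
The execution behavior above can be sketched as a loop that runs only ENABLED rules and disables any rule that no longer fits the schema. This is a simplified illustration under several assumptions: SQLite stands in for the real engine, rules are plain dictionaries, and a rule is disabled after it errors, whereas Kadoa is described as disabling rules when a schema change would cause them to error.

```python
import sqlite3


def run_validation(conn, rules):
    """Run ENABLED rules; disable any rule that errors against the current schema."""
    issues = {}
    for rule in rules:
        if rule["state"] != "ENABLED":
            continue  # PREVIEW and DISABLED rules do not detect issues
        try:
            query = f"SELECT COUNT(*) FROM data {rule['where']}"
            issues[rule["name"]] = conn.execute(query).fetchone()[0]
        except sqlite3.OperationalError:
            # For example, the rule references a column that no longer exists.
            rule["state"] = "DISABLED"
    return issues


# Hypothetical schema after a change that removed the url column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, price REAL)")
conn.executemany("INSERT INTO data VALUES (?, ?)", [(1, 5.0), (2, -1.0)])

rules = [
    {"name": "positive_price", "state": "ENABLED",
     "where": "WHERE price <= 0 OR price IS NULL"},
    {"name": "url_has_domain", "state": "ENABLED",
     "where": "WHERE url NOT LIKE '%example.com%'"},
    {"name": "suggested_rule", "state": "PREVIEW",
     "where": "WHERE id IS NULL"},
]

issues = run_validation(conn, rules)
print(issues, [rule["state"] for rule in rules])
```

Here the price rule reports one affected row, the URL rule is disabled because its column is gone, and the preview rule is skipped until it is approved.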
