Data quality in Kadoa has two layers that run at the end of every workflow run:
- Rules are per-field checks you define for each schema field (e.g. the format of a string). Kadoa automatically generates rules that make sense for your data, but you can customize them at any time.
- Platform metrics are built-in metrics that Kadoa computes automatically (e.g. completeness).
Rules
Each schema field can have its own validation rules, specific to the field’s data type.
Type
The type defines which rules are available. The default matches the schema’s data type, but you can cast values to the desired type for validation.
Presence
Presence sets the minimum percentage of rows where the field must have a value. If fewer rows than the target have a value, the field is flagged.
Uniqueness
Uniqueness sets the minimum percentage of values that must be unique in the column. If fewer values than the target are unique, the field is flagged.
String fields
Strings can be validated against one of three mutually exclusive format kinds:
- Free text: Any value is allowed. Optionally match a character set (e.g. “Alphanumeric”) and set a minimum and/or maximum length.
- Defined format: Match a pattern preset (e.g. “URL”), or add a custom regular expression.
- Option from a list: Match an options preset (e.g. “Language (2-letter)”) or add a custom list of allowed values.
Selecting “Customize” on any preset prefills the custom regex or list so you can take it as a starting point and tweak it.
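The three format kinds above can be pictured as one check that branches on the selected kind. The following is a minimal sketch of that idea, not Kadoa's implementation: the `StringRule` shape, its field names, and the `check_string` helper are all hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StringRule:
    """Hypothetical rule shape for illustration; not Kadoa's actual schema."""
    kind: str  # "free_text", "defined_format", or "option_list"
    charset_pattern: Optional[str] = None    # e.g. r"[A-Za-z0-9]*" for alphanumeric
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    pattern: Optional[str] = None            # regex used by "defined_format"
    options: Optional[Sequence[str]] = None  # allowed values for "option_list"


def check_string(value: str, rule: StringRule) -> bool:
    """Return True if the value satisfies the rule's single active format kind."""
    if rule.kind == "free_text":
        # Any value passes unless an optional character set or length bound rejects it.
        if rule.charset_pattern and not re.fullmatch(rule.charset_pattern, value):
            return False
        if rule.min_length is not None and len(value) < rule.min_length:
            return False
        if rule.max_length is not None and len(value) > rule.max_length:
            return False
        return True
    if rule.kind == "defined_format":
        # The whole value must match the preset or custom regular expression.
        return rule.pattern is not None and re.fullmatch(rule.pattern, value) is not None
    if rule.kind == "option_list":
        # The value must be one of the allowed options.
        return rule.options is not None and value in rule.options
    raise ValueError(f"unknown format kind: {rule.kind}")
```

Because the kinds are mutually exclusive, only the fields relevant to the chosen kind are consulted; a regex set on a free-text rule, for example, is ignored in this sketch.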
Number fields
- Maximum decimal places: The maximum allowed number of decimal places (e.g. “0” for whole numbers).
- Range: The minimum and/or maximum allowed values.
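To make the column-level rules in this section concrete, here is a rough sketch of how presence, uniqueness, and the two number rules could be evaluated. It is illustrative only and rests on assumptions the documentation does not state: empty strings count as missing, uniqueness is measured as distinct values over non-missing values, and trailing zeros do not count as decimal places. All function names are hypothetical.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional, Sequence


def is_present(value: object) -> bool:
    # Assumption: None and the empty string both count as "no value".
    return value is not None and value != ""


def presence_pct(values: Sequence[object]) -> float:
    """Percentage of rows in which the field has a value."""
    if not values:
        return 0.0
    return 100.0 * sum(1 for v in values if is_present(v)) / len(values)


def uniqueness_pct(values: Sequence[object]) -> float:
    """Percentage of distinct values among the rows that have a value (assumed definition)."""
    present = [v for v in values if is_present(v)]
    if not present:
        return 0.0
    return 100.0 * len(set(present)) / len(present)


def check_number(
    value: str,
    max_decimals: Optional[int] = None,
    minimum: Optional[Decimal] = None,
    maximum: Optional[Decimal] = None,
) -> bool:
    """Check a numeric string against optional decimal-place and range limits."""
    try:
        number = Decimal(value)
    except InvalidOperation:
        return False
    if not number.is_finite():
        return False
    # normalize() drops trailing zeros, so "12.50" counts as one decimal place here.
    decimals = max(0, -number.normalize().as_tuple().exponent)
    if max_decimals is not None and decimals > max_decimals:
        return False
    if minimum is not None and number < minimum:
        return False
    if maximum is not None and number > maximum:
        return False
    return True
```

Under this sketch, a field would be flagged when `presence_pct(column)` or `uniqueness_pct(column)` falls below its configured target, for example a presence of 50.0 against a target of 90.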
Platform metrics
Platform metrics are built-in and require no configuration.
Completeness
Compares the actual row count against the expected row count for the run, and reports:
- Actual vs expected row count
- Any qualifying signals detected from the source (such as pagination information)
- Whether the result was capped by a configured limit
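As a rough illustration of the three reported items, the sketch below assembles a completeness summary from counts that are assumed to be already known. How Kadoa derives the expected row count is not described here, so `expected_rows`, `signals`, and `row_limit` are hypothetical inputs, and treating a result as capped when the row count reaches the limit is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CompletenessReport:
    """Hypothetical report shape for illustration only."""
    actual_rows: int
    expected_rows: Optional[int]
    ratio: Optional[float]         # actual / expected, when an expectation exists
    capped_by_limit: bool
    signals: List[str] = field(default_factory=list)


def completeness(
    actual_rows: int,
    expected_rows: Optional[int] = None,
    row_limit: Optional[int] = None,
    signals: Optional[List[str]] = None,
) -> CompletenessReport:
    # Assumption: reaching the configured limit means the result may have been cut off.
    capped = row_limit is not None and actual_rows >= row_limit
    # No ratio is reported when the expected count is unknown or zero.
    ratio = actual_rows / expected_rows if expected_rows else None
    return CompletenessReport(
        actual_rows=actual_rows,
        expected_rows=expected_rows,
        ratio=ratio,
        capped_by_limit=capped,
        signals=list(signals or []),
    )
```

For instance, `completeness(80, expected_rows=100, signals=["pagination: 5 pages"])` yields a ratio of 0.8 with `capped_by_limit` set to `False`.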
Suspicious values
Inspects each row for values that do not fit with the rest of the row (e.g. implausible or out-of-domain values) and reports how many columns contain flagged values, with examples.
When changes take effect
- Validation runs at the end of every workflow run.
- Rule edits are saved immediately, but only apply on the next run.
- Workflows linked to a template inherit their rules from the template, and those rules are read-only. Unlink the workflow from its template to edit rules directly.