REST API

The REST API lets you control when and how you retrieve your extracted data. It suits batch processing, scheduled jobs, and on-demand access.

Basic Usage

Get Latest Data

Retrieve the most recent data from a workflow:

GET https://api.kadoa.com/v4/workflows/{workflowId}/data

{
  "workflowId": "workflow-123",
  "runId": "run-456",
  "executedAt": "2024-01-15T10:30:00Z",
  "data": [
    {
      "id": "123",
      "title": "Product Name",
      "price": 99.99,
      "extractedAt": "2024-01-15T10:30:00Z"
    }
  ],
  "pagination": {
    "totalCount": 150,
    "page": 1,
    "totalPages": 6,
    "limit": 25
  }
}

The total row count is pagination.totalCount. There is no hasMore flag; further pages exist whenever pagination.page < pagination.totalPages.
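The page check above can be sketched as a small generator. This is a minimal illustration, not an official client: fetch_page is a hypothetical callable that you supply, which requests one page and returns the decoded JSON body in the shape shown above.

```python
from typing import Callable, Iterator


def has_more_pages(pagination: dict) -> bool:
    """Return True when pages remain after the current one."""
    return pagination["page"] < pagination["totalPages"]


def iter_rows(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every row across all pages.

    fetch_page(page) is a hypothetical helper: it should request the
    given page number and return the decoded JSON response body.
    """
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["data"]
        if not has_more_pages(body["pagination"]):
            return
        page += 1
```

Keeping the HTTP call behind fetch_page means the paging logic can be exercised with a stub before any real request is made.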

Key field deduplication: If your workflow schema marks one or more fields as key fields (isKey: true), Kadoa automatically deduplicates results so there is at most one record per unique key combination. Key fields should be scalar STRING, NUMBER, or LINK fields. Records where any key field is missing or empty are treated as distinct and are not merged.
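To make the deduplication rule concrete, here is a local sketch of the described behavior. It is not Kadoa's implementation. In particular, the source does not say which record survives when two share a key, so keeping the most recent one is an assumption, and dedupe_by_keys is a hypothetical name.

```python
def dedupe_by_keys(records, key_fields):
    """Keep at most one record per unique combination of key fields.

    Records with a missing or empty key field are never merged.
    Assumption: on a key collision the later record replaces the
    earlier one (the source does not specify which record wins).
    """
    result = []
    index_by_key = {}  # key tuple -> position in result
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        if any(value is None or value == "" for value in key):
            # Incomplete key: treat the record as distinct.
            result.append(record)
            continue
        if key in index_by_key:
            result[index_by_key[key]] = record
        else:
            index_by_key[key] = len(result)
            result.append(record)
    return result
```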

Pagination and Filtering

Handle Large Datasets

Use pagination for efficient data retrieval:

GET https://api.kadoa.com/v4/workflows/{workflowId}/data?page=1&limit=100

Query Parameters

Parameter	Type	Default	Description
`page`	integer	`1`	Page number
`limit`	integer	`25`	Rows per page. Set to `0` to stream all rows without paging
`sortBy`	string	—	Field name to sort by
`order`	string	`asc`	Sort order: `asc` or `desc`
`filters`	string	—	JSON-encoded array of filter objects (see below)
`runId`	string	—	Retrieve data from a specific historical run instead of the latest
`format`	string	`json`	Response format: `json` or `csv`

Filtering

Pass a URL-encoded JSON array to filters. Each entry specifies a field, an operator, and a value. The example below is shown unencoded for readability:

GET https://api.kadoa.com/v4/workflows/{workflowId}/data?limit=50&filters=[{"field":"jobTitle","operator":"CONTAINS","value":"Manager"},{"field":"postedDate","operator":"AFTER","value":"2024-01-01"}]
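Because the filters value must be URL-encoded, it is usually easier to build the query string in code than by hand. The following is a minimal sketch using only the Python standard library; build_data_url is a hypothetical helper name, and the field names come from the example above.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.kadoa.com/v4"


def build_data_url(workflow_id, filters=None, **params):
    """Build the data URL with properly encoded query parameters.

    filters is a list of {"field", "operator", "value"} dicts; it is
    serialized to JSON and then percent-encoded by urlencode.
    """
    query = dict(params)
    if filters:
        query["filters"] = json.dumps(filters, separators=(",", ":"))
    return f"{BASE_URL}/workflows/{workflow_id}/data?{urlencode(query)}"


url = build_data_url(
    "workflow-123",
    limit=50,
    filters=[
        {"field": "jobTitle", "operator": "CONTAINS", "value": "Manager"},
        {"field": "postedDate", "operator": "AFTER", "value": "2024-01-01"},
    ],
)
```

Operators that take a list or a number, such as IN or WITHIN_LAST_DAYS, pass through unchanged because json.dumps serializes the value as-is.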

Available operators:

Operator	Description
`EQUALS` / `NOT_EQUALS`	Exact match
`CONTAINS` / `NOT_CONTAINS`	Substring match (case-insensitive)
`STARTS_WITH` / `ENDS_WITH`	Prefix / suffix match
`GREATER_THAN` / `LESS_THAN` / `GREATER_THAN_OR_EQUAL` / `LESS_THAN_OR_EQUAL`	Numeric or date comparison
`IN` / `NOT_IN`	Value must (or must not) be in an array: `"value": ["Sales","Marketing"]`
`IS_NULL` / `IS_NOT_NULL`	Field presence check
`IS_EMPTY` / `IS_NOT_EMPTY`	Null or empty string check
`BEFORE` / `AFTER`	Date field comparison
`WITHIN_LAST_DAYS`	Date field within the last N days: `"value": 7`

Data Formats

JSON (Default)

The default response is a JSON object that contains the rows and the pagination metadata:

{
  "data": [...],
  "pagination": {...}
}

CSV Format

Add ?format=csv to receive a CSV file instead of JSON:

GET https://api.kadoa.com/v4/workflows/{workflowId}/data?format=csv

All pagination and filter parameters apply. For large exports that would exceed the response timeout, use download=link:

GET https://api.kadoa.com/v4/workflows/{workflowId}/data?format=csv&download=link

This materializes the file in object storage and returns a downloadPath you can fetch separately.
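Once you have the CSV body, whether from the direct response or from the file behind downloadPath, it can be parsed with the standard library. A minimal sketch follows; parse_csv_export is a hypothetical helper, and the column names in the usage line are borrowed from the JSON example earlier and are illustrative only.

```python
import csv
import io


def parse_csv_export(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row.

    Every value comes back as a string; convert numeric or date
    columns yourself if you need typed values.
    """
    return list(csv.DictReader(io.StringIO(csv_text)))


rows = parse_csv_export("id,title,price\r\n123,Product Name,99.99\r\n")
```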

Parquet option

Create a signed URL for workflow data as a typed Parquet file:

GET https://api.kadoa.com/v4/workflows/{workflowId}/data/export?format=parquet

Use Parquet when you want to load Kadoa output directly into analytical tools such as DuckDB, Spark, Polars, Snowflake, BigQuery, or pandas. The export endpoint materializes the requested result set and returns a signed url. Pass runId to export a specific historical run:

curl "https://api.kadoa.com/v4/workflows/{workflowId}/data/export?format=parquet&runId={runId}" \
  -H "x-api-key: YOUR_API_KEY"

Filtering, sorting, and row selection use the same query parameters as CSV and JSON exports. For direct streaming of the complete run artifact, use GET /v4/workflows/{workflowId}/data/parquet.

Typed columns are preserved where the workflow schema provides native types, including booleans, numbers, dates, timestamps, JSON-compatible objects, and arrays. Older workflow runs may still reflect the type information available when that run was produced.
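The curl call above translates to Python roughly as follows. This sketch only constructs the request with the standard library; build_parquet_export_request is a hypothetical helper name, and reading the signed URL out of the response is left out because the response shape is not shown here.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.kadoa.com/v4"


def build_parquet_export_request(workflow_id, api_key, run_id=None):
    """Build a GET request for the Parquet export endpoint.

    Pass run_id to export a specific historical run; omit it to
    export the latest data.
    """
    query = {"format": "parquet"}
    if run_id is not None:
        query["runId"] = run_id
    url = f"{BASE_URL}/workflows/{workflow_id}/data/export?{urlencode(query)}"
    return urllib.request.Request(url, headers={"x-api-key": api_key})
```

Passing the returned object to urllib.request.urlopen sends the request. Building it separately keeps the URL and header logic easy to check without network access.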

Error Handling

Common HTTP Status Codes

Code	Meaning	Action
200	Success	Process data normally
400	Bad Request	Check query parameters
401	Unauthorized	Verify API key
404	Not Found	Check workflow ID
429	Rate Limited	Wait and retry
500	Server Error	Contact support
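The table above maps naturally onto a small retry wrapper. The sketch below is one possible approach, not a prescribed one: the exponential backoff schedule is an assumption (the table only says to wait and retry), and send is a hypothetical callable that performs the request and returns a (status, body) pair.

```python
import time

# Suggested actions taken from the status code table.
ACTIONS = {
    400: "check query parameters",
    401: "verify API key",
    404: "check workflow ID",
    429: "rate limited; retries exhausted",
    500: "contact support",
}


class ApiError(Exception):
    """Raised for any response that is not handled by a retry."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on 429 with exponential backoff.

    send is a hypothetical callable returning (status, body).
    Delays double on each retry: base_delay, 2x, 4x, and so on.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status == 429 and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
            continue
        raise ApiError(status, ACTIONS.get(status, "unexpected status"))
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter is injectable so the backoff can be verified in tests without actually waiting.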

API Reference

For complete API documentation, see:
