# Databricks Integration

> Query Kadoa workflow output through Databricks Delta Sharing

Kadoa can publish workflow output as read-only Delta tables and share them with
your Databricks workspace through Delta Sharing. If you do not use Databricks,
Kadoa can also provide an open Delta Sharing recipient token for compatible
clients.

<Note>
  This connector is set up by Kadoa. It is separate from direct S3 access through
  the [Cloud Storage](/docs/integrations/cloud-storage) connector.
</Note>

## Required Inputs

Choose the recipient mode that matches your environment.

### Databricks Recipient

Use this mode if you have a Unity Catalog-enabled Databricks workspace.

| Input              | Description                                                                       |
| ------------------ | --------------------------------------------------------------------------------- |
| Sharing identifier | The Databricks sharing identifier for the workspace that should receive the share |
| Workflows          | All workflows by default, or a selected workflow list                             |
| Activity log       | Enabled by default, unless you ask Kadoa to disable it                            |

Your Databricks administrator can find the sharing identifier in the Databricks
Delta Sharing recipient setup flow.

### Token Recipient

Use this mode if you want to read the share with an open Delta Sharing client
instead of a Databricks workspace.

| Input          | Description                                                               |
| -------------- | ------------------------------------------------------------------------- |
| Secure contact | The person or channel that should receive the one-time activation details |
| Workflows      | All workflows by default, or a selected workflow list                     |
| Activity log   | Enabled by default, unless you ask Kadoa to disable it                    |

Token credentials are one-time activation material, so treat them as secrets.
Kadoa does not store the credential file or bearer token in the connector
configuration. Tokens are currently valid for up to 365 days.
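
For token recipients, reading the share might look like the following minimal sketch. It assumes the activation details yield a standard Delta Sharing profile (credential) file and that you use the open-source `delta-sharing` Python package. The helper names and all share, schema, and table names passed in are placeholders, not values Kadoa guarantees.

```python
from typing import List


def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Return the '<profile>#<share>.<schema>.<table>' address used by delta-sharing."""
    return f"{profile_path}#{share}.{schema}.{table}"


def list_shared_tables(profile_path: str) -> List[str]:
    """List every table the recipient token can see, as 'share.schema.table'."""
    # Imported lazily so table_url() works without the package installed.
    import delta_sharing  # pip install delta-sharing

    client = delta_sharing.SharingClient(profile_path)
    return [f"{t.share}.{t.schema}.{t.name}" for t in client.list_all_tables()]


def load_preview(profile_path: str, share: str, schema: str, table: str, limit: int = 100):
    """Load up to `limit` rows of one shared table into a pandas DataFrame."""
    import delta_sharing

    url = table_url(profile_path, share, schema, table)
    return delta_sharing.load_as_pandas(url, limit=limit)
```

Calling `list_shared_tables("kadoa.share")` first (where `kadoa.share` is a hypothetical local path to the credential file) shows the actual share and schema names to pass to `load_preview`.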

## Setup Flow

1. Kadoa creates a Databricks connector for your team and selected workflows.
2. Kadoa provisions a dedicated recipient, share, schema, and storage path.
3. Kadoa verifies the provider-side share and grant setup before customer data
   is published.
4. After each successful workflow run, Kadoa stages and validates public data,
   publishes Delta tables, and updates the share.
5. For initial rollout, Kadoa publishes and verifies one live workflow delivery
   before running the full historical backfill.

## Shared Tables

Each connector exposes physical Delta tables. Internal staging tables and raw
provider tables are not shared.

| Table                    | Purpose                                                            |
| ------------------------ | ------------------------------------------------------------------ |
| `WF_<id>__V<n>`          | Versioned workflow output for a specific schema version            |
| `WF_<id>__LATEST`        | Latest workflow output table for quick exploration                 |
| Workflow metadata tables | Workflow, schema version, and field mapping metadata               |
| `ACTIVITY_LOG`           | Customer-visible activity events, when activity sharing is enabled |

Kadoa writes only public workflow fields into shared tables. Private fields,
including fields whose names start with `_`, are excluded before publish.
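
As a consumer-side sanity check on the rule above, a short Python sketch can flag any underscore-prefixed column in a shared table. It covers only the `_` prefix rule; other private fields that Kadoa excludes cannot be detected by name. The function name and the sample column names are hypothetical.

```python
from typing import Iterable, List


def underscore_columns(columns: Iterable[str]) -> List[str]:
    """Return column names that start with '_' (expected to be empty for shared tables)."""
    return [name for name in columns if name.startswith("_")]


# Example with made-up column names; a DataFrame's `df.columns` works the same way.
print(underscore_columns(["title", "price", "_internal_score"]))  # ['_internal_score']
```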

## Schema Changes

Kadoa keeps historical schema versions available. When a workflow schema changes,
Kadoa publishes a new `WF_<id>__V<n>` table and updates `WF_<id>__LATEST` only
after the version table and share checks pass. `WF_<id>__LATEST` follows the
newest validated schema and may be rebuilt when that schema changes. Older rows
remain available in their `WF_<id>__V<n>` tables and metadata rows.

Use versioned tables for downstream models that need a stable schema. Use
`__LATEST` for exploration or workloads that intentionally follow the newest
schema.
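
To pin a downstream model to a stable schema, it can help to see which versions exist for each workflow. The sketch below parses a list of shared table names using the `WF_<id>__V<n>` naming pattern described above. It assumes workflow IDs never contain `__V` themselves, and the helper names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches versioned tables only; __LATEST, metadata, and ACTIVITY_LOG tables are skipped.
_VERSIONED = re.compile(r"WF_(?P<workflow_id>.+)__V(?P<version>\d+)", re.IGNORECASE)


def versions_by_workflow(table_names: Iterable[str]) -> Dict[str, List[int]]:
    """Map each workflow ID to the sorted schema versions found in the table names."""
    found = defaultdict(list)
    for name in table_names:
        match = _VERSIONED.fullmatch(name)
        if match:
            found[match["workflow_id"]].append(int(match["version"]))
    return {workflow_id: sorted(versions) for workflow_id, versions in found.items()}


def pinned_table(workflow_id: str, version: int) -> str:
    """Build the versioned table name for a workflow and schema version."""
    return f"WF_{workflow_id}__V{version}"
```

A downstream job can then read `pinned_table(workflow_id, version)` for a version it has tested against, and move to a newer version only on purpose.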

## Activity Log

Activity sharing is enabled by default. The activity table uses a fixed public
column allowlist and does not include user emails, internal values, raw change
payloads, or private workflow fields.

## Backfill

For initial setup, Kadoa backfills historical workflow output for all scoped
workflows by default. Activity history is also backfilled when activity sharing
is enabled.

Backfill uses the same delivery path as normal workflow runs, so data is staged,
validated, published, and checked before being marked shared.

## Freshness And Delivery State

A new workflow run appears in the share only after Kadoa stages, validates, and
publishes its Delta tables. Delivery is asynchronous and can take longer during
large backfills or schema changes.

Kadoa tracks delivery state internally, including staging, schema, load,
publish, share, and proof failures.

## Disable And Revocation

Disabling a Databricks connector stops future deliveries. Revocation removes the
recipient's access to the share. Historical Delta data is not deleted by default.

## Example Queries

Query the latest workflow output:

```sql theme={null}
-- Replace <id> with your workflow ID.
select *
from wf_<id>__latest
limit 100;
```

Query a fixed schema version:

```sql theme={null}
select *
from wf_<id>__v1
limit 100;
```

Query activity events:

```sql theme={null}
select occurred_at, event_title, resource_name, workflow_id, request_source
from activity_log
order by occurred_at desc;
```
