Create a new schema
- Go to My Schemas and click ‘Create Schema’
- Choose how to start:
  - Create your own - Define your own data structure with custom fields
  - Copy from an existing workflow - Use the schema from one of your workflows as a starting point
  - Copy from an existing schema - Duplicate and modify one of your existing schemas
- Add, remove, or modify fields to match your data extraction needs
- Save your schema to use it in future workflows
Using schemas in workflows
When you create a new workflow, you can select one of your saved schemas to ensure consistent data extraction across different sources. This saves time and ensures your data always follows the same structure, making it easier to work with your extracted information.
Data Types
When defining schemas, you specify the data type for each field to ensure accurate extraction and validation. Kadoa supports the following data types:
Data Type | Description | Example Use Cases |
---|---|---|
STRING | String/text content | Product names, descriptions, article headlines |
NUMBER | Numeric values (integers, decimals) | Quantities, ratings, scores, counts |
BOOLEAN | True/false values | Availability status, feature flags, yes/no indicators |
DATE | Date values | Publication dates, deadlines, event dates |
DATETIME | Date and time values | Timestamps, scheduled times, last updated |
MONEY | Currency and monetary values | Prices, costs, revenue, discounts |
IMAGE | Image URLs and references | Product photos, thumbnails, profile pictures |
LINK | URLs and hyperlinks | Product pages, external links, social media |
OBJECT | Nested/complex JSON structures | Structured metadata, complex configurations |
ARRAY | Lists/arrays of values | Tags, categories, multiple images, feature lists |
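
To make the table concrete, here is a minimal local sketch of a schema expressed as a list of fields, with a helper that checks an extracted record against it. Only the data type names come from the table above. The field names, the `name`/`type` keys, and the checks themselves (ISO 8601 strings for dates, `http`/`https` URLs for links and images) are assumptions made for illustration; this is not Kadoa's API or its actual validation logic. MONEY is left out because this page does not say how monetary values are represented.

```python
from datetime import date, datetime
from urllib.parse import urlparse


def _parses(parser, value):
    """Return True if `value` is a string that `parser` accepts."""
    if not isinstance(value, str):
        return False
    try:
        parser(value)
        return True
    except ValueError:
        return False


def _is_url(value):
    """Assumed check: an http(s) URL with a host part."""
    if not isinstance(value, str):
        return False
    try:
        parts = urlparse(value)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Illustrative checks per data type name (assumptions, not Kadoa's rules).
CHECKS = {
    "STRING": lambda v: isinstance(v, str),
    "NUMBER": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "BOOLEAN": lambda v: isinstance(v, bool),
    "DATE": lambda v: _parses(date.fromisoformat, v),
    "DATETIME": lambda v: _parses(datetime.fromisoformat, v),
    "IMAGE": _is_url,
    "LINK": _is_url,
    "OBJECT": lambda v: isinstance(v, dict),
    "ARRAY": lambda v: isinstance(v, list),
}

# Hypothetical schema for a product listing.
PRODUCT_SCHEMA = [
    {"name": "title", "type": "STRING"},
    {"name": "rating", "type": "NUMBER"},
    {"name": "inStock", "type": "BOOLEAN"},
    {"name": "releaseDate", "type": "DATE"},
    {"name": "productUrl", "type": "LINK"},
    {"name": "tags", "type": "ARRAY"},
]


def validate_record(schema, record):
    """Return a list of human-readable problems; an empty list means the record fits."""
    errors = []
    for field in schema:
        name, data_type = field["name"], field["type"]
        if name not in record:
            errors.append(f"{name}: missing")
            continue
        check = CHECKS.get(data_type)  # types without a check are not validated
        if check is not None and not check(record[name]):
            errors.append(f"{name}: expected {data_type}")
    return errors
```

Calling `validate_record(PRODUCT_SCHEMA, record)` on a record whose `rating` is the string `"4.5"` rather than a number reports a single problem for that field, which is the kind of mismatch that declaring a type per field is meant to catch.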
Special Field Types
Beyond regular data fields, Kadoa supports special field types for advanced use cases:
Classification Fields
Automatically categorize content into predefined labels. Useful for:
- Sentiment analysis (Positive/Negative/Neutral)
- Content categorization (Technology/Business/Sports)
- Priority classification (High/Medium/Low)
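
One way to picture a classification field is as a field name paired with a closed set of labels. The sketch below models that shape locally; the `ClassificationField` class and its `accepts` method are hypothetical names introduced for illustration, not part of Kadoa's API, and the classification itself is performed by Kadoa rather than by this code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ClassificationField:
    """Hypothetical local model: a field whose value must be one of `labels`."""
    name: str
    labels: Tuple[str, ...]

    def accepts(self, value: str) -> bool:
        # A value is valid only if it matches a predefined label exactly.
        return value in self.labels


# Label sets taken from the examples above.
sentiment = ClassificationField("sentiment", ("Positive", "Negative", "Neutral"))
priority = ClassificationField("priority", ("High", "Medium", "Low"))
```

Because the label set is fixed up front, downstream code can rely on every extracted value being one of a known handful of strings, for example when grouping or filtering results.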
Metadata Fields (Raw Content)
Extract raw page content in different formats:
- HTML - Raw HTML source code
- MARKDOWN - Markdown formatted content
- PAGE_URL - Page URL
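
As a rough illustration of how metadata fields might sit alongside regular data fields in one schema, the sketch below separates the two groups by type name. The schema layout, the `name`/`type` keys, and the field names are assumptions for the example only; just the three format names come from the list above.

```python
# Format names from the list above.
METADATA_TYPES = {"HTML", "MARKDOWN", "PAGE_URL"}

# Hypothetical schema mixing regular data fields with raw-content fields.
ARTICLE_SCHEMA = [
    {"name": "headline", "type": "STRING"},
    {"name": "publishedAt", "type": "DATE"},
    {"name": "sourceUrl", "type": "PAGE_URL"},
    {"name": "rawBody", "type": "MARKDOWN"},
]


def split_fields(schema):
    """Return (data field names, metadata field names), preserving schema order."""
    data, metadata = [], []
    for field in schema:
        target = metadata if field["type"] in METADATA_TYPES else data
        target.append(field["name"])
    return data, metadata
```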