What is a data schema?

Every Kadoa workflow needs a data schema. The schema tells Kadoa:

  • what entity to look for in source data (such as products, jobs, news)
  • what fields to extract for the selected entity (such as title, price, description)
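
As a rough mental model, a schema can be pictured as an entity name plus a list of named fields. The sketch below is only an illustration of that idea: the `SchemaField` and `DataSchema` classes are hypothetical and are not part of Kadoa's API or its storage format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SchemaField:
    """One field to extract (hypothetical model, not a Kadoa type)."""
    name: str
    description: str


@dataclass
class DataSchema:
    """An entity plus the fields to extract for it (hypothetical model)."""
    entity: str
    fields: list[SchemaField] = field(default_factory=list)

    def field_names(self) -> list[str]:
        # Return the field names in the order they were declared.
        return [f.name for f in self.fields]


# Example: a schema for the "products" entity with three fields.
product_schema = DataSchema(
    entity="products",
    fields=[
        SchemaField("title", "Name of the product"),
        SchemaField("price", "Price of the product"),
        SchemaField("description", "Full product description"),
    ],
)

print(product_schema.entity, product_schema.field_names())
```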

How do I manage my data schemas?

Kadoa offers two ways to create and manage data schemas:

1. Create a custom schema from scratch

  • Create a new custom workflow
  • Configure the workflow to extract the desired data
  • Click the “…” menu of the workflow on the dashboard and select “Save as schema”. Note: This option is currently not available for workflows that already use a pre-defined schema.
  • Give your schema a name and description
  • Your custom schema will be available on the data schemas page

2. Customize a pre-defined schema

  • Navigate to the data schemas page on your dashboard
  • Click “Add new data schema”
  • Choose a pre-defined schema as a starting point, based on your use case (e.g., products, jobs, news)
  • Customize the schema by adding, removing, or modifying fields
  • Save your customized schema
  • Create a new workflow and select your customized schema

Using your schemas

When you create a new workflow, you can choose to use one of your custom or customized schemas.

Pre-defined data schemas

Kadoa offers pre-defined schemas for common use cases. These schemas come with a set of standard fields that Kadoa will extract, but you can still customize them based on your specific needs.

Currently available data schemas: job postings, ecommerce, and news articles. More to come soon!

Job postings schema

We offer a dedicated API endpoint optimized for fetching structured job posting data. When you create a new workflow using this schema, Kadoa will extract the following common fields:

| Name | Description | Example |
| --- | --- | --- |
| description | Job description including roles, responsibilities, and company overview | Responsible for developing and maintaining web applications… |
| jobTitle | The title of the job being posted | Senior Software Engineer |
| datePosted | The date when the job was posted, formatted as YYYY:MM:DD:hh:mm:ss | 2023:11:20:10:45:00 |
| applyUrl | URL where applicants can apply for the job | https://example.com/apply |
| url | Direct URL to the job posting | https://example.com/job-posting |
| jobLocation | Structured location data of the job including city, country code, and postal code | {"@type": "Place", "address": {"@type": "PostalAddress", "addressLocality": "San Francisco", "addressCountry": "US", "postalCode": "94103"}} |
| baseSalary | Salary range for the job, including currency and timeframe | {"@type": "MonetaryAmount", "currency": "USD", "value": {"minValue": 100000, "maxValue": 120000, "unitText": "ANNUALLY", "@type": "QuantitativeValue"}} |
| workHours | Typical working hours for the job | 9am to 5pm, Monday to Friday |
| jobBenefits | Array of benefits offered by the company | ["Health insurance", "Retirement plan"] |
| qualifications | Array of required qualifications other than work experience | ["Bachelor's degree in Computer Science", "Strong problem-solving skills"] |
| experienceRequirements | Array of work experience requirements | ["At least 5 years of experience in software development"] |
| recruiterEmail | Email address of the recruiter or hiring manager | recruiter@example.com |
| occupationalCategory | The job category as per a predefined classification | Software and Web Developers |
| applicationDeadline | The deadline for job application submissions, formatted as YYYY:MM:DD:hh:mm:ss | 2024:01:15:23:59:59 |
| language | The primary language of the job posting | English |
| logo | URL to the company's logo | https://example.com/logo.png |
| employmentType | Type of employment offered in the job posting (e.g., FULL_TIME, PART_TIME) | FULL_TIME |
| id | Unique identifier for the job posting | abc123xyz |
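
The date fields use a colon-separated layout rather than ISO 8601, so they usually need explicit parsing before comparison or storage. The following is a minimal sketch of that step, assuming the values arrive as strings in exactly the YYYY:MM:DD:hh:mm:ss layout and that `jobLocation` arrives as an already-decoded object; the helper names `parse_kadoa_date` and `format_location` and the sample record are hypothetical.

```python
from datetime import datetime

# Assumption: dates are strings in the colon-separated layout from the table.
KADOA_DATE_FORMAT = "%Y:%m:%d:%H:%M:%S"


def parse_kadoa_date(value: str) -> datetime:
    """Parse a 'YYYY:MM:DD:hh:mm:ss' string into a naive datetime."""
    return datetime.strptime(value, KADOA_DATE_FORMAT)


def format_location(job_location: dict) -> str:
    """Flatten the nested address into 'city, postal code, country'."""
    address = job_location.get("address", {})
    parts = (
        address.get("addressLocality"),
        address.get("postalCode"),
        address.get("addressCountry"),
    )
    # Skip any address component that is missing or empty.
    return ", ".join(part for part in parts if part)


# Hypothetical record assembled from the example values in the table.
record = {
    "jobTitle": "Senior Software Engineer",
    "datePosted": "2023:11:20:10:45:00",
    "applicationDeadline": "2024:01:15:23:59:59",
    "jobLocation": {
        "@type": "Place",
        "address": {
            "@type": "PostalAddress",
            "addressLocality": "San Francisco",
            "addressCountry": "US",
            "postalCode": "94103",
        },
    },
}

posted = parse_kadoa_date(record["datePosted"])
deadline = parse_kadoa_date(record["applicationDeadline"])

print(posted.isoformat())                      # ISO 8601 form of datePosted
print((deadline - posted).days)                # whole days the posting stays open
print(format_location(record["jobLocation"]))  # flattened location string
```

The parsed values are naive datetimes because the table does not state a time zone; attach one only if you know which zone the source uses.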

Ecommerce schema

When you create a new workflow using this schema, Kadoa will extract the following common fields:

| Name | Description | Example |
| --- | --- | --- |
| link | URL link to the product page | https://example.com/product/samsung-galaxy-s23 |
| name | Name of the product | Samsung Galaxy S23 256GB |
| brand | Brand name of the product | Samsung |
| price | Price of the product | 999.99 |
| priceCurrency | Currency code for the product price | USD |
| specs | Technical specifications of the product | {"Processor": "Snapdragon 8 Gen 1", "RAM": "8GB"} |
| description | Full product description | The latest Samsung Galaxy S23 comes with a Snapdragon 8 Gen 1 processor, 8GB RAM, and 256GB of storage. |
| sku | Stock Keeping Unit identifier for the product | SGS23-256GB-BLK |
| gtin | Global Trade Item Number for the product | 00012345678905 |
| ean | European Article Number for the product | 1234567890123 |
| images | Array of image URLs of the product | ["https://example.com/images/product/samsung-galaxy-s23-front.jpg", "https://example.com/images/product/samsung-galaxy-s23-back.jpg"] |
| reviews | Number of reviews for the product | 102 |
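
When consuming extracted product records, two common follow-up steps are converting the price to an exact decimal and checking which expected fields came back empty. The sketch below illustrates both under stated assumptions: the record is a plain dictionary keyed by the field names above, `price` may arrive as a number or a string, and the helpers `parse_price` and `missing_fields` are hypothetical names rather than Kadoa functions.

```python
from decimal import Decimal


def parse_price(record: dict) -> tuple[Decimal, str]:
    """Return (amount, currency) with the amount as an exact Decimal."""
    # Going through str() avoids binary floating-point artifacts such as
    # Decimal(999.99) expanding to a long inexact value.
    amount = Decimal(str(record["price"]))
    return amount, record["priceCurrency"]


def missing_fields(record: dict, required: list[str]) -> list[str]:
    """List required fields that are absent or empty in the record."""
    empty_values = (None, "", [], {})
    return [name for name in required if record.get(name) in empty_values]


# Hypothetical record using example values from the table; gtin is left out
# on purpose to show how a gap is reported.
product = {
    "name": "Samsung Galaxy S23 256GB",
    "brand": "Samsung",
    "price": 999.99,
    "priceCurrency": "USD",
    "sku": "SGS23-256GB-BLK",
    "images": [],
}

amount, currency = parse_price(product)
print(amount, currency)
print(missing_fields(product, ["name", "price", "sku", "gtin", "images"]))
```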

News article schema

When you create a new workflow using this schema, Kadoa will extract the following common fields:

| Name | Description | Example |
| --- | --- | --- |
| title | The title of the news article | Reading news improves cognitive function |
| fullText | Full text of the article | This is a full article about something interesting - hopefully |
| author | Author of the news article | John Doe |
| datePosted | The date when the news article was posted | 2023:11:20:10:45:00 |
| readTime | Expected reading time for this article | 24min |
| language | The primary language of the article | EN |
| dateScraped | The date when the news article was extracted | 2023:11:20:10:45:00 |
| url | Direct URL to the news article | https://example.com/news |
| scraperName | Name of the scraper from the dashboard | News Site Name |
| id | Unique identifier for the news article | abc123xyz |
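
The `readTime` example is a string ("24min") rather than a number, so sorting or filtering by reading time may need a conversion step. This is a small sketch that assumes the value always looks like a whole number followed by "min"; other layouts are not covered by the table, so the hypothetical helper `read_time_minutes` returns `None` for anything it does not recognize.

```python
import re
from typing import Optional

# Assumption: readTime is a whole number of minutes followed by "min".
READ_TIME_PATTERN = re.compile(r"\s*(\d+)\s*min\s*")


def read_time_minutes(value: str) -> Optional[int]:
    """Convert a value like '24min' to 24, or None if the layout differs."""
    match = READ_TIME_PATTERN.fullmatch(value)
    return int(match.group(1)) if match else None


print(read_time_minutes("24min"))
print(read_time_minutes("about half an hour"))
```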