Algorithm Overview

The change detection process follows a three-phase matching algorithm:

  1. Exact Matches: Find objects that are completely identical (with whitespace normalization)
  2. Partial Matches: Find objects that are similar enough to be considered the same but with changes
  3. Leftovers: Classify remaining objects as either new or removed

Whitespace Normalization

All string comparisons use whitespace normalization to avoid false positives:

  • Leading and trailing whitespace is trimmed
  • Multiple consecutive whitespace characters are replaced with single spaces
  • Different types of whitespace (tabs, newlines) are converted to standard spaces

// These are considered identical after normalization:
"Product  Name\t\n" === "Product Name"
"  Hello    World  " === "Hello World"
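
The three rules above can be sketched as one small helper. This is a minimal illustration, not the service's actual implementation: the function name normalizeWhitespace and the regular expression are assumptions chosen for the example.

```typescript
// Hypothetical helper: collapse every run of whitespace (spaces, tabs,
// newlines) into a single space, then trim both ends.
function normalizeWhitespace(value: string): string {
  return value.replace(/\s+/g, " ").trim();
}

console.log(normalizeWhitespace("Product  Name\t\n") === "Product Name"); // true
console.log(normalizeWhitespace("  Hello    World  ") === "Hello World"); // true
```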

Phase 1: Exact Matches

Objects are considered unchanged if they match exactly across all fields after whitespace normalization:

// Previous run
{
  "title": "MacBook Pro  ",
  "price": "$1999",
  "availability": "In Stock"
}

// Current run  
{
  "title": "MacBook Pro",
  "price": "$1999", 
  "availability": "In Stock"
}

Result: unchanged - No notification sent (whitespace differences ignored)
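
As a rough sketch, the Phase 1 comparison could look like the following. It assumes flat objects with primitive values and that both objects must have the same field names. The names DataObject, normalizeValue and isExactMatch are hypothetical.

```typescript
type DataObject = Record<string, unknown>;

// Normalize strings; leave other values untouched (assumption: flat objects).
function normalizeValue(value: unknown): unknown {
  return typeof value === "string" ? value.replace(/\s+/g, " ").trim() : value;
}

// True when both objects have the same field names and every value is
// identical after whitespace normalization.
function isExactMatch(previous: DataObject, current: DataObject): boolean {
  const prevKeys = Object.keys(previous).sort();
  const currKeys = Object.keys(current).sort();
  if (prevKeys.length !== currKeys.length) return false;
  return prevKeys.every(
    (key, i) =>
      key === currKeys[i] &&
      normalizeValue(previous[key]) === normalizeValue(current[key])
  );
}

const previousRun = { title: "MacBook Pro  ", price: "$1999", availability: "In Stock" };
const currentRun = { title: "MacBook Pro", price: "$1999", availability: "In Stock" };
console.log(isExactMatch(previousRun, currentRun)); // true
```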

Phase 2: Partial Matches (Changed Objects)

Objects are considered changed if they are matched to a previous object by one of the two methods below and at least one field differs:

Key Field Matching (When Configured)

When key fields are configured in your monitoring setup, objects are matched based on these fields. ALL key fields must match for objects to be considered the same:

// Previous run
{
  "id": "12345",
  "title": "Product A",
  "price": "$100",
  "category": "Electronics"
}

// Current run
{
  "id": "12345",           // Key field matches
  "title": "Product A Pro", // Changed
  "price": "$150",         // Changed
  "category": "Tech"       // Changed
}

Result: changed - ID key field matches, so it’s the same object with updates

Key Field Behavior:

  • Key fields are defined in your monitoring configuration with isKeyField: true
  • Common key fields include: id, url, link, title
  • If ANY key field changes, objects are treated as removed + added (not changed)
  • Key field matching ignores whitespace differences
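
The key field rules above can be illustrated with a short sketch. It assumes the key field names are already known as a list, and that a key field missing from either object counts as a mismatch. Neither detail is confirmed by this page, and matchesOnKeyFields is a hypothetical name.

```typescript
type DataObject = Record<string, string>;

const normalize = (s: string): string => s.replace(/\s+/g, " ").trim();

// True only when ALL key fields are present in both objects and equal
// after whitespace normalization.
function matchesOnKeyFields(
  previous: DataObject,
  current: DataObject,
  keyFields: string[]
): boolean {
  if (keyFields.length === 0) return false;
  return keyFields.every((field) => {
    const a = previous[field];
    const b = current[field];
    return a !== undefined && b !== undefined && normalize(a) === normalize(b);
  });
}

const before = { id: "12345", title: "Product A", price: "$100", category: "Electronics" };
const after = { id: " 12345 ", title: "Product A Pro", price: "$150", category: "Tech" };

console.log(matchesOnKeyFields(before, after, ["id"]));                    // true
console.log(matchesOnKeyFields(before, { ...after, id: "67890" }, ["id"])); // false
```

In the second call the key field differs, so the pair would be reported as removed + added rather than changed.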

Traditional Field Match Ratio (Default Method)

When no key fields are configured, objects are considered the same if more than 50% of their fields have matching values:

// Previous run
{
  "name": "John Doe",
  "age": "25",
  "city": "New York"
}

// Current run
{
  "name": "John Doe",     // Same
  "age": "26",           // Changed
  "city": "New York"     // Same
}

Field Match Calculation: 2 out of 3 fields match = 66.7% > 50%

Result: changed - Field match ratio exceeds threshold

Important Notes:

  • Only string fields are considered in the ratio calculation
  • Empty fields and non-string values are excluded from matching
  • Objects with different field structures can still match if the ratio is high enough
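
A possible way to compute the ratio is sketched below. The page does not say what the denominator is when the two objects have different fields. This sketch assumes it is the number of distinct comparable field names across both objects. The names comparableFields and fieldMatchRatio are hypothetical.

```typescript
type DataObject = Record<string, unknown>;

const normalize = (s: string): string => s.replace(/\s+/g, " ").trim();

// Keep only non-empty string fields, with normalized values.
function comparableFields(item: DataObject): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const key of Object.keys(item)) {
    const value = item[key];
    if (typeof value === "string") {
      const normalized = normalize(value);
      if (normalized !== "") fields[key] = normalized;
    }
  }
  return fields;
}

// Share of comparable fields (union of both objects) with equal values.
function fieldMatchRatio(previous: DataObject, current: DataObject): number {
  const a = comparableFields(previous);
  const b = comparableFields(current);
  const allKeys = Object.keys({ ...a, ...b });
  if (allKeys.length === 0) return 0;
  const matching = allKeys.filter((key) => a[key] === b[key]).length;
  return matching / allKeys.length;
}

const ratio = fieldMatchRatio(
  { name: "John Doe", age: "25", city: "New York" },
  { name: "John Doe", age: "26", city: "New York" }
);
console.log(ratio.toFixed(3), ratio > 0.5); // 0.667 true
```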

Phase 3: New and Removed Objects

Objects that don’t match in the previous phases are classified as:

New Objects

Objects that exist in the current run but have no suitable match in the previous run:

// Previous run: []

// Current run
[
  {
    "title": "New Product Launch",
    "price": "$299",
    "category": "Electronics"
  }
]

Result: added - Completely new object

Removed Objects

Objects that existed in the previous run but have no suitable match in the current run:

// Previous run
[
  {
    "title": "Discontinued Product", 
    "price": "$199",
    "category": "Electronics"
  }
]

// Current run: []

Result: removed - Object no longer exists
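
Taken together, the three phases could be sketched as follows. This is an illustration under several assumptions, not the actual implementation: objects are flat with string values, each current object is paired with the first previous object that qualifies (a real matcher might pick the best candidate), and the ratio is computed over the union of non-empty field names. All names (detectChanges, ChangeRecord, and so on) are hypothetical.

```typescript
type DataObject = Record<string, string>;
type ChangeStatus = "unchanged" | "changed" | "added" | "removed";
interface ChangeRecord { status: ChangeStatus; previous?: DataObject; current?: DataObject; }

// Normalized value of a field; a missing field is treated as empty.
const field = (item: DataObject, key: string): string =>
  (item[key] ?? "").replace(/\s+/g, " ").trim();

function isExact(a: DataObject, b: DataObject): boolean {
  const keys = Object.keys(a);
  return keys.length === Object.keys(b).length &&
    keys.every((k) => k in b && field(a, k) === field(b, k));
}

function isSameObject(a: DataObject, b: DataObject, keyFields: string[]): boolean {
  if (keyFields.length > 0) {
    // Key field method: ALL key fields must match.
    return keyFields.every((k) => k in a && k in b && field(a, k) === field(b, k));
  }
  // Traditional method: more than 50% of non-empty fields must match.
  const keys = Object.keys({ ...a, ...b })
    .filter((k) => field(a, k) !== "" || field(b, k) !== "");
  const matching = keys.filter((k) => field(a, k) === field(b, k)).length;
  return keys.length > 0 && matching / keys.length > 0.5;
}

// Remove and return the first pool entry that passes the test, if any.
function take(pool: DataObject[], test: (c: DataObject) => boolean): DataObject | undefined {
  const index = pool.map(test).indexOf(true);
  return index === -1 ? undefined : pool.splice(index, 1)[0];
}

function detectChanges(
  previous: DataObject[],
  current: DataObject[],
  keyFields: string[] = []
): ChangeRecord[] {
  const pool = previous.slice();
  const unmatched: DataObject[] = [];
  const changes: ChangeRecord[] = [];

  // Phase 1: exact matches are unchanged.
  for (const curr of current) {
    const prev = take(pool, (p) => isExact(p, curr));
    if (prev) changes.push({ status: "unchanged", previous: prev, current: curr });
    else unmatched.push(curr);
  }
  // Phase 2: partial matches are changed; current objects without a match are added.
  for (const curr of unmatched) {
    const prev = take(pool, (p) => isSameObject(p, curr, keyFields));
    if (prev) changes.push({ status: "changed", previous: prev, current: curr });
    else changes.push({ status: "added", current: curr });
  }
  // Phase 3: previous objects that nothing matched are removed.
  for (const prev of pool) changes.push({ status: "removed", previous: prev });
  return changes;
}

const previousItems = [
  { id: "1", title: "A  ", price: "$1" },
  { id: "2", title: "B", price: "$2" },
  { id: "3", title: "C", price: "$3" },
];
const currentItems = [
  { id: "1", title: "A", price: "$1" },
  { id: "2", title: "B Pro", price: "$5" },
  { id: "4", title: "D", price: "$9" },
];
console.log(detectChanges(previousItems, currentItems, ["id"]).map((c) => c.status));
// [ 'unchanged', 'changed', 'added', 'removed' ]
```

Without key fields, the second object shares only 1 of 3 field values with its predecessor, so the same input would report it as removed + added instead of changed.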

Key Fields vs Traditional Matching

| Method | When Used | Matching Logic | Advantages |
|---|---|---|---|
| Key Fields | When isKeyField: true is set on monitored fields (optional) | ALL key fields must match exactly | More reliable, handles major content changes |
| Traditional | When no key fields are configured (default) | >50% of fields must match | Works without configuration, good for similar objects |

Advanced Examples

Example 1: Key Field Matching with Major Changes

// Previous run
{
  "id": "article-123",
  "headline": "Tech News Update",
  "author": "John Doe",
  "content": "Short article...",
  "category": "Technology"
}

// Current run
{
  "id": "article-123",           // Key field matches
  "title": "Breaking Tech News", // Field name changed
  "writer": "Jane Smith",        // Field name changed
  "body": "Extended article...", // Field name changed
  "section": "Tech"             // Field name changed
}

Result: changed - ID key field matches even though no other field names overlap

Example 2: Key Field Change = New Object

// Previous run
{
  "id": "product-123",
  "name": "Laptop",
  "price": "$999"
}

// Current run
{
  "id": "product-456",  // Key field changed
  "name": "Laptop",     // Same content
  "price": "$999"       // Same content
}

Result: removed + added - Different ID means different objects

Example 3: Traditional Matching with High Similarity

// Previous run (no key fields configured)
{
  "product": "Laptop",
  "brand": "TechCorp",
  "price": "$999",
  "specs": "8GB RAM",
  "color": "Silver"
}

// Current run
{
  "product": "Laptop",      // Same
  "brand": "TechCorp",      // Same
  "price": "$1299",        // Changed
  "specs": "16GB RAM",     // Changed
  "color": "Silver"        // Same
}

Result: changed - 3 out of 5 fields match = 60% > 50%

Example 4: Low Similarity = Separate Objects

// Previous run (no key fields configured)
{
  "name": "John",
  "age": "25",
  "city": "NYC"
}

// Current run
{
  "title": "Manager",
  "salary": "$50000",
  "department": "IT"
}

Result: removed + added - 0% field match, treated as separate objects

Configuration Best Practices

For Reliable Change Detection

  1. Key Fields (Optional): Configure isKeyField: true on unique identifiers for more accurate matching
  2. Choose Stable Fields: If using key fields, select fields that rarely change (ID, URL, SKU)
  3. Avoid Timestamps: Don’t use timestamps or other frequently changing fields as key fields
  4. Traditional Matching Works: The default >50% field matching works well for most use cases

Good Key Field Examples

// E-commerce products
{
  "sku": "PROD-123",      // ✅ Good key field
  "name": "Product Name",
  "price": "$99",
  "updated": "2025-01-16" // ❌ Don't use as key field
}

// News articles
{
  "url": "https://site.com/article", // ✅ Good key field
  "title": "Article Title",
  "content": "Article content...",
  "published": "2025-01-16"          // ❌ Don't use as key field
}

Schema Design Tips

// Optimal schema design
{
  "id": "unique-identifier",     // Primary key field
  "url": "https://example.com",  // Secondary key field
  "title": "Content Title",     // Descriptive field
  "price": "$99",               // Trackable change
  "content": "Main content...",  // Trackable change
  "last_updated": "2025-01-16"  // Metadata (not key field)
}
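
To connect this schema to configuration, here is a hedged sketch of how key fields might be declared and read. Only the isKeyField flag comes from this page. The FieldDefinition shape, the name property and the keyFieldNames helper are assumptions made for illustration, so check your actual monitoring configuration for the real format.

```typescript
// Hypothetical field definition; only `isKeyField` is named in this document.
interface FieldDefinition {
  name: string;
  isKeyField?: boolean;
}

const fieldDefinitions: FieldDefinition[] = [
  { name: "id", isKeyField: true },   // primary key field
  { name: "url", isKeyField: true },  // secondary key field
  { name: "title" },
  { name: "price" },
  { name: "content" },
  { name: "last_updated" },           // metadata, deliberately not a key field
];

// Collect the names of all fields flagged as key fields.
function keyFieldNames(definitions: FieldDefinition[]): string[] {
  return definitions.filter((d) => d.isKeyField === true).map((d) => d.name);
}

console.log(keyFieldNames(fieldDefinitions)); // [ 'id', 'url' ]
```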

Troubleshooting Common Issues

Objects Not Matching When They Should

Problem: Related objects are reported as separate removed + added objects instead of changed

Solutions:

  • Consider configuring key fields on unique identifiers for more reliable matching
  • Check if key fields (if configured) are changing between runs
  • Verify field names are consistent
  • Ensure objects share >50% of fields for traditional matching
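
When working through these checks, it can help to compare two objects field by field. The helper below is a debugging sketch, not part of the product. It assumes flat objects with string values and a ratio computed over the union of field names. explainMatch and MatchReport are hypothetical names.

```typescript
type DataObject = Record<string, string>;

interface MatchReport {
  matching: string[];
  differing: string[];
  onlyInPrevious: string[];
  onlyInCurrent: string[];
  ratio: number;
}

const normalize = (s: string): string => s.replace(/\s+/g, " ").trim();

// Sort every field name into one bucket and compute the share that matches.
function explainMatch(previous: DataObject, current: DataObject): MatchReport {
  const report: MatchReport = {
    matching: [], differing: [], onlyInPrevious: [], onlyInCurrent: [], ratio: 0,
  };
  const names = Object.keys({ ...previous, ...current });
  for (const fieldName of names) {
    const a = previous[fieldName];
    const b = current[fieldName];
    if (a === undefined) report.onlyInCurrent.push(fieldName);
    else if (b === undefined) report.onlyInPrevious.push(fieldName);
    else if (normalize(a) === normalize(b)) report.matching.push(fieldName);
    else report.differing.push(fieldName);
  }
  report.ratio = names.length === 0 ? 0 : report.matching.length / names.length;
  return report;
}

const matchReport = explainMatch(
  { id: "article-123", headline: "Tech News Update", author: "John Doe" },
  { id: "article-123", title: "Breaking Tech News", writer: "Jane Smith" }
);
console.log(matchReport);
```

Here only id matches out of five distinct field names, so the ratio is 0.2. Under traditional matching this pair would not be linked, which points to inconsistent field names or a missing key field.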

Objects Incorrectly Matched

Problem: Unrelated objects incorrectly matched as changed

Solutions:

  • Use more specific key fields
  • Ensure key fields are truly unique
  • Check for data quality issues
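
One way to check whether key fields are truly unique is to look for repeated key values within a single run. The sketch below is an assumption-laden illustration: findDuplicateKeys is a hypothetical helper, and it treats a missing key field as an empty string.

```typescript
type DataObject = Record<string, string>;

// Return the normalized key-value combinations that occur more than once.
// Each combination is encoded as a JSON array string, e.g. '["PROD-123"]'.
function findDuplicateKeys(items: DataObject[], keyFields: string[]): string[] {
  const counts: Record<string, number> = {};
  for (const item of items) {
    const composite = JSON.stringify(
      keyFields.map((f) => (item[f] ?? "").replace(/\s+/g, " ").trim())
    );
    counts[composite] = (counts[composite] ?? 0) + 1;
  }
  return Object.keys(counts).filter((key) => (counts[key] ?? 0) > 1);
}

const scraped = [
  { sku: "PROD-123", name: "Laptop" },
  { sku: "PROD-123 ", name: "Laptop (refurbished)" }, // same SKU after normalization
  { sku: "PROD-456", name: "Monitor" },
];
console.log(findDuplicateKeys(scraped, ["sku"])); // [ '["PROD-123"]' ]
```

If this list is not empty, unrelated objects may be paired with each other, and a more specific key field (or an additional one) is probably needed.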

Missing Changes

Problem: Changes not detected

Solutions:

  • Verify monitored fields are configured correctly
  • Check if objects are being matched at all
  • Review field match ratios in logs

Integration with Monitoring

Understanding these rules helps you:

  • Predict notification behavior based on your data structure
  • Configure key fields for accurate change detection
  • Interpret change notifications correctly
  • Design robust schemas for reliable monitoring

For more information on configuring monitoring, see Real-time Monitoring Setup.